I have found a segfault when using the C++ (cpp) implementation from Python in protobuf 2.4.1. I can reproduce it in two different environments with a small number of files.
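When comparing runs across environments, it can help to confirm that the interpreter really died of a segmentation fault and did not exit with an ordinary Python error. The following is a minimal sketch, not part of the uploaded files: the helper name run_and_classify is hypothetical, and it relies only on the standard subprocess module, which on POSIX reports death by signal as a negative return code.

```python
import os
import signal
import subprocess
import sys


def run_and_classify(argv, extra_env=None):
    """Run argv in a child process and report how it ended.

    Hypothetical helper for illustration. Returns ("signal", signum) if the
    child was killed by a signal, otherwise ("exit", exit_code).
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    returncode = subprocess.call(argv, env=env)
    if returncode < 0:
        # On POSIX, a negative return code is the negated number of the
        # signal that terminated the child.
        return "signal", -returncode
    return "exit", returncode


def died_of_segfault(argv, extra_env=None):
    """True if the child process was terminated by SIGSEGV."""
    kind, value = run_and_classify(argv, extra_env)
    return kind == "signal" and value == signal.SIGSEGV
```

For the script in this report, the call would look like died_of_segfault([sys.executable, "crashtest.py"], {"PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION": "cpp", "PYTHONPATH": "./"}). Running the same call with the variable set to "python" would be one way to check that only the cpp implementation is affected.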
The segfault is happening in google/protobuf/internal/cpp_message.py, in the ScalarProperty getter. There seems to be some interplay between iterating through one repeated field and accessing a scalar property in another message.

I have reduced the reproduction to a small set of .proto and .py files. I have bundled the files and uploaded the whole set here:

http://dl.dropbox.com/u/24148866/py_cpp_pbcrash.tgz

To reproduce the segfault, run the crashtest.py script with the cpp protocol buffer implementation:

PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp PYTHONPATH=./ python crashtest.py

Simply accessing the self.pb.s1 property multiple times isn't enough to cause a segfault, nor is iterating over the repeated astr field on its own.

I have made a change to the python_generator.cc file to include the package name in the Python imports. I have included the file in the tarball and the diff is below. This modification was made to the 2.4.1 codebase. I believe that this should only change where Python packages are imported from.

I have tested this on two separate systems. Their respective configurations are below. Both segfault, but after a different number of iterations through the "for a in uar.astr:" loop. Adding imports like "from datetime import date" to the top of the script also changes the number of iterations through the loop before the segfault.

Any thoughts on what might be causing this, or things I can do to help narrow down the root cause?

Cheers,
Stephen

crashtest.py:

from api_pb.ui_pb2 import UIR
from api_pb.ua_pb2 import UAR
from datetime import date
import base64

activities_blob = ... large b64 blob ...

class Container(object):
    def __init__(self):
        b64_user_blob = '''EnwKDFN0ZXBoZW5IYW1lchIHU3RlcGhlbhoFSGFtZXIyTmh0dHA6Ly9hMS50d2ltZy5jb20vc3RpY2t5L2RlZmF1bHRfcHJvZmlsZV9pbWFnZXMvZGVmYXVsdF9wcm9maWxlXzBfbm9ybWFsLnBuZzi9gafrBEIASgBSAGgB'''
        uir = UIR()
        uir.ParseFromString(base64.decodestring(b64_user_blob))
        self.pb = uir.up

    def iterate_on_astr(self):
        uar = UAR()
        uar.ParseFromString(base64.decodestring(activities_blob))
        for a in uar.astr:
            self.pb.s1

container = Container()
container.iterate_on_astr()

Tested environments:

$ uname -a
Linux isengard 3.1-pf #1 SMP PREEMPT Mon Jan 9 02:15:02 EST 2012 x86_64 Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz GenuineIntel GNU/Linux
$ python2 --version
Python 2.7.2

And

shamer@prod5:~$ uname -a
Linux prod5.upverter.com 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 05:15:26 UTC 2010 x86_64 GNU/Linux
shamer@prod5:~$ python --version
Python 2.6.6

Changes to python_generator.cc

79,80c79,80
< string ModuleName(const string& filename) {
<   string basename = StripProto(filename);
---
> string ModuleName(const FileDescriptor *file) {
>   string basename = StripProto(file->name());
83c83,89
<   return basename + "_pb2";
---
>
>   string package = file->package();
>   if (package.length() > 0) {
>     return package + "." + basename + "_pb2";
>   } else {
>     return basename + "_pb2";
>   }
245c251
<   string module_name = ModuleName(file->name());
---
>   string module_name = ModuleName(file);
286c292
<   string module_name = ModuleName(file_->dependency(i)->name());
---
>   string module_name = ModuleName(file_->dependency(i));
950c956
<   name = ModuleName(descriptor.file()->name()) + "." + name;
---
>   name = ModuleName(descriptor.file()) + "." + name;
962c968
<   name = ModuleName(descriptor.file()->name()) + "." + name;
---
>   name = ModuleName(descriptor.file()) + "." + name;
975c981
<   name = ModuleName(descriptor.file()->name()) + "." + name;
---
>   name = ModuleName(descriptor.file()) + "." + name;

-- 
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.