Is the advice in this thread, particularly with respect to generating C++ message implementations, still valid for modern versions of the Python protobuf runtime?
Brief spelunking through the Python codebase didn't yield a clear mechanism for how messages are automagically discovered.

On Tuesday, December 7, 2010 at 10:49:43 PM UTC-8, Yang Zhang wrote:
> On Tue, Dec 7, 2010 at 9:40 PM, Kenton Varda <[email protected]> wrote:
> > On Tue, Dec 7, 2010 at 9:19 PM, Yang Zhang <[email protected]> wrote:
> >> > Also, note that if you explicitly compile C++ versions of your
> >> > messages and link them into the process, they'll be even faster. (If
> >> > you don't, the library falls back to DynamicMessage, which is not as
> >> > fast as generated code.)
> >>
> >> I'm trying to decipher that last hint, but having some trouble - what
> >> exactly do you mean / how do I do that? I'm just using protoc
> >> --py_out=... and PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp.
> >
> > I'm not completely sure what I mean, because I don't have much experience
> > with Python C extensions. Basically I'm saying you should additionally
> > generate C++ code using protoc, then compile that into a C extension (even
> > with no interface), and then load it into your Python process. Simply
> > having the C++ code for your message types present will make them faster.
>
> Ah, my understanding now is that:
>
> - Python code ordinarily (without PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp)
>   uses pure Python (generated code) to parse/serialize messages.
>
> - Python code *with* PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp uses
>   generic C++ code that dynamically parses/serializes messages (via
>   DynamicMessage/reflection), as opposed to using any pre-generated C++
>   code.
>
> - Python code with PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp actually
>   also *searches for the symbols for any pre-generated C++ code in the
>   current process*, and uses them if available instead of
>   DynamicMessage...? (Is this via some global DescriptorPool magic?)
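One way to probe this on a current runtime is to ask the library which backend it actually selected; recent protobuf releases report `'python'`, `'cpp'`, or `'upb'`. A minimal sketch, assuming the `protobuf` package is importable (it degrades gracefully if not); note that `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` must be exported before the first protobuf import to take effect:

```python
# Report which backend the protobuf runtime selected.
try:
    from google.protobuf.internal import api_implementation
    backend = api_implementation.Type()  # 'python', 'cpp', or 'upb'
except ImportError:  # protobuf not installed in this environment
    backend = None

print("protobuf backend:", backend)
```

`api_implementation` is an internal module, so treat this as a diagnostic rather than a stable API.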
> Sounds like pretty weird behavior, but indeed, now I get even faster
> processing. The following run shows ~68x and ~13x speedups vs. ~15x and
> ~8x (my original speedup calculations were ~15x and ~8x, not ~12x and
> ~7x... not sure how I got those; I was probably going off a different
> set of measurements):
>
> $ PYTHONPATH=build/lib.linux-x86_64-2.6/:$PYTHONPATH \
>   PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp python sandbox/pbbench.py out.ini
> noop: 1.6188621521e-07
> ser: 6.39575719833e-06
> parse: 4.55250144005e-05
> msg size: 10730
>
> This was simple to do. I added a C extension to my setup.py:
>
> <<<
> setup(
>     ...
>     ext_modules=[Extension('podpb',
>         sources=['cpp/podpb.c', 'cpp/main.pb.cc'], libraries=['protobuf'])],
>     ...
> )
> >>>
>
> Generate the second source file with `protoc --cpp_out=cpp`, and create
> the first one to set up an empty Python module:
>
> <<<
> #include <Python.h>
>
> static PyMethodDef PodMethods[] = {
>     {NULL, NULL, 0, NULL}  /* Sentinel */
> };
>
> PyMODINIT_FUNC
> initpodpb(void)
> {
>     PyObject *m;
>
>     m = Py_InitModule("podpb", PodMethods);
>     if (m == NULL)
>         return;
> }
> >>>
>
> Now `python setup.py build` should build everything. Just import the
> module (podpb in our case) and you're good.
>
> Awesome tip, thanks Kenton. I foresee additions to the documentation in
> protobuf's near future.... :)
>
> --
> Yang Zhang
> http://yz.mit.edu/

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
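The per-call timings quoted above (noop/ser/parse) come from the thread's `pbbench.py`, which isn't reproduced here; the measurement itself can be sketched with a small `timeit` harness. The message-related calls in the comment are hypothetical placeholders, not real imports:

```python
import timeit

def per_call(fn, number=100000):
    """Average seconds per call, in the same style as the numbers above."""
    return timeit.timeit(fn, number=number) / number

# With a generated message this would look like (hypothetical names):
#   t_ser = per_call(msg.SerializeToString)
#   t_parse = per_call(lambda: MyMessage.FromString(data))
noop = per_call(lambda: None)
print("noop:", noop)
```

Comparing `noop` against the serialize/parse figures factors out the call overhead, which is what the thread's ~68x/~13x speedup ratios were based on.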
