On Sun, Mar 1, 2009 at 3:24 PM, Kenton Varda <ken...@google.com> wrote:

> On Sun, Mar 1, 2009 at 2:12 PM, Joshua Haberman <jhaber...@gmail.com> wrote:
>> Stripping down protobufs to their essence is *exactly* what I am doing
>> with pbstream:
>> http://github.com/haberman/pbstream
>> If you can hang tight for just another few weeks, I think you're going
>> to like what you see.  The streaming decoder is more or less finished:
>> it's just over 500 lines of C99 and compiles to a ~6k object file.
>> I'm working now on an in-memory representation, which should also be
>> quite small.  And while I don't have any benchmarks yet, I think it's
>> going to be extremely fast, despite not doing any code generation.  In
>> the case where you're not actually copying the data into a separate
>> structure (e.g. pure streaming) I think it will likely beat the main
>> implementation (malloc is more expensive than people realize).
> Note that the C++ implementation only mallocs the *first* time you use an
> object.  If you reuse an object, it will reuse memory, only allocating new
> memory for parts of the message that weren't used before.

What about strings and repeated elements?  String assignment will at least
sometimes call malloc() to re-allocate the string, though I don't know
enough about STL to know how often this happens.  And string assignment will
always copy the data at least, whereas with pbstream that data is never
copied unless the client decides it wants to store it.
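
To make the "never copied" point concrete, here is a minimal sketch of handing a string field to the client as a pointer and length into the input buffer. This is not pbstream's actual API: the names str_view, read_varint and read_string_field are made up for illustration. The only thing taken as given is the protobuf wire format, where a length-delimited field is a varint key (field number << 3 | wire type 2), a varint length, and then the raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view type: points into the caller's input buffer and owns
 * nothing, so no malloc() and no memcpy() are needed to produce it. */
struct str_view {
    const uint8_t *data;
    size_t len;
};

/* Decode a base-128 varint. Returns the number of bytes consumed, or 0 if
 * the input is truncated or longer than 10 bytes. */
static size_t read_varint(const uint8_t *buf, size_t avail, uint64_t *out)
{
    uint64_t value = 0;
    for (size_t i = 0; i < avail && i < 10; i++) {
        value |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            *out = value;
            return i + 1;
        }
    }
    return 0;
}

/* Read one length-delimited field (wire type 2) and return a view of its
 * payload without copying it. Returns bytes consumed, or 0 on error. */
static size_t read_string_field(const uint8_t *buf, size_t avail,
                                uint32_t *field_number, struct str_view *out)
{
    assert(buf != NULL && field_number != NULL && out != NULL);
    uint64_t key, len;
    size_t n = read_varint(buf, avail, &key);
    if (n == 0 || (key & 7) != 2)
        return 0;                       /* not a length-delimited field */
    size_t m = read_varint(buf + n, avail - n, &len);
    if (m == 0 || len > avail - n - m)
        return 0;                       /* truncated length or payload */
    *field_number = (uint32_t)(key >> 3);
    out->data = buf + n + m;            /* points into buf, nothing copied */
    out->len = (size_t)len;
    return n + m + (size_t)len;
}
```

The view is only valid while the input buffer is, so a client that wants to keep the string has to copy it itself. That is the trade-off: the copy becomes the client's decision rather than the decoder's.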

> In practice, most performance-critical software reuses protobuf objects,
> so it's only fair to compare your implementation against this case.

As long as you think it's fair.  :)

> Be careful about claiming that your code will be faster before actually
> running benchmarks.  I've been burned many times doing that.  :)

I said "likely."  :)  I probably am a bit overconfident, because last time I
told someone that I'd beat their data-streaming tool by 20x by not calling
malloc() in the critical path, I hit 20x almost spot-on.  proto2 is a lot
more optimized than that application was, but I definitely think that never
calling malloc() in the critical path, as well as never copying string data,
will likely yield compelling performance.
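
For comparison, the reuse case can be sketched in plain C as a buffer that keeps its capacity between messages and only allocates when a value does not fit. This is an illustration of the general idea, not how proto2 or the STL string is actually implemented: reuse_buf and its functions are hypothetical names, and the allocs counter exists only to make the behavior visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable buffer: capacity survives across messages. */
struct reuse_buf {
    char *data;
    size_t len;
    size_t cap;
    unsigned allocs;    /* counts realloc() calls, for illustration only */
};

/* Store a copy of src, allocating only if the existing capacity is too
 * small. Returns 0 on success, -1 on allocation failure. */
static int reuse_buf_assign(struct reuse_buf *b, const char *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));
    if (n > b->cap) {
        char *p = realloc(b->data, n);  /* realloc(NULL, n) acts as malloc */
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = n;
        b->allocs++;
    }
    if (n > 0)
        memcpy(b->data, src, n);        /* the copy still happens each time */
    b->len = n;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
static void reuse_buf_free(struct reuse_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

Once the buffer has grown to fit the largest value seen, later assignments stop allocating, which matches the reuse argument above. The memcpy() remains on every assignment, though, and that is the cost a pure streaming decoder avoids.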

It would be nice if there were an open-source benchmark suite that would make
it easier to compare performance across implementations.

