Thanks for the feedback.
I agree with your points. I use vector<string> because it had already
been used on our existing platform.

A newer test comparing vector<int> has the following results:

It takes 1.9 seconds for Boost to serialize a vector<int> of size 10000000.

It takes 4.71 seconds for Boost to deserialize a vector<int> of size 10000000.

It takes 0.47 seconds for protocol buffers to serialize a vector<int> of size 10000000.

It takes 0.45 seconds for protocol buffers to deserialize a vector<int> of size 10000000.



Best


On Tue, Mar 31, 2009 at 2:07 AM, Kenton Varda <ken...@google.com> wrote:

> Several points:
>
> * Some of your test cases seem to be parsing from or serializing to files.
>  This may be measuring file I/O performance more than it is measuring the
> respective serialization libraries.  Even though you are using clock() to
> measure time, simply setting up file I/O operations involves syscalls and
> copying that could take some CPU time to execute.  Try parsing from and
> serializing to in-memory buffers instead.  For protocol buffers you should
> use ParseFromArray() and SerializeToArray() for maximum performance -- not
> sure if boost has equivalents.
>
> * Your test generates different random data for the boost test vs. the
> protobuf test.  For an accurate comparison, you really should use identical
> data.
>
> * Finally, your test isn't a very interesting test case for protocol
> buffers.  Parsing and serializing a lot of strings is going to be dominated
> by the performance of memcpy().  You might notice that the actual
> serialization step in your program takes much less time than even just
> populating the message object.  It might be more interesting to try
> serializing a message involving many different fields of different types.
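For reference, a message with many different field types along those lines might look like this (a hypothetical example, not taken from the attached test files):

```proto
// Hypothetical mixed-type message for a more representative benchmark.
message Record {
  optional int32 id = 1;
  optional string name = 2;
  optional double score = 3;
  repeated int64 values = 4;
  optional bool active = 5;
}
```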
>
>
> I think the reason parsing ends up being much slower than serialization for
> you is because it spends most of the time in malloc(), allocating strings.
>  There are a few things you can do about this:
>
> 1) Reuse the same message object every time you parse.  It will then reuse
> the same memory instead of allocating new memory.
>
> 2) Make sure you are not using a reference-counting string implementation.
>  They are, ironically, very slow, due to the need for atomic operations.
>
> 3) Use Google's tcmalloc in place of your system's malloc.  It is probably
> a lot faster.
>
> On Sun, Mar 29, 2009 at 9:32 PM, Yingfeng Zhang <yingfeng.zh...@gmail.com> wrote:
>
>> Test files are attached
>>
>> Best
>>
>>
>>
>> On Mon, Mar 30, 2009 at 12:14 PM, Kenton Varda <ken...@google.com> wrote:
>>
>>> What does your .proto file look like?  And the code that uses it?
>>>
>>> On Sun, Mar 29, 2009 at 9:06 PM, Yingfeng <yingfeng.zh...@gmail.com> wrote:
>>>
>>>>
>>>> Hi,
>>>> We are looking for a fast mechanism for serialization/deserialization.
>>>> Here is our comparison between pb and boost:
>>>> We hope to serialize/deserialize data in std containers, such as:
>>>>
>>>> std::vector<std::string>
>>>>
>>>> Here is the data
>>>> 10000000 strings are stored in the vector
>>>>
>>>> as to boost:
>>>> Serialization: 3.8 s
>>>> Deserialization: 6.89 s
>>>>
>>>> as to protocol buffers:
>>>> Serialization: 4.59 s
>>>> Deserialization: 0.47 s
>>>>
>>>> It seems pb performs much better than boost in deserialization;
>>>> however, it is even slower than boost in serialization. Could
>>>> serialization be improved to be as fast as deserialization?
>>>>
>>>>
>>>>
>>>
>>
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~----------~----~----~----~------~----~------~--~---
