Thanks for those tips.  I am using tcmalloc, and I'm re-using the
message for each batch, e.g. I fill it up with, say, 500 items, send it
out, clear it, and re-use it.
 
Here are my hopefully accurate timings, each done 100 times, averaged:
 
1. Baseline (just loops through the data on the server) no protobuf:
191ms
2. Compose messages, serialize them, no I/O or deserialization: 213ms
3. Same as #2 but with I/O to a dumb Java client: 265ms
4. Same as #3 but with Java protobuf deserialization added: 323ms
 
So from this it looks like:
- composing and serializing the messages takes 22ms
- sending the data over sockets takes 52ms
- deserializing the data in java with protobuf takes 58ms
 
The amount of data being sent is: 3,959,368 bytes in 158,045 messages
(composed in batches of 1000).
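A quick sanity check on those totals: the average works out to about 25
bytes per IdConfidence message, which is consistent with a 16-byte id, a
4-byte float confidence, and a few bytes of tag/length overhead. A
trivial back-of-envelope helper:

```cpp
// Back-of-envelope check of the totals quoted above: total encoded
// bytes divided by message count gives the average encoded size of
// one IdConfidence message (~25 bytes).
double AvgBytesPerMessage(double total_bytes, double message_count) {
    return total_bytes / message_count;
}
```

For the numbers above, `AvgBytesPerMessage(3959368.0, 158045.0)` comes
out just over 25, so the encoding overhead per item is only ~5 bytes on
top of the 20 bytes of payload.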
 
- Alex

________________________________

From: Kenton Varda [mailto:ken...@google.com] 
Sent: Tuesday, July 14, 2009 3:26 AM
To: Alex Black
Cc: Protocol Buffers
Subject: Re: Performance: Sending a message with ~150k items, approx
3.3mb, can I do better than 100ms?


OK.  If your message composition (or parsing, on the receiving end)
takes a lot of time, you might look into how much of that is due to
memory allocation.  Usually this is a pretty significant fraction.  Two
good ways to improve that: 

1) If your app builds many messages over time and most of them have
roughly the same "shape" (i.e. which fields are set, the size of
repeated fields, etc. are usually similar), then you should clear and
reuse the same message object rather than allocate a new one each time.
This way it will reuse the same memory, avoiding allocation.
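In plain C++ terms the effect is the same one std::vector shows:
clearing keeps the allocated storage, so the next fill pays no
allocation cost. A minimal stand-in sketch (not protobuf itself, just
the reuse pattern):

```cpp
#include <cstddef>
#include <vector>

// Fill, clear, and refill a vector; return true if the second fill
// reused the original allocation (capacity unchanged by clear()).
// protobuf's Message::Clear() retains allocated storage the same way,
// which is why reusing one message object avoids repeated allocation.
bool clear_reuses_storage(int n) {
    std::vector<int> batch;
    for (int i = 0; i < n; ++i) batch.push_back(i);
    const std::size_t cap = batch.capacity();
    batch.clear();                       // size -> 0, capacity kept
    for (int i = 0; i < n; ++i) batch.push_back(i);
    return batch.capacity() == cap;      // no reallocation happened
}
```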

2) Use tcmalloc:
  http://google-perftools.googlecode.com
It is often faster than your system's malloc, particularly for
multi-threaded C++ apps.  All C++ servers at Google use this.

On Mon, Jul 13, 2009 at 11:50 PM, Alex Black <a...@alexblack.ca> wrote:



        Kenton: I made a mistake with these numbers - pls ignore them -
        I'll revisit tomorrow.
        
        Thx.
        

        -----Original Message-----
        From: protobuf@googlegroups.com
[mailto:proto...@googlegroups.com] On Behalf Of Alex Black
        Sent: Tuesday, July 14, 2009 2:05 AM
        To: Protocol Buffers
        Subject: Re: Performance: Sending a message with ~150k items,
approx 3.3mb, can I do better than 100ms?
        
        
        ok, I took I/O out of the picture by serializing each message
into a pre-allocated buffer, and this time I did a more thorough
measurement.
        
        Benchmark 1: Complete scenario
        - average time 262ms (100 runs)
        
        Benchmark 2: Same as # 1 but no IO
        - average time 250ms (100 runs)
        
        Benchmark 3: Same as 2 but with serialization commented out
        - average time 251ms (100 runs)
        
        Benchmark 4: Same as 3 but with message composition commented
out too (no protobuf calls)
        - average time 185 ms (100 runs)
        
        So from this I conclude:
        - My initial #s were wrong
        - My timings vary too much for each run to really get accurate
averages
        - IO takes about 10ms
        - Serialization takes ~0ms
        - Message composition and setting of fields takes ~66ms
        
        My message composition is in a loop; the body of the loop looks
like:

                uuid_t relatedVertexId;

                myProto::IdConfidence* neighborIdConfidence =
                    pNodeWithNeighbors->add_neighbors();

                // Set the vertex id
                neighborIdConfidence->set_id((const void*) relatedVertexId, 16);
                // Set the confidence
                neighborIdConfidence->set_confidence(confidence);

                currentBatchSize++;

                if (currentBatchSize == BatchSize)
                {
                        // Flush out this batch
                        //stream << getNeighborsResponse;
                        getNeighborsResponse.Clear();
                        currentBatchSize = 0;
                }
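When the commented-out flush is enabled, each batch goes out as a varint
length prefix followed by the serialized bytes. A minimal, self-contained
sketch of that prefix, mirroring what CodedOutputStream::WriteVarint32
produces (function names here are illustrative, not protobuf's):

```cpp
#include <cstdint>
#include <string>

// Varint32 encoding as used for the length prefix: 7 bits per byte,
// least-significant group first, high bit set on all but the last byte.
std::string EncodeVarint32(uint32_t value) {
    std::string out;
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
    return out;
}

// Frame one serialized batch: length prefix then payload. The prefix
// is what lets the receiver split consecutive batches on one stream.
std::string FrameBatch(const std::string& payload) {
    return EncodeVarint32(static_cast<uint32_t>(payload.size())) + payload;
}
```

For example, a 300-byte batch gets the two-byte prefix 0xAC 0x02, so the
framing overhead per batch is tiny compared to the payload.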
        
        On Jul 14, 1:27 am, Kenton Varda <ken...@google.com> wrote:
        > Oh, I didn't even know you were including composition in there.
        > My benchmarks are only for serialization of already-composed
        > messages.  But this still doesn't tell us how much time is spent
        > on network I/O vs. protobuf serialization.  My guess is that once
        > you factor that out, your performance is pretty close to the
        > benchmarks.
        >
        > On Mon, Jul 13, 2009 at 10:11 PM, Alex Black <a...@alexblack.ca> wrote:
        >
        >
        > > If I comment out the actual serialization and sending of the
        > > message (so I am just composing messages, and clearing them
        > > each batch) then the 100ms drops to about 50ms.
        >
        > > On Jul 14, 12:36 am, Alex Black <a...@alexblack.ca> wrote:
        > > > I'm sending a message with about ~150k repeated items in it,
        > > > total size is about 3.3mb, and it's taking me about 100ms to
        > > > serialize it and send it out.
        >
        > > > Can I expect to do any better than this? What could I look
        > > > into to improve this?
        > > > - I have "option optimize_for = SPEED;" set in my proto file
        > > > - I'm compiling with -O3
        > > > - I'm sending my message in batches of 1000
        > > > - I'm using C++, on ubuntu, x64
        > > > - I'm testing all on one machine (e.g. client and server are
        > > > on one machine)
        >
        > > > My message looks like:
        >
        > > > message NodeWithNeighbors
        > > > {
        > > >         required Id nodeId = 1;
        > > >         repeated IdConfidence neighbors = 2;
        >
        > > > }
        >
        > > > message GetNeighborsResponse
        > > > {
        > > >         repeated NodeWithNeighbors nodesWithNeighbors = 1;
        >
        > > > }
        >
        > > > message IdConfidence
        > > > {
        > > >         required bytes id = 1;
        > > >         required float confidence = 2;
        >
        > > > }
        >
        > > > Where "bytes id" is used to send 16byte IDs (uuids).
        >
        > > > I'm writing each message (batch) out like this:
        >
        > > >         CodedOutputStream codedOutputStream(&m_ProtoBufStream);
        >
        > > >         // Write out the size of the message
        > > >         codedOutputStream.WriteVarint32(message.ByteSize());
        > > >         // Ask the message to serialize itself to our stream
        > > >         // adapter, which ultimately calls Write on us, which
        > > >         // we then call Write on our composed stream
        > > >         message.SerializeWithCachedSizes(&codedOutputStream);
        >
        > > > In my stream implementation I'm buffering every 16kb, and
        > > > calling send on the socket once I have 16kb.
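The 16 KB buffering described here can be sketched as a small
stand-alone class; the class name, the `sink_` callback, and the default
limit are illustrative stand-ins for the poster's actual stream and
socket send, not their real code:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Accumulate bytes in a fixed-size buffer and hand a full buffer to the
// sink (e.g. a socket send) only when it fills, or on an explicit
// Flush().  This keeps the number of send() syscalls low regardless of
// how many small serialized messages are written.
class BufferedWriter {
public:
    explicit BufferedWriter(std::function<void(const char*, std::size_t)> sink,
                            std::size_t limit = 16 * 1024)
        : sink_(std::move(sink)), limit_(limit) { buf_.reserve(limit_); }

    void Write(const char* data, std::size_t n) {
        while (n > 0) {
            // Copy as much as fits, flushing whenever the buffer fills.
            std::size_t take = std::min(n, limit_ - buf_.size());
            buf_.insert(buf_.end(), data, data + take);
            data += take;
            n -= take;
            if (buf_.size() == limit_) Flush();
        }
    }

    void Flush() {
        if (!buf_.empty()) {
            sink_(buf_.data(), buf_.size());
            buf_.clear();
        }
    }

private:
    std::function<void(const char*, std::size_t)> sink_;
    std::size_t limit_;
    std::vector<char> buf_;
};
```

With a 16 KB limit and ~25-byte messages, each send() carries on the
order of 650 messages, so per-call overhead is amortized well.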
        >
        > > > Thanks!
        >
        > > > - Alex
        
        
                
        



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~----------~----~----~----~------~----~------~--~---
