Re: Clear() function excessive CPU usage

Kenton Varda Thu, 05 Mar 2009 17:12:31 -0800

On Thu, Mar 5, 2009 at 4:42 PM, Zachary Turner <[email protected]> wrote:


> I'll try to come up with a sample tomorrow, but the surrounding code is
> pretty complex, so I'm not 100% sure it will still exhibit the same pattern
> if I do the same thing in a stripped application.
>
> As an alternative to not clearing the items before I put them back in the
> list, would there be any problem with storing my own list of buffers
> internally, and then calling AddAllocated() a bunch of times while building
> the message stream and then ReleaseLast() at the end until all the messages
> are clear?


That would be fine.
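
A minimal sketch of that pattern, for illustration only. It does not use protobuf: DataChunk, ChunkField, ChunkPool and SendCycle are hypothetical stand-ins, and ChunkField models only the ownership hand-off of the two RepeatedPtrField calls named above (AddAllocated() takes the pointer, ReleaseLast() hands it back without deleting it). With the real library the chunks would have to be released before the parent message is cleared or destroyed, since the field otherwise still owns them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated DataChunk message (bytes field only).
struct DataChunk {
  std::string data;
};

// Hypothetical stand-in for the repeated field. It models only the ownership
// hand-off of AddAllocated() and ReleaseLast(); it is not the protobuf class.
class ChunkField {
 public:
  // Stores the pointer itself; the chunk's bytes are not copied.
  void AddAllocated(DataChunk* chunk) { items_.push_back(chunk); }

  // Removes the last element and returns it without deleting or clearing it.
  DataChunk* ReleaseLast() {
    assert(!items_.empty());
    DataChunk* last = items_.back();
    items_.pop_back();
    return last;
  }

  std::size_t size() const { return items_.size(); }

 private:
  std::vector<DataChunk*> items_;
};

// Application-owned pool of chunks whose strings are reserved up front.
class ChunkPool {
 public:
  ChunkPool(std::size_t count, std::size_t chunk_bytes) {
    for (std::size_t i = 0; i < count; ++i) {
      DataChunk* chunk = new DataChunk;
      chunk->data.reserve(chunk_bytes);
      free_.push_back(chunk);
    }
  }
  ~ChunkPool() {
    for (DataChunk* chunk : free_) delete chunk;
  }
  ChunkPool(const ChunkPool&) = delete;
  ChunkPool& operator=(const ChunkPool&) = delete;

  // Returns a pooled chunk, or a fresh one if the pool has run dry.
  DataChunk* Take() {
    if (free_.empty()) return new DataChunk;
    DataChunk* chunk = free_.back();
    free_.pop_back();
    return chunk;
  }
  void Give(DataChunk* chunk) { free_.push_back(chunk); }
  std::size_t available() const { return free_.size(); }

 private:
  std::vector<DataChunk*> free_;
};

// One send cycle: lend chunks to the field, then take every one back.
// Returns how many chunks were queued for sending.
std::size_t SendCycle(ChunkPool& pool, ChunkField& field, const char* src,
                      std::size_t chunk_bytes, std::size_t num_chunks) {
  for (std::size_t i = 0; i < num_chunks; ++i) {
    DataChunk* chunk = pool.Take();
    chunk->data.assign(src, chunk_bytes);  // overwrites old contents, no Clear()
    field.AddAllocated(chunk);
  }
  const std::size_t queued = field.size();
  // Serializing and sending the message would happen here.
  while (field.size() > 0) pool.Give(field.ReleaseLast());
  return queued;
}
```

Because every chunk is overwritten with assign() before it is reused, nothing in this cycle needs to clear the strings, which is the cost the profile pointed at.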


>   What I really want is a way to just give it a raw memory buffer, tell it
> how big the buffer is, and then have it just store a pointer to the buffer.
> Then there are no strings, no copying, etc.   It's currently somewhat awkward,
> because my sequence goes like this:
>
> 1) Read some data from the disk into a buffer
> 2) Put that data into a proto buf message.
> 3) Repeat this a number of times, putting each chunk of data into a new
> message
> 4) Serialize the new message, which contains a list of chunks, into an
> array.
> 5) Call socket.write() with the serialized array.
>
> But that's 3 copies.  There's my original buffer that I read from the disk
> into, protobuf's message buffer where it stores internally as a string, and
> the final buffer that I serialize into so that I can send it across the
> wire.  It would be nice if I could get rid of all this copying.


Yeah, the implementation wasn't really designed for this sort of usage.  :/


>
>
>
> On Thu, Mar 5, 2009 at 6:24 PM, Kenton Varda <[email protected]> wrote:
>
>> On Thu, Mar 5, 2009 at 4:02 PM, Zachary Turner <[email protected]> wrote:
>>
>>> I get somewhat better results with that flag.  I built protobuf with
>>> profiling enabled, and I'm not convinced the information is 100%
>>> accurate, but it seems like std::string::clear() takes up the most time.
>>> The percentages don't match what I calculate, though, so I'm not sure
>>> where the inconsistency is.
>>
>>
>> Can you write a small example program demonstrating the problem which I
>> can play with?
>>
>> What STL implementation are you using?  (I.e. what compiler?)
>>
>>
>>> Just out of curiosity, is there even any need for me to call Clear()?
>>> I'm filling out every single field every single time, and always using
>>> mutable_data()->assign() to copy the data into the message, so is it fine to
>>> just leave it "uncleared" but still stick it back into the cleared list?
>>
>>
>> Technically it might work, but if it does I can't guarantee that it
>> wouldn't break in the future.
>>
>>
>>>
>>>
>>> On Thu, Mar 5, 2009 at 5:25 PM, Kenton Varda <[email protected]> wrote:
>>>
>>>> Add this to your .proto file:
>>>>   option optimize_for = SPEED;
>>>>
>>>> Does it help?
>>>>
>>>> On Thu, Mar 5, 2009 at 3:23 PM, Zachary Turner <[email protected]> wrote:
>>>>
>>>>>
>>>>> I'll give it a try.  I haven't built the protobuf libraries with
>>>>> instrumenting support or else I'd already know, but I should be able
>>>>> to get it working.
>>>>>
>>>>> On Mar 5, 5:20 pm, Kenton Varda <[email protected]> wrote:
>>>>> > Wow, that's interesting.  I don't know why it would do that.  Can you look
>>>>> > deeper into your profiles and see what part of Clear() is taking so long?
>>>>> > For example, is it spending the time clearing STL strings?
>>>>> >
>>>>> > On Thu, Mar 5, 2009 at 3:11 PM, Zachary Turner <[email protected]> wrote:
>>>>>
>>>>> >
>>>>> >
>>>>> >
>>>>> > > I have a fairly old version of the protobuf library, so if this has
>>>>> > > been changed let me know, but I have a situation where
>>>>> > > Message::Clear() is causing my CPU to go to like 70% for an extended
>>>>> > > period of time.
>>>>> >
>>>>> > > It's also possible this is user error, so please correct me if that's
>>>>> > > the case.
>>>>> >
>>>>> > > Basically what I have is a top level message with a bunch of optional
>>>>> > > messages, which I send across the wire.
>>>>> >
>>>>> > > One of these optional messages is defined as follows:
>>>>> >
>>>>> > > message DataChunkList {
>>>>> > >    required bool             is_end_of_list = 1;
>>>>> > >    repeated DataChunk  data = 2;
>>>>> > > };
>>>>> >
>>>>> > > message DataChunk {
>>>>> > >    optional bytes    data = 1;
>>>>> > >    //Other fields here
>>>>> > > };
>>>>> >
>>>>> > > The "data" field will almost always be exactly 4k, and I will usually
>>>>> > > not want to send 1 chunk at a time, but a list of around 32 at a
>>>>> > > time.
>>>>> >
>>>>> > > So I save an instance of the top level message in the class containing
>>>>> > > my sending code, and right before I'm about to send data I do the
>>>>> > > following:
>>>>> >
>>>>> > > net::DataChunkList* pChunks =
>>>>> > >     m_CachedTopLevel.mutable_data_chunk_list();
>>>>> >
>>>>> > > //Should already be clear, but just in case
>>>>> > > pChunks->Clear();
>>>>> > > int prevCount = pChunks->mutable_data()->ClearedCount();
>>>>> >
>>>>> > > for (int i=prevCount; i < num_chunks; ++i)
>>>>> > > {
>>>>> > >    net::DataChunk* pChunk = new net::DataChunk();
>>>>> > >    pChunk->mutable_data()->reserve(4096);
>>>>> > >    pChunks->mutable_data()->AddCleared(pChunk);
>>>>> > > }
>>>>> >
>>>>> > > for (int i=0; i < num_chunks; ++i)
>>>>> > > {
>>>>> > >   net::DataChunk* pChunk =
>>>>> > >       pChunks->mutable_data()->ReleaseCleared();
>>>>> > >   pChunk->mutable_data()->assign(global_4k_buffer, 4096);
>>>>> > >   pChunks->mutable_data()->AddAllocated(pChunk);
>>>>> > > }
>>>>> >
>>>>> > > send(m_CachedTopLevel);
>>>>> >
>>>>> > > m_CachedTopLevel.Clear();
>>>>> >
>>>>> > > I ran a profiler on my code, and the very last line (the Clear())
>>>>> > > takes up almost 95% of the CPU usage for the function, and the
>>>>> > > function takes up about 30% of the CPU usage of the entire app.
>>>>> > > So obviously this is a big problem.
>>>>> >
>>>>> > > The comment on the code says that Clear() "does not free any memory",
>>>>> > > however.  So why could it be using so much CPU?  Am I misunderstanding
>>>>> > > the purpose / usage of these methods?  What I'm trying to do is just
>>>>> > > re-use a pool of 4k buffers for all of these sends.
>>>>> >
>>>>>
>>>>
>>>
>>
>
