I'll try to come up with a sample tomorrow, but the surrounding code is pretty complex, so I'm not 100% sure it will still exhibit the same pattern if I do the same thing in a stripped-down application.

As an alternative to not clearing the items before I put them back in the list, would there be any problem with storing my own list of buffers internally, calling AddAllocated() a bunch of times while building the message stream, and then calling ReleaseLast() at the end until all the messages are clear?

What I really want is a way to just give it a raw memory buffer, tell it how big the buffer is, and have it store only a pointer to the buffer. Then there are no strings, no copying, etc. It's currently somewhat awkward, because my sequence goes like this:

1) Read some data from the disk into a buffer.
2) Put that data into a protobuf message.
3) Repeat this a number of times, putting each chunk of data into a new message.
4) Serialize the new message, which contains a list of chunks, into an array.
5) Call socket.write() with the serialized array.

But that means the data ends up in three buffers: my original buffer that I read from the disk into, protobuf's message buffer, where it is stored internally as a string, and the final buffer that I serialize into so that I can send it across the wire. It would be nice if I could get rid of all this copying.

On Thu, Mar 5, 2009 at 6:24 PM, Kenton Varda <[email protected]> wrote:
> On Thu, Mar 5, 2009 at 4:02 PM, Zachary Turner <[email protected]> wrote:
>
>> I get somewhat better results with that flag. I built protobuf with
>> profiling enabled and I'm a little suspicious that the information is 100%
>> accurate, but it seems like std::string::clear() takes up the most time.
>> But the percentages don't match up to what I calculate, so I'm not sure
>> where the inconsistency is.
>
> Can you write a small example program demonstrating the problem which I can
> play with?
>
> What STL implementation are you using? (I.e. what compiler?)
>
>> Just out of curiosity, is there even any need for me to call Clear()?
>> I'm filling out every single field every single time, and always using
>> mutable_data()->assign() to copy the data into the message, so is it fine to
>> just leave it "uncleared" but still stick it back into the cleared list?
>
> Technically it might work, but if it does I can't guarantee that it
> wouldn't break in the future.
>
>> On Thu, Mar 5, 2009 at 5:25 PM, Kenton Varda <[email protected]> wrote:
>>
>>> Add this to your .proto file:
>>>   option optimize_for = SPEED;
>>>
>>> Does it help?
>>>
>>> On Thu, Mar 5, 2009 at 3:23 PM, Zachary Turner <[email protected]> wrote:
>>>
>>>> I'll give it a try. I haven't built the protobuf libraries with
>>>> instrumenting support or else I'd already know, but I should be able
>>>> to get it working.
>>>>
>>>> On Mar 5, 5:20 pm, Kenton Varda <[email protected]> wrote:
>>>> > Wow, that's interesting. I don't know why it would do that. Can you look
>>>> > deeper into your profiles and see what part of Clear() is taking so long?
>>>> > For example, is it spending the time clearing STL strings?
>>>> >
>>>> > On Thu, Mar 5, 2009 at 3:11 PM, Zachary Turner <[email protected]> wrote:
>>>> >
>>>> > > I have a fairly old version of the protobuf library, so if this has
>>>> > > been changed let me know, but I have a situation where Message::Clear()
>>>> > > is causing my cpu to go to like 70% for an extended period of time.
>>>> > >
>>>> > > It's also possible this is user error, so please correct me if that's
>>>> > > the case.
>>>> > >
>>>> > > Basically what I have is a top level message with a bunch of optional
>>>> > > messages, which I send across the wire.
>>>> > >
>>>> > > One of these optional messages is defined as follows:
>>>> > >
>>>> > > message DataChunkList {
>>>> > >    required bool is_end_of_list = 1;
>>>> > >    repeated DataChunk data = 2;
>>>> > > };
>>>> > >
>>>> > > message DataChunk {
>>>> > >    optional bytes data = 1;
>>>> > >    //Other fields here
>>>> > > };
>>>> > >
>>>> > > The "data" field will almost always be exactly 4k, and I will usually
>>>> > > not want to send 1 chunk at a time, but a list of around 32 at a
>>>> > > time.
>>>> > >
>>>> > > So I save an instance of the top level message in the class containing
>>>> > > my sending code, and right before I'm about to send data I do the
>>>> > > following:
>>>> > >
>>>> > > net::DataChunkList* pChunks =
>>>> > >     m_CachedTopLevel.mutable_data_chunk_list();
>>>> > >
>>>> > > //Should already be clear, but just in case
>>>> > > pChunks->Clear();
>>>> > > int prevCount = pChunks->mutable_data()->ClearedCount();
>>>> > >
>>>> > > for (int i=prevCount; i < num_chunks; ++i)
>>>> > > {
>>>> > >    net::DataChunk* pChunk = new net::DataChunk();
>>>> > >    pChunk->mutable_data()->reserve(4096);
>>>> > >    pChunks->mutable_data()->AddCleared(pChunk);
>>>> > > }
>>>> > >
>>>> > > for (int i=0; i < num_chunks; ++i)
>>>> > > {
>>>> > >    net::DataChunk* pChunk = pChunks->mutable_data()->ReleaseCleared();
>>>> > >    pChunk->mutable_data()->assign(global_4k_buffer, 4096);
>>>> > >    pChunks->mutable_data()->AddAllocated(pChunk);
>>>> > > }
>>>> > >
>>>> > > send(m_CachedTopLevel);
>>>> > >
>>>> > > m_CachedTopLevel.Clear();
>>>> > >
>>>> > > I ran a profiler on my code, and the very last line (the Clear())
>>>> > > takes up almost 95% of the CPU usage for the function, and the
>>>> > > function takes up about 30% of the CPU usage of the entire app.
>>>> > > So obviously this is a big problem.
>>>> > >
>>>> > > The comment on the code says that clear "does not free any memory"
>>>> > > however. So why could it be using so much CPU?
>>>> > > Am I misunderstanding the purpose / usage of these methods? What I'm
>>>> > > trying to do is just re-use a pool of 4k buffers for all of these sends.
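
To make the "just store a pointer" idea from the top of this message a bit more concrete, here is a rough sketch of what I have in mind. It is not protobuf API. Span, AppendVarint, VarintSize, FrameChunkList and Flatten are all names I made up for illustration. It hand-writes the wire-format tag and length bytes for a DataChunkList and leaves each 4k payload where it already is, so the result is a list of (pointer, length) pieces that could be handed to a gather write such as writev(). It assumes only the "data" field of each DataChunk is set and ignores the enclosing top-level message, so treat it as an illustration of the idea rather than a drop-in replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One (pointer, length) piece of the outgoing stream. A list of these maps
// onto the iovec array that a gather write such as POSIX writev() takes.
struct Span {
    const void* ptr;
    std::size_t len;
};

// Append `value` to `out` as a protobuf base-128 varint.
inline void AppendVarint(std::uint64_t value, std::string* out) {
    while (value >= 0x80) {
        out->push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out->push_back(static_cast<char>(value));
}

// Number of bytes AppendVarint() emits for `value`.
inline std::size_t VarintSize(std::uint64_t value) {
    std::size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++n;
    }
    return n;
}

// Hypothetical helper: frames a DataChunkList without copying any payload.
// `header_bytes` receives every tag/length byte; `pieces` receives spans that
// alternate between slices of `header_bytes` and the caller's payloads.
// Both outputs and all payloads must stay alive and unmodified until the
// write has completed.
inline void FrameChunkList(const std::vector<Span>& payloads,
                           bool is_end_of_list,
                           std::vector<char>* header_bytes,
                           std::vector<Span>* pieces) {
    assert(header_bytes != nullptr && pieces != nullptr);
    std::string h;
    std::vector<std::size_t> ends;  // end offset of each header slice in h

    h.push_back(0x08);  // is_end_of_list = 1, wire type varint
    h.push_back(is_end_of_list ? 1 : 0);
    for (const Span& p : payloads) {
        h.push_back(0x12);  // DataChunkList.data = 2, length-delimited
        AppendVarint(1 + VarintSize(p.len) + p.len, &h);  // size of DataChunk
        h.push_back(0x0A);  // DataChunk.data = 1, length-delimited
        AppendVarint(p.len, &h);
        ends.push_back(h.size());
    }

    // Take pointers only after the header storage has stopped growing.
    header_bytes->assign(h.begin(), h.end());
    pieces->clear();
    std::size_t begin = 0;
    for (std::size_t i = 0; i < payloads.size(); ++i) {
        pieces->push_back(Span{header_bytes->data() + begin, ends[i] - begin});
        pieces->push_back(payloads[i]);
        begin = ends[i];
    }
    if (begin < header_bytes->size()) {  // only happens with zero payloads
        pieces->push_back(
            Span{header_bytes->data() + begin, header_bytes->size() - begin});
    }
}

// Debug-only helper: concatenates the pieces so the bytes can be inspected.
// A real sender would pass the pieces to a gather write instead of copying.
inline std::string Flatten(const std::vector<Span>& pieces) {
    std::string out;
    for (const Span& s : pieces) {
        out.append(static_cast<const char*>(s.ptr), s.len);
    }
    return out;
}
```

With 32 chunks of 4k each, the only bytes built per send would be the small headers. The payload spans point straight at the buffers I read from disk into, so steps 2 and 4 of my sequence would no longer copy the data.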
