Re: [protobuf] Re: protobuffer suport composite object in stack not in heap?

2011-01-04 Thread Kenton Varda
On Tue, Jan 4, 2011 at 5:25 PM, Igor Gatis wrote:

> A while ago, a colleague had a "memory leak" reusing a PB message which
> contained a repeated field. If I'm not mistaken the problem was that
> pb_message::Clear() calls vector::clear() and string::clear(),
> which do not really release the memory allocated. I can't really tell for
> sure actually.
>
> @Kenton, does that make any sense? If yes, is there a way to avoid it?
>

As Evan says, this is by design.  The memory is not "leaked" -- it will be
reused when the message object is reused, and deleted when the message
object is deleted.

The actual problem that your colleague probably observed is that if you
happen to parse one message which is much larger than usual, the object will
allocate a bunch of memory for that one large message, and then will keep it
around even after parsing smaller messages.  So your memory usage is
determined by the largest message you parse, rather than by the average.
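
The high-water-mark effect described above can be reproduced without protobuf at all. The following is only a sketch: FakeMessage is a hypothetical stand-in built on std::vector (real generated classes use their own containers), and the helper name is made up for illustration. It relies on the fact that std::vector::clear() drops the elements but keeps the capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generated message with one repeated field.
// Real protobuf classes use their own containers, not std::vector.
struct FakeMessage {
    std::vector<int> values;

    // Like Clear() in the thread: drops the contents, keeps the allocation.
    void Clear() { values.clear(); }

    // Pretends to parse a payload containing n elements.
    void Parse(std::size_t n) {
        Clear();
        for (std::size_t i = 0; i < n; ++i) {
            values.push_back(static_cast<int>(i));
        }
    }

    // Bytes currently reserved for the repeated field.
    std::size_t CapacityBytes() const { return values.capacity() * sizeof(int); }
};

// Parses one large payload followed by one small payload into the same
// object and reports how many bytes are still reserved afterwards.
std::size_t BytesHeldAfterLargeThenSmall(std::size_t large, std::size_t small) {
    assert(small <= large);
    FakeMessage msg;
    msg.Parse(large);
    msg.Parse(small);
    return msg.CapacityBytes();
}
```

Calling BytesHeldAfterLargeThenSmall(100000, 10) returns at least 100000 * sizeof(int), even though the object only holds 10 elements at that point: the footprint follows the largest payload seen so far, not the current one.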

You can also run into problems if you have a message type whose instances
vary widely in "shape".  E.g. if type Foo has optional fields of type Bar
and Baz, and you parse one instance of Foo that contains a Bar, then reuse
the Foo to parse a message containing a Baz, then the Foo has allocated both
Bar and Baz and will hold on to them.  Thus the Foo is actually using more
memory than was needed for either of the two messages it parsed.
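
A rough model of the "shape" problem, assuming optional sub-messages are allocated on first use and kept across Clear(). Foo, Bar and Baz here are hand-written, hypothetical stand-ins, not generated code, and their SpaceUsed() is a simplified estimate.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the sub-message types Bar and Baz.
struct Bar {
    std::vector<char> data;
    void Clear() { data.clear(); }
    std::size_t SpaceUsed() const { return sizeof(*this) + data.capacity(); }
};

struct Baz {
    std::vector<char> data;
    void Clear() { data.clear(); }
    std::size_t SpaceUsed() const { return sizeof(*this) + data.capacity(); }
};

// Hypothetical stand-in for Foo with two optional sub-messages.
class Foo {
 public:
    // Sub-objects are allocated lazily on first access.
    Bar* mutable_bar() {
        if (!bar_) bar_.reset(new Bar);
        has_bar_ = true;
        return bar_.get();
    }
    Baz* mutable_baz() {
        if (!baz_) baz_.reset(new Baz);
        has_baz_ = true;
        return baz_.get();
    }
    bool has_bar() const { return has_bar_; }
    bool has_baz() const { return has_baz_; }

    // Marks both fields unset but keeps the sub-objects and their buffers.
    void Clear() {
        if (bar_) bar_->Clear();
        if (baz_) baz_->Clear();
        has_bar_ = false;
        has_baz_ = false;
    }

    std::size_t SpaceUsed() const {
        return sizeof(*this) + (bar_ ? bar_->SpaceUsed() : 0) +
               (baz_ ? baz_->SpaceUsed() : 0);
    }

 private:
    std::unique_ptr<Bar> bar_;
    std::unique_ptr<Baz> baz_;
    bool has_bar_ = false;
    bool has_baz_ = false;
};

// Fills a "Bar-shaped" message, then reuses the same Foo for a "Baz-shaped"
// one, and reports the total space held afterwards.
std::size_t SpaceAfterBarThenBaz(std::size_t bar_bytes, std::size_t baz_bytes) {
    Foo foo;
    foo.mutable_bar()->data.resize(bar_bytes);
    foo.Clear();
    foo.mutable_baz()->data.resize(baz_bytes);
    assert(foo.has_baz() && !foo.has_bar());
    return foo.SpaceUsed();
}
```

After the second step only baz is set, yet the returned size still includes the buffer that was allocated for bar.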

In practice these problems can manifest as memory usage that monotonically
increases over the life of the process, although the rate of increase slows
over time.

A way to avoid this problem is to call SpaceUsed() to find out how much
memory the object is using at any particular time.  Once it crosses some
threshold, delete the object and create a new one.  Another approach is to
reuse each object at most N times -- this saves most of the allocation
costs while preventing memory usage from growing without bound.
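
Both strategies can be combined in a small wrapper. This is a minimal sketch, not part of the protobuf library: RecyclingHolder and FakeMessage are hypothetical names, and the only assumption about the message type is that it offers Clear() and a SpaceUsed() that returns the bytes currently held, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in; a real generated message provides Clear() and
// SpaceUsed() itself.
struct FakeMessage {
    std::vector<char> payload;
    void Clear() { payload.clear(); }
    void Parse(std::size_t n) {
        Clear();
        payload.resize(n);
    }
    int SpaceUsed() const {
        return static_cast<int>(sizeof(*this) + payload.capacity());
    }
};

// Hands out one reusable message, replacing it with a fresh object once it
// holds more than max_bytes or has been handed out max_uses times.
template <typename Message>
class RecyclingHolder {
 public:
    RecyclingHolder(int max_bytes, int max_uses)
        : max_bytes_(max_bytes), max_uses_(max_uses), msg_(new Message) {
        assert(max_uses > 0);
    }

    // Returns an empty message that is ready to be parsed into.
    Message* Acquire() {
        if (msg_->SpaceUsed() > max_bytes_ || uses_ >= max_uses_) {
            msg_.reset(new Message);  // frees everything the old object held
            uses_ = 0;
            ++replacements_;
        } else {
            msg_->Clear();  // keeps the memory for reuse
        }
        ++uses_;
        return msg_.get();
    }

    // Number of times the message object has been thrown away so far.
    int replacements() const { return replacements_; }

 private:
    int max_bytes_;
    int max_uses_;
    std::unique_ptr<Message> msg_;
    int uses_ = 0;
    int replacements_ = 0;
};
```

A caller would use holder.Acquire() in place of msg.Clear() at the top of its parse loop. The two limits are tuning knobs: a low byte threshold frees memory sooner after an outlier, at the price of more allocations.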

Of course, all of this applies *only* to C++.  Java protobuf objects are not
reusable (since they are immutable), and in Python memory is discarded on
Clear().

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: protobuffer suport composite object in stack not in heap?

2011-01-04 Thread Evan Jones

On Jan 4, 2011, at 20:25, Igor Gatis wrote:
> A while ago, a colleague had a "memory leak" reusing a PB message
> which contained a repeated field. If I'm not mistaken the problem
> was that pb_message::Clear() calls vector::clear() and
> string::clear(), which do not really release the memory allocated.
> I can't really tell for sure actually.
>
> @Kenton, does that make any sense? If yes, is there a way to avoid it?


Yes, I have run into this same issue, when I occasionally read in a  
"huge" message. I think this is "by design." As Kenton noted: if you  
re-use the message, it never has to free / reallocate memory. See the  
"Optimization Tips" in this document:


http://code.google.com/apis/protocolbuffers/docs/cpptutorial.html

There is a ::SpaceUsed() method that can be helpful.

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: protobuffer suport composite object in stack not in heap?

2011-01-04 Thread Igor Gatis
On Tue, Jan 4, 2011 at 9:38 PM, Kenton Varda wrote:

> I think this would be too complicated to integrate into the official C++
> implementation.  You could, however, write an alternative protobuf
> implementation that provides this.
>
> Note that with the official implementation, you can avoid malloc costs by
> reusing message objects.  A message object never frees any memory until the
> top-level object is destroyed, so if you reuse the object to parse multiple
> messages, you can avoid a lot of allocation costs after the first message.
>

A while ago, a colleague had a "memory leak" reusing a PB message which
contained a repeated field. If I'm not mistaken the problem was that
pb_message::Clear() calls vector::clear() and string::clear(),
which do not really release the memory allocated. I can't really tell for
sure actually.

@Kenton, does that make any sense? If yes, is there a way to avoid it?


>
> You might also experiment with tcmalloc (part of the Google perftools
> package) to see if it is faster than your system's memory allocator.
>
> On Wed, Dec 29, 2010 at 10:21 PM, aristohuang wrote:
>
>> e.g.
>> message A {
>>   string a;  // when set_a() is called, the memory for a is on the heap (new/malloc)
>>   int b;     // when set_b() is called, the memory for b is on the stack
>> };
>>
>> If a message defines many composite sub-class objects, a lot of time is
>> spent in new/malloc. Do you think so? Or do you have a good idea for this?
>>
>




[protobuf] Re: protobuffer suport composite object in stack not in heap?

2011-01-04 Thread Kenton Varda
I think this would be too complicated to integrate into the official C++
implementation.  You could, however, write an alternative protobuf
implementation that provides this.

Note that with the official implementation, you can avoid malloc costs by
reusing message objects.  A message object never frees any memory until the
top-level object is destroyed, so if you reuse the object to parse multiple
messages, you can avoid a lot of allocation costs after the first message.
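
To get a feel for the allocation savings from reuse, here is a small experiment that does not involve protobuf. It is a sketch under stated assumptions: a std::vector with a hypothetical counting allocator stands in for a message's internal buffers, and CountAllocations is a made-up helper that compares a fresh buffer per message against one buffer that is cleared and refilled.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Global counter of heap allocations made through CountingAllocator.
static int g_allocations = 0;

// Minimal allocator that counts every allocation it performs.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}

using Buffer = std::vector<char, CountingAllocator<char>>;

// Handles `messages` payloads of `bytes` bytes each and returns how many
// allocations were needed. With reuse == true one buffer is cleared and
// refilled; otherwise a new buffer is created for every payload.
int CountAllocations(bool reuse, int messages, std::size_t bytes) {
    assert(messages > 0);
    g_allocations = 0;
    Buffer reused;
    for (int i = 0; i < messages; ++i) {
        if (reuse) {
            reused.clear();             // keeps the capacity
            reused.assign(bytes, 'x');  // fits into the existing storage
        } else {
            Buffer fresh(bytes, 'x');   // allocates on every iteration
            (void)fresh;
        }
    }
    return g_allocations;
}
```

With common standard library implementations, CountAllocations(true, 100, 256) typically reports a single allocation and CountAllocations(false, 100, 256) one per message, which is the saving that reusing a message object aims for.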

You might also experiment with tcmalloc (part of the Google perftools
package) to see if it is faster than your system's memory allocator.

On Wed, Dec 29, 2010 at 10:21 PM, aristohuang wrote:

> e.g.
> message A {
>   string a;  // when set_a() is called, the memory for a is on the heap (new/malloc)
>   int b;     // when set_b() is called, the memory for b is on the stack
> };
>
> If a message defines many composite sub-class objects, a lot of time is
> spent in new/malloc. Do you think so? Or do you have a good idea for this?
>
