The behavior you describe is working as intended when Python Protobuf is
backed by upb.

The reason is that upb manages memory with an arena (see
https://en.wikipedia.org/wiki/Region-based_memory_management). Under this
model there is no bookkeeping for individual allocations. Instead there is
one pool that you can only append to, and memory is freed only when the
entire pool is released, because nothing tracks which fine-grained
allocations are still live. This gives lower allocation and deallocation
overhead and less memory spent on bookkeeping, since everything lives in a
single block of memory that is much cheaper to drop.
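
To make the model concrete, here is a toy sketch in Python. It is not upb's
actual implementation; ToyArena and its methods are made-up names used only
to illustrate an append-only pool that is released as a whole.

```python
class ToyArena:
    """Illustrative append-only pool (hypothetical, not upb's real code)."""

    def __init__(self):
        self._pool = bytearray()

    def alloc(self, data):
        # Append the bytes and hand back an (offset, size) reference.
        # There is no per-allocation free(): old bytes simply stay put.
        offset = len(self._pool)
        self._pool += data
        return offset, len(data)

    def read(self, ref):
        offset, size = ref
        return bytes(self._pool[offset:offset + size])

    @property
    def used(self):
        return len(self._pool)


arena = ToyArena()
live = None
for _ in range(1000):
    # Each iteration "overwrites" the value, but the previous bytes are
    # still held by the pool because nothing tracks that they are dead.
    live = arena.alloc(b"x" * 100)

reachable = live[1]   # 100 bytes are actually reachable
held = arena.used     # the pool holds every allocation ever made
```

Only dropping the whole arena object gives the memory back, which is the
same shape as reusing one message for many parses.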

In the upb model, each new top-level message holds this pool, so anything
added to the message has to stay live until that message is released.

This is a known tradeoff: for the expected use cases of Protobuf, where you
have a lot of request-scoped messages (and some number of permanent
immutable constants), it will be faster and use less memory. The
unavoidable downside is that if you do have a long-lived mutable object
that keeps making allocating modifications, the memory won't be released
until you finally release that one object.

In a pinch, if you really need a long-lived, constantly allocating
message, you can use CopyFrom() into a fresh parent to effectively reset
it, so that the arena holds only the data that is actually reachable at
that moment, at the cost of one deep copy of that state.
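
A minimal sketch of that workaround, assuming a generated message class with
the usual ParseFromString() and CopyFrom() methods. The helper names compact
and parse_many and the compact_every parameter are hypothetical, chosen for
illustration; they are not part of the protobuf API.

```python
def compact(msg):
    """Return a deep copy of `msg` owned by a fresh top-level message."""
    fresh = type(msg)()   # new top-level message (assumed: its own arena)
    fresh.CopyFrom(msg)   # copies only the data reachable right now
    return fresh


def parse_many(msg, payloads, compact_every=10_000):
    """Reuse one message for many parses, compacting periodically."""
    for i, payload in enumerate(payloads, start=1):
        msg.ParseFromString(payload)
        if i % compact_every == 0:
            # Rebind to the compacted copy. The old message can only be
            # released once no other reference to it remains.
            msg = compact(msg)
    return msg
```

The caller should keep using the returned message and drop its reference to
the original, for example `obj = parse_many(obj, payloads)`. If you don't
need the object to be long-lived, simply creating a new message per parse,
as in your second snippet, avoids the problem entirely.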

On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:

> Hello! I think I've stumbled across an issue similar to this one:
> https://github.com/protocolbuffers/protobuf/issues/10088
>
> But I'm not entirely sure, so seeking help.
> I've forked and hacked together a little demo repo:
> https://github.com/viktorvorobev/proto_leak
>
> But basically the situation is as follows.
> If you create a proto object, then use `ParseFromString` on it multiple
> times, then the object size grows uncontrollably until this object is
> deleted:
>
> obj = schema_pb2.value_test_topic()
> for _ in range(10_000_000):
>     # here the object will grow
>     obj.ParseFromString(b"...")
> del obj
>
> But if you recreate an object every time, then everything seems to be
> fine, so this code doesn't leak:
>
> for _ in range(10_000_000):
>     obj = schema_pb2.value_test_topic()
>     obj.ParseFromString(b"...")
>
> I suspect the same behaviour for Unpack, but didn't test it myself.
>
> Is this known or intended?
>
> Thanks a lot!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com
> <https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

