Would you mind opening a GitHub issue asking whether we could make cases like ParseFromString() avoid this behavior?
This may already have been ruled out for some reason, but it seems to me that operations which effectively clear out the entire message (ParseFromString(), CopyFrom(), etc.) should be able to reset the arena to a fresh one automatically. That would reduce the surface area from which the unbounded-growth case you identified is reachable.

On Mon, Nov 3, 2025 at 8:29 AM Viktor Vorobev <[email protected]> wrote:

> Great, now I see, thank you very much!
> It would be good to have it in the docs/examples as well, as it kind
> of feels counterintuitive that ParseFromString actually allocates more and
> more memory, while from its name it feels like it shouldn't.
>
> Thanks again!
>
> On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:
>
>> The behavior listed is working as intended when Python Protobuf is backed
>> by upb.
>>
>> The reason is that upb's memory management uses an
>> arena model (as described at
>> https://en.wikipedia.org/wiki/Region-based_memory_management). Under
>> this model, you do no bookkeeping on every individual allocation; instead
>> there's one pool that you can only append to, and the only time that memory
>> is freed is when that entire pool is released (because there's no
>> bookkeeping about what fine-grained memory is live or not). This has both
>> less allocation and deallocation overhead, as well as less memory usage
>> from bookkeeping, by having everything be in a single blob of memory
>> which is much cheaper to drop.
>>
>> In the upb model, each new top-level message holds this pool, and so
>> anything added has to stay live until that message is released.
>>
>> This is a known tradeoff: for the expected use cases of Protobuf, where you
>> have a lot of request-scoped messages (and some number of permanent
>> immutable constants), it will be faster and use less memory. The
>> unavoidable downside is that if you do have a long-lived mutable object
>> that keeps making allocating modifications, the memory won't be released
>> until you finally release that one object.
>>
>> In a pinch, if you really need some long-lived, constantly allocating
>> message, you can use CopyFrom() into a 'fresh' parent to basically reset it,
>> so that what you have in the arena is exactly the data that is actually
>> reachable at that moment, at the cost of doing one deep copy of whatever
>> that state is.
>>
>> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>>
>>> Hello! I think I've stumbled across an issue similar to this one:
>>> https://github.com/protocolbuffers/protobuf/issues/10088
>>>
>>> But I'm not entirely sure, so seeking help.
>>> I've forked and hacked together a little demo repo:
>>> https://github.com/viktorvorobev/proto_leak
>>>
>>> But basically the situation is as follows.
>>> If you create a proto object, then use `ParseFromString` on it multiple
>>> times, then the object size grows uncontrollably until this object is
>>> deleted:
>>>
>>>     obj = schema_pb2.value_test_topic()
>>>     for _ in range(10_000_000):
>>>         # here the object will grow
>>>         obj.ParseFromString(b"...")
>>>     del obj
>>>
>>> But if you recreate an object every time, then everything seems to be
>>> fine, so this code doesn't leak:
>>>
>>>     for _ in range(10_000_000):
>>>         obj = schema_pb2.value_test_topic()
>>>         obj.ParseFromString(b"...")
>>>
>>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>>
>>> Is this known or intended?
>>>
>>> Thanks a lot!

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/protobuf/CAKRmVH9U%2BYw7HSodSVb_nivisdCgySssmTTX_VW%2BqP8J-YpE9Q%40mail.gmail.com.
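
For anyone following the thread, here is a minimal sketch of the arena behavior discussed in it, in plain Python with no protobuf dependency. It is a toy model and not upb's implementation: `ToyArena`, `ToyMessage`, `parse_from`, and `copy_from` are hypothetical names made up for illustration. It only aims to show why reusing one message grows its pool on every parse, and why copying into a fresh message leaves just the reachable data.

```python
class ToyArena:
    """Append-only pool: individual allocations are never freed."""

    def __init__(self):
        self._blocks = []

    def alloc(self, data: bytes) -> int:
        # Append the data and return a handle (its index in the pool).
        self._blocks.append(bytes(data))
        return len(self._blocks) - 1

    def get(self, handle: int) -> bytes:
        return self._blocks[handle]

    @property
    def used(self) -> int:
        # Total bytes held by the pool, reachable or not.
        return sum(len(block) for block in self._blocks)


class ToyMessage:
    """Hypothetical stand-in for a top-level message that owns one arena."""

    def __init__(self):
        self.arena = ToyArena()
        self._handle = None

    def parse_from(self, data: bytes) -> None:
        # The previous payload becomes unreachable but stays in the arena.
        self._handle = self.arena.alloc(data)

    def payload(self) -> bytes:
        return b"" if self._handle is None else self.arena.get(self._handle)

    def copy_from(self, other: "ToyMessage") -> None:
        # Deep-copy only what is currently reachable in `other`.
        self._handle = self.arena.alloc(other.payload())


# Reusing one message: every parse appends to the same pool.
reused = ToyMessage()
for _ in range(1000):
    reused.parse_from(b"x" * 10)
print(reused.arena.used)  # 10000

# "Reset" by copying into a fresh message: only live data is carried over.
fresh = ToyMessage()
fresh.copy_from(reused)
print(fresh.arena.used)  # 10
```

In real code the analogous step would be CopyFrom() into a newly constructed message, as suggested in the thread, or simply constructing a new message for each parse.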
