Here is the link to the created issue for the newer generations: https://github.com/protocolbuffers/protobuf/issues/24257
(I just hate to leave such discussions without a trace)

On Monday, November 3, 2025 at 14:44:31 UTC+1, Viktor Vorobev wrote:
> Sure, will do, and I will link this discussion there, just in case.
>
> On Mon, Nov 3, 2025 at 14:34, Em Rauch <[email protected]> wrote:
>
>> Do you mind opening a GitHub issue about whether we could
>> make cases like ParseFromString() not hit this case?
>>
>> It might be that this was already ruled out for some reason, but it does
>> seem to me that operations which effectively clear out the entire message
>> (including ParseFromString(), CopyFrom(), etc.) should be able to reset the
>> arena to a fresh one automatically, and that would reduce the surface area
>> from which the unbounded-growth case you identified is reachable.
>>
>> On Mon, Nov 3, 2025 at 8:29 AM Viktor Vorobev <[email protected]> wrote:
>>
>>> Great, now I see, thank you very much!
>>> It would be good to have it in the docs/examples as well, as it kind
>>> of feels counterintuitive that ParseFromString actually allocates more and
>>> more memory, while from the name it feels like it shouldn't.
>>>
>>> Thanks again!
>>>
>>> On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:
>>>
>>>> The behavior listed is working as intended when Python Protobuf is
>>>> backed by upb.
>>>>
>>>> The reason is that upb's memory model is implemented using an
>>>> arena memory model (as described at
>>>> https://en.wikipedia.org/wiki/Region-based_memory_management). Under
>>>> this model, you do no bookkeeping on every individual allocation; instead
>>>> there's one pool that you can only append to, and the only time that
>>>> memory is freed is when that entire pool is released (because there's no
>>>> bookkeeping about which fine-grained memory is live or not). This has both
>>>> less allocation and deallocation overhead and less memory usage
>>>> from bookkeeping, because everything is in a single blob of memory
>>>> that is much cheaper to drop.
>>>>
>>>> In the upb model, each new top-level message holds this pool, so
>>>> anything added has to stay live until that message is released.
>>>>
>>>> This is a known tradeoff: for the expected use cases of Protobuf, where
>>>> you have a lot of request-scoped messages (and some number of permanent
>>>> immutable constants), it will be faster and use less memory, with the
>>>> unavoidable downside that if you do have a long-lived mutable object
>>>> that is doing allocating modifications, the memory won't be released until
>>>> you finally release that one object.
>>>>
>>>> In a pinch, if you really need some long-lived, constantly allocating
>>>> message, you can use CopyFrom() into a 'fresh' parent to basically reset
>>>> it, so that what you have in the arena is exactly the data that is actually
>>>> reachable at that moment, at the cost of doing one deep copy of whatever
>>>> that state is.
>>>>
>>>> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>>>>
>>>>> Hello! I think I've stumbled across an issue similar to this one:
>>>>> https://github.com/protocolbuffers/protobuf/issues/10088
>>>>>
>>>>> But I'm not entirely sure, so I'm seeking help.
>>>>> I've forked and hacked together a little demo repo:
>>>>> https://github.com/viktorvorobev/proto_leak
>>>>>
>>>>> But basically the situation is as follows.
>>>>> If you create a proto object, then use `ParseFromString` on it
>>>>> multiple times, then the object size grows uncontrollably until this
>>>>> object is deleted:
>>>>>
>>>>>     obj = schema_pb2.value_test_topic()
>>>>>     for _ in range(10_000_000):
>>>>>         # here the object will grow
>>>>>         obj.ParseFromString(b"...")
>>>>>     del obj
>>>>>
>>>>> But if you recreate an object every time, then everything seems to be
>>>>> fine, so this code doesn't leak:
>>>>>
>>>>>     for _ in range(10_000_000):
>>>>>         obj = schema_pb2.value_test_topic()
>>>>>         obj.ParseFromString(b"...")
>>>>>
>>>>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>>>>
>>>>> Is this known or intended?
>>>>>
>>>>> Thanks a lot!

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/protobuf/b716b0a0-af12-40c0-8a21-27618ace8d3en%40googlegroups.com.
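
For anyone landing here later, the arena explanation quoted in this thread can be illustrated with a toy model in pure Python. This is only a sketch under stated assumptions: `ToyArena`, `ToyMessage`, `parse_from_bytes`, and `copy_from` are hypothetical names made up for illustration, and none of this is upb's actual implementation or the protobuf Python API. It models just the two behaviours described above: re-parsing into the same long-lived object keeps appending to its pool, and copying into a fresh parent leaves only the reachable data.

```python
class ToyArena:
    """Append-only pool (hypothetical): nothing is freed individually."""

    def __init__(self):
        self._blocks = []

    def alloc(self, data: bytes) -> int:
        # Append the data and return a handle; old blocks are never removed.
        self._blocks.append(bytes(data))
        return len(self._blocks) - 1

    def get(self, handle: int) -> bytes:
        return self._blocks[handle]

    @property
    def used(self) -> int:
        # Total bytes held by the pool, reachable or not.
        return sum(len(block) for block in self._blocks)


class ToyMessage:
    """Stand-in for a top-level message that owns one arena for its lifetime."""

    def __init__(self):
        self._arena = ToyArena()
        self._payload = None  # handle to the currently reachable data

    def parse_from_bytes(self, data: bytes) -> None:
        # The new payload is appended; the previous one becomes unreachable
        # but still occupies space in the arena.
        self._payload = self._arena.alloc(data)

    def copy_from(self, other: "ToyMessage") -> None:
        # Deep-copies only what is reachable in `other` right now.
        self.parse_from_bytes(other.payload)

    @property
    def payload(self) -> bytes:
        return b"" if self._payload is None else self._arena.get(self._payload)

    @property
    def arena_bytes(self) -> int:
        return self._arena.used


# Reusing one object: the pool grows with every parse.
reused = ToyMessage()
for _ in range(1000):
    reused.parse_from_bytes(b"x" * 16)

# Copying into a fresh parent: only the live 16 bytes are carried over.
fresh = ToyMessage()
fresh.copy_from(reused)

print(reused.arena_bytes, fresh.arena_bytes)
```

In this toy model the reused object holds every payload it ever parsed, while the fresh copy holds only the current one. Dropping `reused` afterwards is what would release the old pool, which mirrors the "CopyFrom() into a 'fresh' parent" workaround suggested in the thread, at the cost of one deep copy.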
