Great, now I see, thank you very much! It would be good to have this in the docs/examples as well, since it feels counterintuitive that ParseFromString keeps allocating more and more memory when, judging by its name, it shouldn't.
Thanks again!

On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:

> The behavior listed is working as intended when Python Protobuf is backed
> by upb.
>
> The reason is that upb's memory model is implemented using an arena
> memory model (as described at
> https://en.wikipedia.org/wiki/Region-based_memory_management). Under this
> model, you do no bookkeeping on individual allocations; instead
> there's one pool that you can only append to, and the only time that memory
> is freed is when that entire pool is released (because there's no
> bookkeeping about which fine-grained memory is live or not). This has both
> less allocation and deallocation overhead and less memory usage
> from bookkeeping, because everything lives in a single blob of memory
> which is much cheaper to drop.
>
> In the upb model, each new top-level message holds this pool, and so
> anything added has to stay live until that message is released.
>
> This is a known tradeoff: for the expected use cases of Protobuf, where you
> have a lot of request-scoped messages (and some number of permanent
> immutable constants), it will be faster and use less memory, with the
> unavoidable downside that if you do have a long-lived mutable object
> that is doing allocating modifications, the memory won't be released until
> you finally release that one object.
>
> In a pinch, if you really need some long-lived, constantly allocating
> message, you can use CopyFrom() into a 'fresh' parent to basically reset it,
> so that what you have in the arena is exactly the data that is actually
> reachable at that moment, at the cost of doing one deep copy of whatever
> that state is.
>
> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>
>> Hello! I think I've stumbled across an issue similar to this one:
>> https://github.com/protocolbuffers/protobuf/issues/10088
>>
>> But I'm not entirely sure, so seeking help.
>> I've forked and hacked together a little demo repo:
>> https://github.com/viktorvorobev/proto_leak
>>
>> But basically the situation is as follows.
>> If you create a proto object, then use `ParseFromString` on it multiple
>> times, then the object size grows uncontrollably until this object is
>> deleted:
>>
>>     obj = schema_pb2.value_test_topic()
>>     for _ in range(10_000_000):
>>         # here the object will grow
>>         obj.ParseFromString(b"...")
>>     del obj
>>
>> But if you recreate an object every time, then everything seems to be
>> fine, so this code doesn't leak:
>>
>>     for _ in range(10_000_000):
>>         obj = schema_pb2.value_test_topic()
>>         obj.ParseFromString(b"...")
>>
>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>
>> Is this known or intended?
>>
>> Thanks a lot!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Protocol Buffers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion visit
>> https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com
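
The arena behavior described in the quoted reply can be illustrated with a toy model. What follows is a minimal sketch in plain Python, not upb's actual implementation: `ToyArena`, `ToyMessage`, `parse_from`, `copy_from`, and `compact` are hypothetical names invented for this illustration, and the only assumption carried over from the thread is that a reused top-level message keeps appending to one pool, while copying into a fresh message keeps only the reachable data.

```python
class ToyArena:
    """Append-only pool: chunks can be added but never freed individually."""

    def __init__(self):
        self._chunks = []

    def alloc(self, data: bytes) -> int:
        # Append the data and return a handle (its index in the pool).
        self._chunks.append(bytes(data))
        return len(self._chunks) - 1

    def get(self, handle: int) -> bytes:
        return self._chunks[handle]

    def size(self) -> int:
        # Total bytes held by the pool, reachable or not.
        return sum(len(chunk) for chunk in self._chunks)


class ToyMessage:
    """Hypothetical stand-in for a top-level message that owns one arena."""

    def __init__(self):
        self.arena = ToyArena()
        self._live = None  # handle of the currently reachable payload

    def parse_from(self, data: bytes) -> None:
        # The previous payload becomes unreachable but stays in the arena.
        self._live = self.arena.alloc(data)

    def payload(self) -> bytes:
        return b"" if self._live is None else self.arena.get(self._live)

    def copy_from(self, other: "ToyMessage") -> None:
        # Only the data reachable from `other` is copied into this arena.
        self._live = self.arena.alloc(other.payload())


def compact(msg: ToyMessage) -> ToyMessage:
    """Deep-copy the reachable state into a fresh message with a new arena."""
    fresh = ToyMessage()
    fresh.copy_from(msg)
    return fresh


# Reusing one message: every parse appends to the same pool.
reused = ToyMessage()
for _ in range(1000):
    reused.parse_from(b"x" * 10)
print(reused.arena.size())     # 10000 bytes held, only 10 reachable

# Copying into a fresh message drops everything that was unreachable.
compacted = compact(reused)
print(compacted.arena.size())  # 10
```

With the real library, the analogous step would be creating a new message of the same type and calling `CopyFrom()` on it with the long-lived one, then dropping the old object so its pool can be released, as suggested in the reply above.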
