Sure, will do, and I'll link this discussion there, just in case.

On Mon, Nov 3, 2025 at 14:34, Em Rauch <[email protected]> wrote:
> Do you mind opening a GitHub issue about whether we could make cases like
> ParseFromString() not hit this case?
>
> It might be that this was already ruled out for some reason, but it does
> seem to me that operations which effectively clear out the entire message
> (including ParseFromString(), CopyFrom(), etc.) should be able to reset the
> arena to a fresh one automatically, and that would reduce the surface area
> through which the unbounded-growth case you identified is reachable.
>
> On Mon, Nov 3, 2025 at 8:29 AM Viktor Vorobev <[email protected]> wrote:
>
>> Great, now I see, thank you very much!
>> It would be good to have this in the docs/examples as well, as it feels
>> counterintuitive that ParseFromString actually allocates more and more
>> memory, while its name suggests that it shouldn't.
>>
>> Thanks again!
>>
>> On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:
>>
>>> The behavior listed is working as intended when Python Protobuf is
>>> backed by upb.
>>>
>>> The reason is that upb's memory management uses an arena model (as
>>> described at
>>> https://en.wikipedia.org/wiki/Region-based_memory_management). Under
>>> this model, you do no bookkeeping on individual allocations; instead
>>> there's one pool that you can only append to, and the only time memory
>>> is freed is when the entire pool is released (because there's no
>>> bookkeeping about which fine-grained memory is live). This gives both
>>> lower allocation and deallocation overhead and lower memory usage from
>>> bookkeeping, because everything lives in a single blob of memory that
>>> is much cheaper to drop.
>>>
>>> In the upb model, each new top-level message holds this pool, so
>>> anything added has to stay live until that message is released.
>>>
>>> This is a known tradeoff: for the expected use cases of Protobuf, where
>>> you have a lot of request-scoped messages (and some number of permanent
>>> immutable constants), it will be faster and use less memory. The
>>> unavoidable downside is that if you do have a long-lived mutable object
>>> that keeps making allocating modifications, the memory won't be released
>>> until you finally release that one object.
>>>
>>> In a pinch, if you really need some long-lived, constantly allocating
>>> message, you can use CopyFrom() into a 'fresh' parent to basically reset
>>> it, so that what you have in the arena is exactly the data that is
>>> actually reachable at that moment, at the cost of one deep copy of
>>> whatever that state is.
>>>
>>> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>>>
>>>> Hello! I think I've stumbled across an issue similar to this one:
>>>> https://github.com/protocolbuffers/protobuf/issues/10088
>>>>
>>>> But I'm not entirely sure, so I'm seeking help.
>>>> I've forked and hacked together a little demo repo:
>>>> https://github.com/viktorvorobev/proto_leak
>>>>
>>>> Basically, the situation is as follows.
>>>> If you create a proto object and then call `ParseFromString` on it
>>>> multiple times, the object's size grows without bound until the object
>>>> is deleted:
>>>>
>>>>     obj = schema_pb2.value_test_topic()
>>>>     for _ in range(10_000_000):
>>>>         # here the object will grow
>>>>         obj.ParseFromString(b"...")
>>>>     del obj
>>>>
>>>> But if you recreate the object every time, everything seems to be
>>>> fine, so this code doesn't leak:
>>>>
>>>>     for _ in range(10_000_000):
>>>>         obj = schema_pb2.value_test_topic()
>>>>         obj.ParseFromString(b"...")
>>>>
>>>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>>>
>>>> Is this known or intended?
>>>>
>>>> Thanks a lot!
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Protocol Buffers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion visit
>>>> https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com.
>>>>
>>>

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/protobuf/CAJEh-O4_7hgwd-RcJZbzDvv19qU%3Dr6CYKRbs8zATHYD4_h8t9w%40mail.gmail.com.
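
For readers skimming the thread, the arena behaviour described above can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not upb's actual implementation: `ToyArena`, `ToyMessage`, and `demo` are hypothetical names invented here, and real arenas hand out raw memory rather than Python `bytes` objects.

```python
class ToyArena:
    """Append-only pool: nothing is reclaimed until the whole arena is dropped."""

    def __init__(self):
        self._blocks = []  # every allocation ever made; this list never shrinks

    def alloc(self, payload: bytes) -> int:
        """Store a copy of payload and return its handle (index in the pool)."""
        self._blocks.append(bytes(payload))
        return len(self._blocks) - 1

    def get(self, handle: int) -> bytes:
        return self._blocks[handle]

    def bytes_held(self) -> int:
        return sum(len(block) for block in self._blocks)


class ToyMessage:
    """Hypothetical stand-in for a top-level message that owns its own arena."""

    def __init__(self):
        self._arena = ToyArena()
        self._value = None  # handle of the payload that is currently reachable

    def parse_from(self, data: bytes) -> None:
        # The previous payload becomes unreachable but stays in the arena.
        self._value = self._arena.alloc(data)

    def copy_from(self, other: "ToyMessage") -> None:
        # Deep-copy only what is reachable in `other` into our own arena.
        live = other.value()
        self._value = None if live is None else self._arena.alloc(live)

    def value(self):
        return None if self._value is None else self._arena.get(self._value)

    def bytes_held(self) -> int:
        return self._arena.bytes_held()


def demo(rounds: int = 1000, payload: bytes = b"x" * 16):
    """Return (bytes held by a reused message, bytes held by a compacted copy)."""
    reused = ToyMessage()
    for _ in range(rounds):
        reused.parse_from(payload)  # the arena grows on every round
    compacted = ToyMessage()
    compacted.copy_from(reused)     # the fresh arena holds only the live data
    return reused.bytes_held(), compacted.bytes_held()
```

In this model the reused message holds every payload it has ever parsed, while the copy in a fresh arena holds only the last one, which mirrors the growth reported at the start of the thread and the CopyFrom() reset suggested in the reply.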

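
The two workarounds mentioned in the thread (recreating the object for every parse, and using CopyFrom() into a fresh parent) could be wrapped as small helpers. This is a minimal sketch, assuming only that the message class has a no-argument constructor plus the standard `ParseFromString` and `CopyFrom` methods of Python protobuf messages; the helper names `parse_fresh` and `compact` are made up for illustration.

```python
def parse_fresh(message_cls, data: bytes):
    """Parse `data` into a brand-new message instead of reusing an old one."""
    msg = message_cls()
    msg.ParseFromString(data)
    return msg


def compact(msg):
    """Deep-copy `msg` into a fresh instance of the same class and return the copy."""
    fresh = type(msg)()
    fresh.CopyFrom(msg)
    return fresh


# Possible usage with the class from the original report (not run here):
#
#     obj = parse_fresh(schema_pb2.value_test_topic, b"...")
#     ...
#     obj = compact(obj)  # rebind, so the old message can be released
```

Note that `compact` only helps if the caller rebinds its variable to the returned copy and keeps no other references to the old message; per the explanation in the thread, the old pool is released only when the old message itself is.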