Sure, will do. I'll link this discussion there as well, just in case.

On Mon, Nov 3, 2025 at 14:34, Em Rauch <[email protected]> wrote:

> Would you mind opening a GitHub issue about whether we could make cases
> like ParseFromString() avoid this behavior?
>
> It might be that this was already ruled out for some reason, but it does
> seem to me that operations which effectively clear out the entire message
> (including ParseFromString(), CopyFrom(), etc.) should be able to reset the
> arena to a fresh one automatically. That would reduce the number of ways
> in which the unbounded-growth case you identified can be reached.
>
> On Mon, Nov 3, 2025 at 8:29 AM Viktor Vorobev <[email protected]> wrote:
>
>> Great, now I see, thank you very much!
>> It would be good to have this in the docs/examples as well, as it feels
>> counterintuitive that ParseFromString allocates more and more memory,
>> when the name suggests it shouldn't.
>>
>> Thanks again!
>>
>> On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:
>>
>>> The behavior described is working as intended when Python Protobuf is
>>> backed by upb.
>>>
>>> The reason is that upb's memory management uses an arena model (as
>>> described at https://en.wikipedia.org/wiki/Region-based_memory_management).
>>> Under this model, you do no bookkeeping on individual allocations. Instead,
>>> there's one pool that you can only append to, and memory is freed only
>>> when that entire pool is released (because there's no bookkeeping about
>>> which fine-grained memory is live). This gives less allocation and
>>> deallocation overhead, as well as less memory spent on bookkeeping,
>>> because everything lives in a single blob of memory which is much
>>> cheaper to drop.
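
The append-only behavior described above can be illustrated with a toy model. This is only a sketch in plain Python, not how upb is actually implemented; `ToyArena` and `ToyMessage` are hypothetical names made up for the illustration.

```python
class ToyArena:
    """Append-only pool: no per-allocation bookkeeping, freed only as a whole."""

    def __init__(self):
        self._blocks = []

    def alloc(self, size):
        block = bytearray(size)
        self._blocks.append(block)  # the arena keeps every block alive
        return block

    def bytes_held(self):
        return sum(len(b) for b in self._blocks)

    def release(self):
        self._blocks.clear()  # the only way memory is given back


class ToyMessage:
    """Stand-in for a top-level message that owns one arena."""

    def __init__(self):
        self.arena = ToyArena()
        self.payload = None

    def parse(self, data):
        # The new payload is allocated from the arena. The previous payload
        # becomes unreachable from the message but is still held by the arena.
        buf = self.arena.alloc(len(data))
        buf[:] = data
        self.payload = buf
```

Calling `parse()` repeatedly on one `ToyMessage` makes `arena.bytes_held()` grow with every call, even though only the last payload is reachable. Creating a new `ToyMessage` per payload lets each small arena be dropped together with its message.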
>>>
>>> In the upb model, each new top-level message holds its own pool, so
>>> anything added to it stays live until that message is released.
>>>
>>> This is a known tradeoff: for the expected use cases of Protobuf, where
>>> you have a lot of request-scoped messages (and some number of permanent
>>> immutable constants), it will be faster and use less memory. The
>>> unavoidable downside is that if you have a long-lived mutable object
>>> that keeps making allocating modifications, the memory won't be released
>>> until you finally release that one object.
>>>
>>> In a pinch, if you really need a long-lived, constantly allocating
>>> message, you can use CopyFrom() into a 'fresh' parent to effectively reset
>>> it, so that the arena holds only the data that is actually reachable at
>>> that moment, at the cost of one deep copy of that state.
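
A minimal sketch of the CopyFrom() workaround described above. It relies only on the standard Message methods `ParseFromString()` and `CopyFrom()` plus a no-argument constructor; `compact`, `parse_many`, and the `compact_every` parameter are hypothetical names chosen for this example, and the right interval would depend on your workload.

```python
def compact(msg):
    """Return a fresh message of the same type holding only msg's current data.

    `compact` is a hypothetical helper, not part of the protobuf API.
    """
    fresh = type(msg)()  # new top-level message, assumed to get a new arena
    fresh.CopyFrom(msg)  # one deep copy of the currently reachable state
    return fresh


def parse_many(msg, payloads, compact_every=1000):
    """Parse each payload into a long-lived message, compacting periodically."""
    for i, data in enumerate(payloads, start=1):
        msg.ParseFromString(data)
        if i % compact_every == 0:
            # Drop the reference to the old object so that it, and whatever
            # memory it was holding, can be released.
            msg = compact(msg)
    return msg
```

Note that `parse_many` returns the message the caller should keep using. Any other reference to the original object would keep the old memory alive.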
>>>
>>> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>>>
>>>> Hello! I think I've stumbled across an issue similar to this one:
>>>> https://github.com/protocolbuffers/protobuf/issues/10088
>>>>
>>>> But I'm not entirely sure, so I'm seeking help.
>>>> I've forked and hacked together a little demo repo:
>>>> https://github.com/viktorvorobev/proto_leak
>>>>
>>>> But basically the situation is as follows.
>>>> If you create a proto object, then use `ParseFromString` on it multiple
>>>> times, then the object size grows uncontrollably until this object is
>>>> deleted:
>>>>
>>>> obj = schema_pb2.value_test_topic()
>>>> for _ in range(10_000_000):
>>>>     # here the object will grow
>>>>     obj.ParseFromString(b"...")
>>>> del obj
>>>>
>>>> But if you recreate an object every time, then everything seems to be
>>>> fine, so this code doesn't leak:
>>>>
>>>> for _ in range(10_000_000):
>>>>     obj = schema_pb2.value_test_topic()
>>>>     obj.ParseFromString(b"...")
>>>>
>>>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>>>
>>>> Is this known or intended?
>>>>
>>>> Thanks a lot!
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Protocol Buffers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion visit
>>>> https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>
