For future readers, here is the link to the issue I opened:
https://github.com/protocolbuffers/protobuf/issues/24257

(I just hate to leave such discussions without a trace)

On Monday, November 3, 2025 at 14:44:31 UTC+1, Viktor Vorobev wrote:

> Sure, will do, and I will link this discussion there, just in case.
>
> On Mon, Nov 3, 2025 at 14:34, Em Rauch <[email protected]> wrote:
>
>> Would you mind opening a GitHub issue about whether we could make cases 
>> like ParseFromString() avoid this behavior?
>>
>> It might be that this was already ruled out for some reason, but it does 
>> seem to me that operations which effectively clear out the entire message 
>> (including ParseFromString(), CopyFrom(), etc.) should be able to reset the 
>> arena to a fresh one automatically. That would reduce the surface area 
>> through which the unbounded-growth case you identified is reachable.
>>
>> On Mon, Nov 3, 2025 at 8:29 AM Viktor Vorobev <[email protected]> wrote:
>>
>>> Great, now I see, thank you very much!
>>> It would be good to have this in the docs/examples as well, since it 
>>> feels counterintuitive that ParseFromString allocates more and more 
>>> memory when its name suggests it shouldn't.
>>>
>>> Thanks again!
>>>
>>> On Mon, Nov 3, 2025 at 14:21, Em Rauch <[email protected]> wrote:
>>>
>>>> The behavior described is working as intended when Python Protobuf is 
>>>> backed by upb.
>>>>
>>>> The reason is that upb's memory management uses an arena model (as 
>>>> described at https://en.wikipedia.org/wiki/Region-based_memory_management). 
>>>> Under this model, you do no bookkeeping on individual allocations. 
>>>> Instead there is one pool that you can only append to, and memory is 
>>>> freed only when that entire pool is released (because nothing tracks 
>>>> which fine-grained allocations are still live). This gives lower 
>>>> allocation and deallocation overhead, as well as lower memory usage 
>>>> from bookkeeping, because everything sits in a single blob of memory 
>>>> that is much cheaper to drop.
>>>>
>>>> In the upb model, each new top-level message holds this pool, so 
>>>> anything added has to stay live until that message is released.
>>>>
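To make the model above concrete, here is a toy sketch in pure Python. It is not upb's implementation; `ToyArena`, `ToyMessage`, and `demo` are hypothetical names used only to illustrate an append-only pool owned by a top-level message.

```python
class ToyArena:
    """Append-only pool: blocks are never freed individually."""

    def __init__(self):
        self._blocks = []     # every block ever allocated from this arena
        self.bytes_held = 0   # total bytes kept alive by the arena

    def alloc(self, size):
        block = bytearray(size)
        self._blocks.append(block)
        self.bytes_held += size
        return block


class ToyMessage:
    """Hypothetical stand-in for a top-level message that owns its arena."""

    def __init__(self):
        self.arena = ToyArena()
        self.payload = None

    def parse(self, data):
        # The new payload comes from the same arena. The old payload becomes
        # unreachable, but its bytes stay in the pool until the arena is dropped.
        buf = self.arena.alloc(len(data))
        buf[:] = data
        self.payload = buf


def demo():
    reused = ToyMessage()
    for _ in range(1000):
        reused.parse(b"x" * 100)   # same message, same arena, 1000 parses

    fresh = ToyMessage()           # new message, new arena
    fresh.parse(b"x" * 100)
    return reused.arena.bytes_held, fresh.arena.bytes_held
```

In this toy model the reused message ends up holding every payload it ever parsed, while the fresh one holds only the latest.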
>>>> This is a known tradeoff: for the expected use cases of Protobuf, where 
>>>> you have a lot of request-scoped messages (and some number of permanent 
>>>> immutable constants), it will be faster and use less memory. The 
>>>> unavoidable downside is that if you have a long-lived mutable object 
>>>> that keeps making allocating modifications, the memory won't be released 
>>>> until you finally release that one object.
>>>>
>>>> In a pinch, if you really need a long-lived, constantly allocating 
>>>> message, you can use CopyFrom() into a 'fresh' parent to effectively 
>>>> reset it, so that the arena holds exactly the data that is actually 
>>>> reachable at that moment, at the cost of one deep copy of whatever 
>>>> that state is.
>>>>
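A minimal sketch of the CopyFrom() workaround described above. The helper name `compact` is hypothetical; it assumes only what Python protobuf messages provide, namely a no-argument constructor and `CopyFrom()`.

```python
def compact(message):
    """Return a fresh copy of `message` holding only its currently live data.

    `compact` is a hypothetical helper, not part of the protobuf API.
    """
    fresh = type(message)()    # new top-level message, hence a new arena
    fresh.CopyFrom(message)    # one deep copy of the reachable state
    return fresh
```

Usage would look like `obj = compact(obj)` every so often in a long-running loop. The old object's memory can only be released once nothing else still references it.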
>>>> On Mon, Nov 3, 2025 at 7:59 AM Viktor Vorobev <[email protected]> wrote:
>>>>
>>>>> Hello! I think I've stumbled across an issue similar to this one:
>>>>> https://github.com/protocolbuffers/protobuf/issues/10088
>>>>>
>>>>> But I'm not entirely sure, so seeking help.
>>>>> I've forked and hacked together a little demo repo: 
>>>>> https://github.com/viktorvorobev/proto_leak
>>>>>
>>>>> But basically the situation is as follows.
>>>>> If you create a proto object and then call `ParseFromString` on it 
>>>>> multiple times, the object's memory footprint grows without bound until 
>>>>> the object is deleted:
>>>>>
>>>>> obj = schema_pb2.value_test_topic()
>>>>> for _ in range(10_000_000):
>>>>>     # memory held by obj grows on every call
>>>>>     obj.ParseFromString(b"...")
>>>>> del obj  # memory is released only here
>>>>>
>>>>> But if you recreate an object every time, then everything seems to be 
>>>>> fine, so this code doesn't leak:
>>>>>
>>>>> for _ in range(10_000_000):
>>>>>     obj = schema_pb2.value_test_topic()
>>>>>     obj.ParseFromString(b"...")
>>>>>
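The non-leaking pattern above can be wrapped in a small helper. This is only a sketch: `parse_fresh` is a hypothetical name, and `message_cls` stands for any generated message class (for example `schema_pb2.value_test_topic` from the demo repo).

```python
def parse_fresh(message_cls, payload):
    """Parse `payload` into a brand-new message instead of reusing one.

    Each call creates a new top-level message, so the previous message and
    everything it allocated can be released once the caller drops it.
    """
    message = message_cls()
    message.ParseFromString(payload)
    return message
```

In a loop, `obj = parse_fresh(schema_pb2.value_test_topic, data)` rebinds `obj` to a new message on every iteration, which is the same shape as the second snippet above.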
>>>>> I suspect the same behaviour for Unpack, but didn't test it myself.
>>>>>
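One way to check a suspicion like this is to watch the process's peak resident set size while repeating the operation. The sketch below uses only the standard-library `resource` module (Unix only); `peak_rss` and `peak_rss_growth` are hypothetical helper names, and `step` is whatever call is being tested.

```python
import resource


def peak_rss():
    """Peak resident set size of this process.

    Units are platform-dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def peak_rss_growth(step, iterations):
    """Call `step()` repeatedly and return how much the peak RSS grew."""
    before = peak_rss()
    for _ in range(iterations):
        step()
    return peak_rss() - before
```

For example, `peak_rss_growth(lambda: obj.ParseFromString(data), 1_000_000)` could be compared between a reused object and a freshly created one. Because the value is a peak, it never decreases, so it shows growth but not release.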
>>>>> Is this known or intended?
>>>>>
>>>>> Thanks a lot! 
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Protocol Buffers" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To view this discussion visit 
>>>>> https://groups.google.com/d/msgid/protobuf/0395429c-f797-40fd-9dee-e123ed5d4f84n%40googlegroups.com.
>>>>>
>>>>

