The context is that Message-typed fields (and string-typed fields) are
encoded with an int32 length prefix that means "the next N bytes are this
message". From a binary wire format point of view it's not technically
possible to encode a submessage larger than that prefix can express, and
that has ecosystem-wide implications.
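
To make the length prefix concrete, here is a minimal sketch of how a
length-delimited field is laid out. It is not the actual protobuf code:
`AppendVarint` and `AppendLengthDelimited` are names I made up for
illustration, and the int32 check reflects my understanding that the length
is written as a varint but treated by implementations as a non-negative
int32.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Appends `value` as a base-128 varint: 7 bits per byte, low bits first,
// with the high bit set on every byte except the last.
void AppendVarint(uint64_t value, std::string* out) {
  assert(out != nullptr);
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Hypothetical helper: writes one length-delimited field (wire type 2) as
// tag, length, payload. Returns false when the payload length cannot be
// expressed as a non-negative int32.
bool AppendLengthDelimited(uint32_t field_number, const std::string& payload,
                           std::string* out) {
  const uint64_t kMax =
      static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
  if (payload.size() > kMax) return false;
  AppendVarint((static_cast<uint64_t>(field_number) << 3) | 2, out);  // tag
  AppendVarint(payload.size(), out);                                  // length
  out->append(payload);                                               // bytes
  return true;
}
```

For field number 1 and the payload "abc" this produces the five bytes
0x0A 0x03 'a' 'b' 'c'.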

This leaves some wiggle room though: notably, top-level messages are not
encoded with a length prefix, which means they don't have any such
technical constraint. More notably, if you just construct a message in
memory and then call some setters, it will happily build up an arbitrarily
large message.

> May be a bit of context here would help, I am coming from the point of
> view https://groups.google.com/g/protobuf/c/vvP4uajRE60
> If the potential fix for it was to set limit to 2g in message_lite.c,

Without other context or doing more archeology, I actually suspect the
'attack' was more that, e.g., a sufficiently smart attacker could send a
string of length "2GB minus one byte", knowing that the service boxes up
that input in a protobuf message (adding a few bytes of overhead) and then
encodes that to the next backend server.
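
As a rough illustration of that arithmetic (a sketch only; `VarintSize`,
`WrappedSize` and `FitsInInt32` are hypothetical helpers, and field number 1
is an arbitrary choice): wrapping a payload of 2^31 - 1 bytes in a single
string field adds a one-byte tag and a five-byte length, which pushes the
outer message past what an int32 can hold.

```cpp
#include <cstdint>
#include <limits>

// Number of bytes a value occupies when written as a base-128 varint.
int VarintSize(uint64_t value) {
  int n = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++n;
  }
  return n;
}

// Encoded size of one length-delimited field: tag + length prefix + payload.
uint64_t WrappedSize(uint32_t field_number, uint64_t payload_size) {
  const uint64_t tag = (static_cast<uint64_t>(field_number) << 3) | 2;
  return VarintSize(tag) + VarintSize(payload_size) + payload_size;
}

// True when `size` can be represented as a non-negative int32.
bool FitsInInt32(uint64_t size) {
  return size <= static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
}
```

With these, `WrappedSize(1, 2147483647)` is 2147483653, so the input fits in
an int32 on its own but the message that boxes it up does not.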

And the fix for the C++ issue back then was not simply to try to enforce a
conceptual limit of 2GB; it instead required changing the C++ API to use a
64-bit type for the encoded size of messages instead of an int32 (the size
getter is called `ByteSizeLong()` and returns a `size_t`). That made it much
easier to write correct behavior against 2GB limits. When you have an `int
EncodedLength()` function and you, e.g., set 10 strings of 512MB each as
separate fields on the same parent message, there's no way to gracefully
report what the `int` serialized size should be. With a 64-bit return type
the getter can return the actual size without a 2GB limit, and then if you
try to serialize a message where that size is too large it will fail to
serialize (serialize has a bool return value).
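
The difference can be sketched without allocating any real data. This is not
protobuf code: `SizeLong`, `SizeNarrow` and `WouldSerialize` are made-up
names, per-field tag and length overhead is ignored, and the narrow version
uses an unsigned 32-bit accumulator only so that the wraparound is
well-defined C++ (overflowing a signed `int` would be undefined behavior).

```cpp
#include <climits>
#include <cstdint>
#include <vector>

// 64-bit total, in the spirit of a ByteSizeLong()-style getter.
uint64_t SizeLong(const std::vector<uint64_t>& field_sizes) {
  uint64_t total = 0;
  for (uint64_t s : field_sizes) total += s;
  return total;
}

// 32-bit total, modelling a getter that only has 32 bits to answer with.
// The sum silently wraps modulo 2^32.
uint32_t SizeNarrow(const std::vector<uint64_t>& field_sizes) {
  uint32_t total = 0;
  for (uint64_t s : field_sizes) total += static_cast<uint32_t>(s);
  return total;
}

// Models a bool-returning serialize: refuse anything larger than INT_MAX.
bool WouldSerialize(uint64_t size) {
  return size <= static_cast<uint64_t>(INT_MAX);
}
```

For ten fields of 512MB each (512 * 2^20 bytes), `SizeLong` returns
5368709120 and `WouldSerialize` rejects it, while `SizeNarrow` wraps to
1073741824, a plausible-looking 1GB that would pass the check.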

On Fri, Jul 4, 2025 at 10:40 AM 'Somak Dutta' via Protocol Buffers <
protobuf@googlegroups.com> wrote:

> Exactly, could not agree more. There are current limit set to
> Integer.MAX_VALUE in CodedInputStream
>
> May be a bit of context here would help, I am coming from the point of
> view https://groups.google.com/g/protobuf/c/vvP4uajRE60
>
> If the potential fix for it was to set limit to 2g in message_lite.c, in
> memory safe language like Java it is anyways default to 2g. I wonder if the
> vulnerability data in the world that marks java as impacted by the
> vulnerability is really over estimating.
>
> ```
>
> Somak Dutta
> Jul 3, 2025, 1:51:24 PM (yesterday)
> 
> 
> 
> to Protocol Buffers
> Hi,
>
> I am writing to ask about vulnerability reported GHSA-jwvw-v7c5-m82h
> <https://github.com/advisories/GHSA-jwvw-v7c5-m82h> for protobuf-java
> <https://mvnrepository.com/artifact/com.google.protobuf/protobuf-java> which
> specifically talks about "*protobuf allows remote authenticated attackers
> to cause a heap-based buffer overflow.*"
>
> Specifically to ask about earlier versions < 3.4.0.
> Take for example a version 2.5.0, based on all the code i see for
> CodedInputStream
> <https://github.com/protocolbuffers/protobuf/blob/v2.5.0/java/src/main/java/com/google/protobuf/CodedInputStream.java>
> - methods such as readRawBytes/refillBuffer, which are performing either
> copy to/from or resizing , are all pretty safe from integer overflows.
> - there is also present a slow path, where we read buffer in chunks to
> potentially prevent out of memory issues.
>
> First Question:
> However i am not seeing any evidence where the package can be vulnerable
> to a buffer overflows issues
> Additionally given java is memory safe language i am failing to see how
> java ecosystem is susceptible to the afore mentioned vulnerability.
>
> Second Question:
> There is a question related / or along the same veins here
> https://github.com/protocolbuffers/protobuf/issues/760?reload=1#issuecomment-847162817
>  .
> The potential fix also suggests issue might be present only in c/c++
> ecosystems.
>
> ```
>
> Regards,
> Somak
>
> On Friday, July 4, 2025 at 3:27:21 PM UTC+5:30 Cassondra Foesch wrote:
>
>> I’m pretty sure that since 2 GiB is the maximum value an int32 could
>> carry, that is where the requirement is coming from. It’s entirely possible
>> that it is not actually enforced across the whole ecosystem, but is
>> essentially enforced by “if you exceed this boundary, some code will not
>> work with your protobuf.”
>>
>> Like, for instance, it is impossible for a 32-bit Golang implementation
>> to deal with more than 2 GiB of data in a single slice. (Since the length of
>> the slice is stored as a 32-bit signed integer.)
>>
>> On Thu, Jul 3, 2025 at 08:21, 'Somak Dutta' via Protocol
>> Buffers <prot...@googlegroups.com> wrote:
>>
>>> Hello,
>>>
>>> From https://protobuf.dev/programming-guides/proto-limits/ i understand
>>> across all ecosystems
>>>
>>> Any proto in serialized form must be <2GiB, as that is the maximum size
>>> supported by all implementations. It’s recommended to bound request and
>>> response sizes.
>>>
>>> However wanted to check where exactly is the limitation set up,
>>> specifically in protobuf-java library.
>>>
>>> I can see safe checks in only message_lite.cc files , but i dont think
>>> this would be reflected across ecosystems?
>>>
>>> if (size > INT_MAX) {
>>>   GOOGLE_LOG(ERROR) << "Exceeded maximum protobuf size of 2GB: " << size;
>>>   return false;
>>> }
>>>
>>> Regards
>>>
>>> *Confidentiality Notice: This email and any attachments are confidential
>>> and intended solely for the use of the individual or entity to whom they
>>> are addressed. If you have received this email in error, please notify the
>>> sender immediately and delete it from your system. Unauthorized use,
>>> disclosure, or copying of this email or its contents is strictly
>>> prohibited.*
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to protobuf+u...@googlegroups.com.
>>> To view this discussion visit
>>> https://groups.google.com/d/msgid/protobuf/e0d724d8-2a45-4ef1-aaac-c3e6d1077306n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/protobuf/e0d724d8-2a45-4ef1-aaac-c3e6d1077306n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
