I *think* that if you use a proto2-syntax message, it will not actually perform this check as of today (only proto3-syntax files are checked).
If that's not right, I unfortunately suspect the only way around it would be to vendor the protobuf runtime into your codebase and comment out the check / log if it's bothering you.

On Friday, September 6, 2024 at 11:43:28 AM UTC-4 [email protected] wrote:

> Thank you for the detailed answer Em, I really appreciate it!
>
> Good to know the warning can probably be ignored for now. I've opted to do
> the repeated option for now to avoid my logs being drowned in the
> warnings... I take it there is no way to suppress warnings?
>
> Best,
> Florian
>
> On Thursday, September 5, 2024 at 5:19:00 PM UTC-4 Em Rauch wrote:
>
>> Using non-UTF8 data in a string field should be understood as incorrect,
>> but realistically will work today as long as your messages are only used
>> exactly by C++ Protobuf on the current release of protobuf and only ever
>> with the binary wire format (not textproto or JSON encoding, etc).
>>
>> Today the malformed utf8 enforcement exists to different degrees in the
>> different languages (and even depending on the syntax of the .proto file),
>> but it's not semantically intended that a `string` field should be used for
>> non-utf8 data in any language. It should be assumed that a serialized
>> message with a map<string, ?> where the keys are non-utf8 may start to
>> parse-fail in some future release of Protobuf.
>>
>> Unfortunately bytes as a map key isn't allowed due to obscure technical
>> concerns related to some non-C++ languages and the JSON representation, and
>> we don't have an immediate plan to relax that.
>>
>> Realistically your options are:
>> - Keep doing what you're doing, only ever keep these messages in C++ and
>> binary wire encoding, ignore the warnings, know that it might stop working
>> in a future release of protobuf
>> - Make your key data be valid utf8 strings instead (e.g., use a base64
>> encoding of the digest instead of the raw digest bytes)
>> - Use a repeated field of a message with a key and value field instead of
>> a map, and use your own struct as the in-memory representation when
>> processing (move the data into/out of an STL map at the
>> parse/serialization boundaries instead).
>>
>> Sorry there's not a more trivial fix available for this use case!
>>
>> On Thursday, September 5, 2024 at 5:03:03 PM UTC-4 [email protected]
>> wrote:
>>
>>> Hi,
>>>
>>> I've been using protobuf 3.5.1 in C++ and am using a message type with
>>> the following map type: `map<string, MyObject> txns = 1`
>>>
>>> It is my understanding that `string` and `bytes` are the same in proto
>>> C++; for maps however one can only use `string` as keys. I'm using the key
>>> field to send around transaction digests which are byte strings consisting
>>> of cryptographic hashes. As far as I can tell, it makes no difference
>>> whether I use strings/bytes (the decoding works), yet I keep getting the
>>> error:
>>>
>>> `String field 'pequinstore.proto.MergedSnapshot.MergedTxnsEntry.key'
>>> contains invalid UTF-8 data when serializing a protocol buffer. Use the
>>> 'bytes' type if you intend to send raw bytes.`
>>>
>>> I understand the error is complaining about my digests possibly not
>>> being UTF-8, but I'm unsure if I actually need to be concerned about it; I
>>> have not noticed any problems with parsing. Is there a way to suppress this
>>> error?
>>>
>>> Or, if this is a serious error that could lead to non-deterministic
>>> behavior, do you have a suggested workaround?
>>> There is a lot of existing
>>> code that uses the map structure akin to an STL map, so I'd like to avoid
>>> re-factoring the protobuf into a repeated field if possible.
>>>
>>> Thanks,
>>> Florian
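
For reference, the third option from Em's list (a repeated message with key and value fields, moved into and out of an STL map at the parse/serialization boundaries) might look roughly like the sketch below. This is only an illustration under stated assumptions: `TxnEntry`, `TxnMap`, `ToMap`, `FromMap`, and the `payload` field are hypothetical names, and plain structs stand in for the protobuf-generated classes, which would expose accessors rather than public members. It also assumes the key is declared as `bytes` in the entry message, so the raw digest is never subject to the UTF-8 check.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for protobuf-generated types. The assumed schema is:
//   message TxnEntry       { bytes digest = 1; MyObject value = 2; }
//   message MergedSnapshot { repeated TxnEntry txns = 1; }
struct MyObject {
  int payload = 0;  // placeholder field, not from the original thread
};

struct TxnEntry {
  std::string digest;  // raw digest bytes; a `bytes` field has no UTF-8 check
  MyObject value;
};

// In-memory representation that the existing map-based code keeps using.
using TxnMap = std::map<std::string, MyObject>;

// Parse boundary: move the repeated entries into an STL map.
// If a digest occurs more than once, the last entry wins, which matches how
// protobuf map fields treat duplicate keys when parsing from the wire.
TxnMap ToMap(std::vector<TxnEntry> entries) {
  TxnMap out;
  for (auto& e : entries) {
    out[std::move(e.digest)] = std::move(e.value);
  }
  return out;
}

// Serialization boundary: copy the map back into repeated entries.
// Entries come out sorted by digest because std::map is ordered.
std::vector<TxnEntry> FromMap(const TxnMap& m) {
  std::vector<TxnEntry> out;
  out.reserve(m.size());
  for (const auto& kv : m) {
    out.push_back(TxnEntry{kv.first, kv.second});
  }
  assert(out.size() == m.size());
  return out;
}
```

With something like this in place, code that already treats the field as an STL map only needs to call the two helpers right after parsing and right before serializing.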

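
The second option from Em's list (making the key valid UTF-8 by base64-encoding the digest) can be sketched as follows. This is a minimal hand-rolled encoder using the standard RFC 4648 alphabet with `=` padding; `Base64Encode` is a hypothetical helper name, not a protobuf API, and a real codebase may prefer an existing base64 library. The existing `map<string, MyObject>` field would then be keyed by the encoded digest instead of the raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Encodes arbitrary bytes as standard base64 (RFC 4648, '=' padding).
// The output is pure ASCII, so it is always valid UTF-8 and safe to use
// as a `string` map key.
std::string Base64Encode(const std::string& in) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(4 * ((in.size() + 2) / 3));
  std::size_t i = 0;
  // Each full 3-byte group becomes 4 output characters.
  for (; i + 3 <= in.size(); i += 3) {
    const std::uint32_t n =
        (static_cast<std::uint32_t>(static_cast<unsigned char>(in[i])) << 16) |
        (static_cast<std::uint32_t>(static_cast<unsigned char>(in[i + 1])) << 8) |
        static_cast<std::uint32_t>(static_cast<unsigned char>(in[i + 2]));
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(kAlphabet[(n >> 6) & 63]);
    out.push_back(kAlphabet[n & 63]);
  }
  // A trailing 1- or 2-byte group is padded with '=' to 4 characters.
  const std::size_t rem = in.size() - i;
  if (rem > 0) {
    std::uint32_t n =
        static_cast<std::uint32_t>(static_cast<unsigned char>(in[i])) << 16;
    if (rem == 2) {
      n |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[i + 1])) << 8;
    }
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(rem == 2 ? kAlphabet[(n >> 6) & 63] : '=');
    out.push_back('=');
  }
  assert(out.size() == 4 * ((in.size() + 2) / 3));
  return out;
}
```

The trade-off is that keys grow by roughly a third and every lookup has to encode the digest first, which is presumably why the repeated-message option is listed as an alternative.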