Okay, well, it's slightly more complicated. My C++ application actually
needs to accept the technically invalid code points U+FFFF and U+FFFE.
Apart from those, I need my server application to know when it has
received invalid UTF-8. That's all fine; I have all of that implemented.
The problem is that I want to exercise that behavior from my Python
systest framework, and the Python libs are trying to be too helpful.
While I normally want them to do UTF-8 validation, I *don't* want them
to during the systests, because I want to send bad UTF-8 to the server.
Make sense? I'm trying to do bad things to make sure stuff still works
in a systest environment.
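
For illustration, one possible way to get bad bytes onto the wire is to
skip the generated Python classes and hand-encode the field. This is only
a sketch under stated assumptions: it uses Python 3, relies only on the
documented wire format (a string field is a length-delimited field, wire
type 2, with a varint tag and a varint length), and the field number 1 and
the helper names are hypothetical, not anything from the protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_length_delimited(field_number: int, payload: bytes) -> bytes:
    """Serialize one wire-type-2 field without validating the payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


# Hypothetical message definition: `string name = 1;`
bad_utf8 = b"abc\xff\xfe"                # 0xFF never occurs in valid UTF-8
noncharacter = "\uffff".encode("utf-8")  # U+FFFF, which my server must accept

bad_message = encode_length_delimited(1, bad_utf8)
ok_message = encode_length_delimited(1, noncharacter)
```

The resulting bytes could then be sent to the server in place of the
output of the generated class's serializer. For a message with several
fields, the hand-encoded field would have to be concatenated with the
normally serialized remainder, which this sketch does not attempt.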
On Mon, May 17, 2010 at 4:51 PM, Jason Hsueh <jas...@google.com> wrote:
> If you compile with the macro GOOGLE_PROTOBUF_UTF8_VALIDATION_ENABLED
> defined, the C++ code will do UTF8 validation. However, it doesn't prevent
> the data from serializing or parsing, it will simply log an error message.
> How would you like it to fail?
> On Mon, May 17, 2010 at 3:15 PM, JT Olds <jto...@xnet5.com> wrote:
>> (I submitted this already via the protobuf google group web form, but
>> I think I screwed up. If not, sorry for the double post)
>> I have a C++-based server using protocol buffers as the IDL, and I'm
>> trying to ensure that it rejects invalid UTF-8 strings. My systest
>> library is written in Python. The C++ protocol buffer library does not
>> seem to do any UTF-8 string checking on string types, whereas the
>> Python library does. So I added some UTF-8 validation testing to the
>> C++ server-side and I want to check that it works (in case a C++
>> client sends invalid UTF-8). Whenever I inject invalid UTF-8 into the
>> Python systests to make sure the server rejects the string, the Python
>> library complains.
>> Is there a way to override this behavior?
>> I don't want to change my protocol buffer definitions to be the bytes
>> type, because these really should be strings, and the Python library
>> is doing exactly what I want for the general case.
>> You received this message because you are subscribed to the Google Groups
>> "Protocol Buffers" group.
>> To post to this group, send email to proto...@googlegroups.com.
>> To unsubscribe from this group, send email to
>> For more options, visit this group at