RE: intermittent issue with encode (version 2.0.3)
Is there something I need to do in the application to cause that to happen before I've spun off multiple threads? Currently I spin up threads, which then start making calls to fill in and then encode google protobuffers.

From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 6:05 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

As the comment says, the first call will always occur at startup time when there is only one thread anyway, so it's perfectly safe. The parenthetical about GCC4 is just an aside.

On Thu, Jul 9, 2009 at 2:47 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I am a bit nervous about the GCC4 comment in GeneratedMessageFactory::singleton (message.cc):

    // No need for thread-safety here because this will be called at static
    // initialization time. (And GCC4 makes this thread-safe anyway.)

I'm using gcc 3.3.3. The singleton object in GeneratedMessageFactory::singleton is a local static of non-POD type. The C++ standard says:

    An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization.

I don't think the language standard addresses what "first time control passes through its declaration" means when two threads call the function simultaneously. Perhaps gcc4 provides features that make that safe. I don't know if that is something that can be relied on in all compilers, however.

Ray

From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:08 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

I suppose you could also temporarily edit the header file.

On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I'm trying to, without success. Breakpoints in header files, at least with the version of tools I have, don't work very well.

From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I think I have an error in my code (C++) that only occurs when I have multiple threads, and a lot of message volume. Even then, I can run the same test many times, but only get a failure on some runs. With 7 threads running on a 4 core machine, and generating 480384 google protocol buffer messages, I get 33 errors like this to stdout:

    libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289] Encountered string containing invalid UTF-8 data while serializing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

I believe that the data is in error since I get similar errors decoding the messages:

    libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138] Encountered string containing invalid UTF-8 data while parsing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

Is there any way that I can check for this at run time so that I can print out more context? I do call IsInitialized before serializing, but that doesn't check for this case.

I am running on SLES9SP4, using gcc 3.3.3 as the compiler.

Ray

Ray Rizzuto
raymond.rizz...@sig.com
Susquehanna International Group
(610)747-2336 (W)
(215)776-3780 (C)

IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument
Re: intermittent issue with encode (version 2.0.3)
Run in a debugger and set a breakpoint at wire_format_inl.h:289.

You received this message because you are subscribed to the Google Groups Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en
RE: intermittent issue with encode (version 2.0.3)
I'm trying to, without success. Breakpoints in header files, at least with the version of tools I have, don't work very well.
Re: intermittent issue with encode (version 2.0.3)
I suppose you could also temporarily edit the header file.
RE: intermittent issue with encode (version 2.0.3)
I'm going to try that. Since another group builds and packages the libraries I use, it'll take a bit to make a private copy with that change.

As an enhancement request, I wish there was a function I could call to validate the message content before serializing, one that would tell me about any fields of the message that are in error. That is, so I could catch that issue similarly to catching uninitialized fields:

    if (!m.IsInitialized()) {
        // Build one error string listing every missing required field.
        std::string error = name + " is missing fields: ";
        std::vector<std::string> errors;
        m.FindInitializationErrors(&errors);  // takes a pointer to the output vector
        std::vector<std::string>::const_iterator it;
        for (it = errors.begin(); it != errors.end(); ++it) {
            if (it != errors.begin()) error += ", ";
            error += *it;
        }
        throw SPException(error.c_str());
    }

It might not be something I'd do in production, but it sure would help during development.
Re: intermittent issue with encode (version 2.0.3)
This is something you can do in your own code -- just call your validation function before serializing. If this were to be a feature of protocol buffers, then we'd have to store a pointer to your validator function somewhere. Storing it in the message object itself would harm performance and memory usage, but storing it in a static location (such that it applies to all instances of the type) would bring all the myriad problems commonly associated with singletons. So I don't think there's any reasonable way for the protobuf system to provide this.
Re: intermittent issue with encode (version 2.0.3)
Sorry, I think I misread your message. You just want there to be a method like IsInitialized() that you can call to validate UTF-8 stuff. I'll think about that.
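In the meantime, the kind of pre-serialization check discussed in this exchange could be approximated in application code. The following is only a sketch under stated assumptions: IsStructurallyValidUtf8 and FindInvalidUtf8Fields are hypothetical names, not part of the protobuf API, and the caller is assumed to pull the string field values out of the message itself and pass them in as (field name, value) pairs. The output shape loosely mirrors FindInitializationErrors so the results can be reported the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a protobuf function): returns true if `s` is
// well-formed UTF-8. Rejects stray or missing continuation bytes, overlong
// encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool IsStructurallyValidUtf8(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    if (c < 0x80) { ++i; continue; }  // single-byte (ASCII) character

    std::size_t extra;     // continuation bytes expected after the lead byte
    unsigned long min_cp;  // smallest code point legal for this length
    unsigned long cp;      // code point being decoded
    if ((c & 0xE0) == 0xC0)      { extra = 1; min_cp = 0x80;    cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; min_cp = 0x800;   cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { extra = 3; min_cp = 0x10000; cp = c & 0x07; }
    else return false;  // stray continuation byte or invalid lead byte

    if (n - i - 1 < extra) return false;  // sequence truncated at end of string
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char cc = static_cast<unsigned char>(s[i + k]);
      if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (cc & 0x3F);
    }
    if (cp < min_cp) return false;                   // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate half
    i += extra + 1;
  }
  return true;
}

// Hypothetical helper: appends to `errors` the name of every field whose
// value fails the check above.
void FindInvalidUtf8Fields(
    const std::vector<std::pair<std::string, std::string> >& fields,
    std::vector<std::string>* errors) {
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (!IsStructurallyValidUtf8(fields[i].second)) {
      errors->push_back(fields[i].first);
    }
  }
}
```

Calling something like this on each thread just before serializing would at least identify which field held bad bytes at that moment, which may help narrow down where the corruption is introduced.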
RE: intermittent issue with encode (version 2.0.3)
I am a bit nervous about the GCC4 comment in GeneratedMessageFactory::singleton (message.cc): // No need for thread-safety here because this will be called at static // initialization time. (And GCC4 makes this thread-safe anyway.) I'm using gcc 3.3.3. The singleton object in GeneratedMessageFactory::singleton, is a local static of non-POD type. The C++ standard says: An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization. I don't think the language standard addresses what first time control passes through its declaration means when two threads call the function simultaneously. Perhaps gcc4 provides features that make that safe. I don't know if that is something that can be relied on in all compilers, however. Ray From: Kenton Varda [mailto:ken...@google.com] Sent: Thursday, July 09, 2009 5:08 PM To: Rizzuto, Raymond Cc: protobuf@googlegroups.com Subject: Re: intermittent issue with encode (version 2.0.3) I suppose you could also temporarily edit the header file. On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond raymond.rizz...@sig.commailto:raymond.rizz...@sig.com wrote: I'm trying to, without success. Breakpoints in header files, at least with the version of tools I have, don't work very well. From: Kenton Varda [mailto:ken...@google.commailto:ken...@google.com] Sent: Thursday, July 09, 2009 5:02 PM To: Rizzuto, Raymond Cc: protobuf@googlegroups.commailto:protobuf@googlegroups.com Subject: Re: intermittent issue with encode (version 2.0.3) Run in a debugger and set a breakpoint at wire_format_inl.h:289. 
On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond raymond.rizz...@sig.commailto:raymond.rizz...@sig.com wrote: I think I have an error in my code (C++) that only occurs when I have multiple threads, and a lot of message volume. Even then, I can run the same test many times, but only get a failure on some runs. With 7 threads running on a 4 core machine, and generating 480384 google protocol buffer messages, I get 33 errors like this to stdout: libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289] Encountered string containing invalid UTF-8 data while serializing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes. I believe that the data is in error since I get similar errors decoding the messages: libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138] Encountered string containing invalid UTF-8 data while parsing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes. Is there any way that I can check for this at run time so that I can print out more context? I do call IsInitialized before serializing, but that doesn't check for this case. I am running on SLES9SP4, using gcc 3.3.3 as the compiler. Ray Ray Rizzuto raymond.rizz...@sig.commailto:raymond.rizz...@sig.com Susquehanna International Group (610)747-2336 (W) (215)776-3780 (C) IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. 
Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.
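
For the run-time check asked about in the message above, one option is to validate each string before it is assigned to a `string` field and log the surrounding context on failure. The following is only a minimal sketch: `FirstInvalidUtf8Offset` and `Utf8SelfCheck` are hypothetical names, not part of protobuf, and the checks (no overlong forms, no surrogates, nothing above U+10FFFF) are an assumption about what counts as valid; the library's own internal check may accept or reject slightly different input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a protobuf API). Returns the byte offset of the
// first ill-formed UTF-8 sequence in s, or std::string::npos if s is valid.
std::size_t FirstInvalidUtf8Offset(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    if (lead < 0x80) {  // single-byte (ASCII) character
      ++i;
      continue;
    }
    std::size_t len;       // total bytes in this sequence
    unsigned long min_cp;  // smallest code point allowed for this length
    unsigned long cp;      // decoded code point
    if ((lead & 0xE0) == 0xC0) {
      len = 2; min_cp = 0x80; cp = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      len = 3; min_cp = 0x800; cp = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      len = 4; min_cp = 0x10000; cp = lead & 0x07;
    } else {
      return i;  // stray continuation byte or invalid lead byte
    }
    if (i + len > n) return i;  // sequence truncated at end of string
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char cont = static_cast<unsigned char>(s[i + k]);
      if ((cont & 0xC0) != 0x80) return i;  // not a continuation byte
      cp = (cp << 6) | (cont & 0x3F);
    }
    if (cp < min_cp) return i;                   // overlong encoding
    if (cp > 0x10FFFF) return i;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return i;  // UTF-16 surrogate
    i += len;
  }
  return std::string::npos;
}

// Small usage check: valid text passes, a raw 0xFF byte is reported.
void Utf8SelfCheck() {
  assert(FirstInvalidUtf8Offset("plain ascii") == std::string::npos);
  assert(FirstInvalidUtf8Offset(std::string("ab\xFFzz")) == 2);
}
```

Calling such a helper in each worker thread just before a string is stored in a message would let the application print the offending bytes, the thread, and the message being built, which the library's error line does not include.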
Re: intermittent issue with encode (version 2.0.3)
As the comment says, the first call will always occur at startup time when there is only one thread anyway, so it's perfectly safe. The parenthetical about GCC4 is just an aside.

On Thu, Jul 9, 2009 at 2:47 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I am a bit nervous about the GCC4 comment in GeneratedMessageFactory::singleton (message.cc):

// No need for thread-safety here because this will be called at static
// initialization time. (And GCC4 makes this thread-safe anyway.)

I’m using gcc 3.3.3. The singleton object in GeneratedMessageFactory::singleton, is a local static of non-POD type. The C++ standard says:

An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization.

I don’t think the language standard addresses what “first time control passes through its declaration” means when two threads call the function simultaneously. Perhaps gcc4 provides features that make that safe. I don’t know if that is something that can be relied on in all compilers, however.

Ray

--
*From:* Kenton Varda [mailto:ken...@google.com]
*Sent:* Thursday, July 09, 2009 5:08 PM
*To:* Rizzuto, Raymond
*Cc:* protobuf@googlegroups.com
*Subject:* Re: intermittent issue with encode (version 2.0.3)

I suppose you could also temporarily edit the header file.

On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I’m trying to, without success. Breakpoints in header files, at least with the version of tools I have, don’t work very well.

--
*From:* Kenton Varda [mailto:ken...@google.com]
*Sent:* Thursday, July 09, 2009 5:02 PM
*To:* Rizzuto, Raymond
*Cc:* protobuf@googlegroups.com
*Subject:* Re: intermittent issue with encode (version 2.0.3)

Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond raymond.rizz...@sig.com wrote:

I think I have an error in my code (C++) that only occurs when I have multiple threads, and a lot of message volume. Even then, I can run the same test many times, but only get a failure on some runs.

With 7 threads running on a 4 core machine, and generating 480384 google protocol buffer messages, I get 33 errors like this to stdout:

libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289] Encountered string containing invalid UTF-8 data while serializing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

I believe that the data is in error since I get similar errors decoding the messages:

libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138] Encountered string containing invalid UTF-8 data while parsing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

Is there any way that I can check for this at run time so that I can print out more context? I do call IsInitialized before serializing, but that doesn’t check for this case.

I am running on SLES9SP4, using gcc 3.3.3 as the compiler.

Ray

--
Ray Rizzuto
raymond.rizz...@sig.com
Susquehanna International Group
(610)747-2336 (W)
(215)776-3780 (C)
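
The reasoning in the reply above (the first call happens during static initialization, before the application has started any threads) can be illustrated with a minimal sketch. The names `Registry`, `Registrar`, and `CheckInitialized` are hypothetical stand-ins and this is not protobuf's actual code; it only assumes the general pattern of a function-local static that is first touched from the constructor of a namespace-scope object.

```cpp
#include <cassert>

// Hypothetical stand-in for a factory class; not protobuf's implementation.
class Registry {
 public:
  static Registry* singleton() {
    // Function-local static of non-POD type: constructed the first time
    // control passes through this declaration.
    static Registry instance;
    return &instance;
  }
  void add() { ++count_; }
  int count() const { return count_; }

 private:
  Registry() : count_(0) {}
  int count_;
};

// Namespace-scope object. Its constructor runs during static initialization,
// which on typical implementations completes before main() begins, so the
// first call to Registry::singleton() is made by a single thread.
struct Registrar {
  Registrar() { Registry::singleton()->add(); }
};
static Registrar registrar_instance;

// An application could call this at the top of main(), before spawning
// threads, to confirm the singleton has already been constructed.
void CheckInitialized() {
  assert(Registry::singleton()->count() >= 1);
}
```

Under these assumptions, worker threads only ever see an already-constructed object, so the lack of a thread-safe local-static guarantee in older compilers does not matter for this call. The argument would not hold if some other static initializer or a dynamically loaded library started threads before this initialization had run.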