This is the pb_string before calling encode('utf-8')
\nJhow-to-use-a-chafing-dish\x12 How To Use A Chafing Dish (1912)\x1a
\x18Sarah Tyson Heston Rorer"\n143687825X*
\r97814368782582\tpaperback8\xd8\...@\x01r\x14kessinger PublishingX|h
\xd0\x08p\x89\x07x\xc7\x01\x82\x015img-436878258.jpg
\x92\x01\x07English

This is the pb_string after calling encode('utf-8')
\nJhow-to-use-a-chafing-dish\x12 How To Use A Chafing Dish (1912)\x1a
\x18Sarah Tyson Heston Rorer\x1a\x18Sarah Tyson Heston
Rorer"\n143687825x*\r97814368782582\tpaperback8\xd8\...@\x01r
\x14Kessinger PublishingX|b\x00h\xd0\x08p\x89\x07x
\xc7\x01\x82\x015img-436878258.jpg\x92\x01\x07English
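
One way to find which field trips the C++ UTF-8 check is to walk the serialized bytes by hand. The following is a minimal sketch, not part of the protobuf library: `read_varint`, `scan_fields`, and `is_valid_utf8` are hypothetical helper names. The `sample` bytes reuse the title and author fields from the dump above and add a made-up field 15 holding a Latin-1 encoded byte. It assumes the standard wire format (a varint key carrying field number and wire type, followed by the payload) and does not handle the deprecated group wire types.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last varint byte
            return result, pos
        shift += 7


def scan_fields(buf):
    """Yield (field_number, wire_type, payload) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
            yield field_number, wire_type, value
        elif wire_type == 1:  # fixed 64-bit
            yield field_number, wire_type, buf[pos:pos + 8]
            pos += 8
        elif wire_type == 2:  # length-delimited: string, bytes, nested message
            length, pos = read_varint(buf, pos)
            yield field_number, wire_type, buf[pos:pos + length]
            pos += length
        elif wire_type == 5:  # fixed 32-bit
            yield field_number, wire_type, buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)


def is_valid_utf8(payload):
    """Return True if the payload bytes decode cleanly as UTF-8."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Fields 2 and 3 are copied from the dump; field 15 is a made-up example
# whose payload is "cafe" with a Latin-1 e-acute (0xE9), invalid as UTF-8.
sample = (
    b"\x12\x20How To Use A Chafing Dish (1912)"
    b"\x1a\x18Sarah Tyson Heston Rorer"
    b"\x7a\x04caf\xe9"
)

for number, wire_type, payload in scan_fields(sample):
    if wire_type == 2:
        print(number, is_valid_utf8(payload), payload)
```

Wire type 2 also covers `bytes` fields and nested messages, so a failed UTF-8 check only matters for fields declared as `string` in the .proto file.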

On Mar 21, 12:39 am, saad.a...@gmail.com wrote:
> With the string type, it gives the following error:
> libprotobuf ERROR ./google/protobuf/wire_format_inl.h:138] Encountered
> string containing invalid UTF-8 data while parsing protocol buffer.
> Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
> When I change it to bytes, it shows:
> cout << t.title() << endl;
> x 5image_url English edit home movies, and share MP3s in one
>
> instead of showing:
> cout << t.title() << endl;
> How To Use A Chafing Dish (1912)
>
> On Mar 21, 12:18 am, Kenton Varda <ken...@google.com> wrote:
>
> > Hmm, that sounds very odd.  I think protocol buffers should be taking care of
> > this automatically.  Can you give us an example of the "gibberish" and what
> > you expected it to look like?
>
> > On Fri, Mar 20, 2009 at 12:09 PM, <saad.a...@gmail.com> wrote:
>
> > > Oh, sorry for the confusion. When I change it to bytes, it's not giving
> > > the error, but the value is gibberish which contains some special
> > > characters and the values of 2-3 fields run together.
>
> > > I don't know what the problem is but I found the solution, here is
> > > what I am doing:
> > > I searched online and read some Python docs, and then I wrote another
> > > Python script that processes each protobuf record like this:
>
> > > t = Title()
> > > t.ParseFromString(title_str_pb)
>
> > > t.title = t.title.encode('utf-8')
> > > t.description = t.description.encode('utf-8')
> > > t.isbn = t.isbn.encode('utf-8')
> > > ...
> > > ...
>
> > > and then writing it back to my database
> > > title_str_pb = t.SerializeToString()
>
> > > and now when I open it in C++, it's not giving any error.
>
> > > So, I think when I was adding the original data, I should have
> > > called .encode('utf-8') on all the Python strings.
>
> > > Is there anything I am missing, or an easier way to do it?
>
> > > On Mar 20, 11:38 pm, Kenton Varda <ken...@google.com> wrote:
> > > > > If you changed all the "string" types to "bytes" instead, then you should
> > > > > not see that error.  Are you sure you did that?  If so, can you write a
> > > > > small demo program which produces this error, even when the protobuf type
> > > > > contains no "string" fields, and send it to me?
>
> > > > On Fri, Mar 20, 2009 at 11:16 AM, <saad.a...@gmail.com> wrote:
>
> > > > > > I am not a very experienced programmer, but I will try to explain what's
> > > > > > happening:
>
> > > > > > I have a book titles database in protocol buffer format. The message
> > > > > > Title has fields like:
> > > > > > optional string title = 1;
> > > > > > optional string description = 2;
> > > > > > optional string isbn = 3;
> > > > > ...
> > > > > ...
>
> > > > > > When I convert my MySQL data to pb, I use Python and store it using
> > > > > > title_str_pb = title.SerializeToString()
>
> > > > > When I read back the titles in python, everything works fine. Like:
> > > > > t = Title()
> > > > > t.ParseFromString(title_str_pb)
> > > > > title = t.title
> > > > > description = t.description
>
> > > > > > But now I want to use this protocol buffer data in C++, like:
> > > > > > Title t;
> > > > > > t.ParseFromString(title_str_pb);
>
> > > > > and I get error:
> > > > > Encountered string containing invalid UTF-8 data while parsing
> > > > > protocol buffer. Strings must contain only UTF-8; use the 'bytes' type
> > > > > for raw bytes.
>
> > > > > > I changed the string type to the bytes type, but I still get the same
> > > > > > error.
>
> > > > > > I have a million book records stored in pb format. I don't want to
> > > > > > lose my data. Can somebody help, please? As an alternative I will
> > > > > > restore my data back using Python. But I want to use it in C++.
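
For the conversion step described in the original question (MySQL rows serialized from Python), the fix the thread arrives at is to make sure every text value is UTF-8 before it goes into a `string` field. Below is a minimal sketch of that normalization, written without the protobuf library so it runs on its own. `to_utf8_bytes` is a hypothetical helper name, and `source_encoding="latin-1"` is only an assumption about how the database stored the text; the thread does not say which encoding was actually used.

```python
def to_utf8_bytes(raw, source_encoding="latin-1"):
    """Return UTF-8 encoded bytes for a value read from the database.

    source_encoding is an assumption about the stored data, not something
    stated in the thread; adjust it to match the real database charset.
    """
    if isinstance(raw, bytes):
        try:
            raw.decode("utf-8")
            return raw  # already valid UTF-8, leave untouched
        except UnicodeDecodeError:
            # Not UTF-8: reinterpret using the assumed source encoding.
            raw = raw.decode(source_encoding)
    # At this point raw is a text string; encode it as UTF-8.
    return raw.encode("utf-8")


# Hypothetical values standing in for database columns.
print(to_utf8_bytes(b"caf\xe9"))        # Latin-1 bytes get re-encoded
print(to_utf8_bytes(b"caf\xc3\xa9"))    # valid UTF-8 passes through
print(to_utf8_bytes("caf\u00e9"))       # text is encoded directly
```

One caveat with this approach: some byte sequences are valid in both encodings, so the "already valid UTF-8" shortcut can guess wrong. Decoding with the known database charset, when it is known, is more reliable.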
