Hmm, that sounds very odd.  I think protocol buffers should be taking care of
this automatically.  Can you give us an example of the "gibberish" and what
you expected it to look like?
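
In case it helps narrow things down, here is a rough sketch (plain Python 3,
no protobuf dependency) of checking whether a raw field value is valid UTF-8
and re-encoding it if it is not.  The helper names and the Latin-1 source
encoding are assumptions for illustration only; nothing in this thread says
what encoding the MySQL data actually uses.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Return `raw` as UTF-8 bytes.

    Bytes that are already valid UTF-8 are returned unchanged.  Anything
    else is decoded with `source_encoding` (an assumed default; substitute
    the real encoding of the data) and re-encoded as UTF-8.
    """
    if is_valid_utf8(raw):
        return raw
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    latin1_value = "café".encode("latin-1")   # 0xE9 alone is not valid UTF-8
    print(is_valid_utf8(latin1_value))
    print(is_valid_utf8(to_utf8(latin1_value)))
```

Running a check like this over the values before they are assigned to
"string" fields should show which records contain bytes that the C++ parser
would reject.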

On Fri, Mar 20, 2009 at 12:09 PM, <saad.a...@gmail.com> wrote:

>
> Oh, sorry for the confusion. When I change it to bytes, it doesn't give
> the error, but the value is gibberish: it contains some special
> characters and the values of 2-3 fields run together.
>
> I don't know what the problem is, but I found a solution. Here is
> what I am doing:
> I searched online and read some Python docs, and then I wrote another
> Python script that processes each protobuf record like this:
>
> t = Title()
> t.ParseFromString(title_str_pb)
>
> t.title = t.title.encode('utf-8')
> t.description = t.description.encode('utf-8')
> t.isbn = t.isbn.encode('utf-8')
> ...
> ...
>
> and then I write it back to my database:
> title_str_pb = t.SerializeToString()
>
> Now when I open it in C++, it doesn't give any error.
>
> So, I think when I was adding the original data, I should have
> called .encode('utf-8') on all the Python strings.
>
> Is there anything I am missing, or an easier way to do it?
>
>
> On Mar 20, 11:38 pm, Kenton Varda <ken...@google.com> wrote:
> > If you changed all the "string" types to "bytes" instead, then you should
> > not see that error.  Are you sure you did that?  If so, can you write a
> > small demo program which produces this error, even when the protobuf type
> > contains no "string" fields, and send it to me?
> >
> > On Fri, Mar 20, 2009 at 11:16 AM, <saad.a...@gmail.com> wrote:
> >
> > > I am not a very experienced programmer, but I will try to explain what's
> > > happening:
> >
> > > I have a database of book titles in protocol buffer format. The message
> > > Title has fields like:
> > > optional string title = 1;
> > > optional string description = 2;
> > > optional string isbn = 3;
> > > ...
> > > ...
> >
> > > When I convert my MySQL data to pb, I use Python and store it using
> > > title_str_pb = title.SerializeToString()
> >
> > > When I read back the titles in python, everything works fine. Like:
> > > t = Title()
> > > t.ParseFromString(title_str_pb)
> > > title = t.title
> > > description = t.description
> >
> > > But now I want to use this protocol buffer data in C++, like:
> > > Title t;
> > > t.ParseFromString(title_str_pb);
> >
> > > and I get this error:
> > > Encountered string containing invalid UTF-8 data while parsing
> > > protocol buffer. Strings must contain only UTF-8; use the 'bytes' type
> > > for raw bytes.
> >
> > > I changed the string type to the bytes type, but I still get the same
> > > error.
> >
> > > I have a million book records stored in pb format. I don't want to
> > > lose my data. Can somebody help, please? As an alternative, I will
> > > restore my data using Python. But I want to use it in C++.
> >
>
