Re: XML to Protocol Buffers converter

2009-09-29 Thread sim

Cool. New syntax I didn't know about!  I think this would be useful
for converting the other way around (Proto-to-XML).
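
For the Proto-to-XML direction, the option-driven mapping quoted below could be sketched roughly as follows. This is only an illustration under stated assumptions: plain dicts stand in for protobuf messages, the SCHEMA table is a hypothetical stand-in for descriptors carrying an (xml_disposition) option, and the "name" field on Bar is invented for the demo. It does not use the real protobuf reflection API.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for descriptors carrying an (xml_disposition) option:
# message type -> field name -> (disposition, nested message type or None).
SCHEMA = {
    "Foo": {"i": ("ATTRIBUTE", None), "bar": ("ELEMENT", "Bar")},
    "Bar": {"name": ("ATTRIBUTE", None)},  # Bar's field is invented for the demo
}


def to_xml(type_name, tag, message):
    """Convert a dict standing in for a message of type_name into an XML element."""
    element = ET.Element(tag)
    for field, value in message.items():
        disposition, nested_type = SCHEMA[type_name][field]
        if disposition == "ATTRIBUTE":
            # Scalar field rendered as an XML attribute on the parent element.
            element.set(field, str(value))
        elif nested_type is not None:
            # Nested message rendered as a child element, converted recursively.
            element.append(to_xml(nested_type, field, value))
        else:
            # Scalar field rendered as a child element with text content.
            ET.SubElement(element, field).text = str(value)
    return element


foo = {"i": 42, "bar": {"name": "x"}}
xml_text = ET.tostring(to_xml("Foo", "Foo", foo), encoding="unicode")
```

Here "i" ends up as an attribute of the Foo element and "bar" as a child element, which is the distinction the two disposition values are meant to express.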

On Sep 28, 6:05 pm, Kenton Varda ken...@google.com wrote:
 Interesting.

 Another way to do this would be to write code based on protobuf reflection
 and custom options, so you could have a proto like:
   message Foo {
     optional int32 i = 1 [(xml_disposition) = ATTRIBUTE];
     optional Bar bar = 2 [(xml_disposition) = ELEMENT];
   }

 On Mon, Sep 28, 2009 at 6:14 AM, sim simon.we...@gmail.com wrote:

  Hi all. Would anybody be interested in an XML to Protocol Buffers
  converter if it were opened up to the community?  I have an XML
  stylesheet that transforms an annotated XSD set into a Java class that
  uses JAXB and the Protocol Buffers Java API to convert XML documents
  into either text or binary mode Protocol Buffers messages.  The XSD
  annotations define the mappings from XSD elements to Proto messages
  (although at present staying close to a 1:1 mapping is probably
  safest). Some fancy XSD features are not yet supported but the usual
  complexTypes, simpleTypes, elements, and enumerations work.

  Simon Weeks


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



protobuf-c compilation errors

2009-09-29 Thread zavi

Hello,

I'm trying to compile protobuf-c and keep getting errors that I didn't get
when compiling on other machines.
Obviously the preliminary protobuf compilation succeeded. I should
also mention that the previous successful installations were on a
64-bit machine, whereas the current machine I'm trying to install on is
32-bit.

I'm trying to install using a predefined prefix. The protobuf
installation was performed to the same target directory.

When I run:
./configure --prefix=prefix dir
I get this error -
checking google/protobuf/stubs/common.h usability... no
checking google/protobuf/stubs/common.h presence... no
checking for google/protobuf/stubs/common.h... no
configure: error:
ERROR: protobuf headers are required.
You must either install protobuf from google,
or if you have it installed in a custom location
you must add '-Iincludedir' to CXXFLAGS
and '-Llibdir' to LDFLAGS.

So I tried running this instead:
./configure CXXFLAGS=-Iprefix dir/include LDFLAGS=-L=prefix dir/lib --prefix==prefix dir

The operation succeeds although with an alarming warning:
configure: WARNING: google/protobuf/stubs/common.h: accepted by the
compiler, rejected by the preprocessor!
configure: WARNING: google/protobuf/stubs/common.h: proceeding with
the compiler's result

When I try to run make now I fail on this error:
make[2]: Entering directory `install dir/new_installs/protobuf-
c-0.11/src/test'
/bin/sh ../../libtool --tag=CXX --mode=link g++  -Iprefix dir/
include  -Lprefix dir/lib -o cxx-generate-packed-data  cxx-generate-
packed-data.o test-full.pb.o -lprotobuf
g++ -Iprefix dir/include -o cxx-generate-packed-data cxx-generate-
packed-data.o test-full.pb.o  -Lprefix dir/lib prefix dir//lib/
libprotobuf.so   -Wl,--rpath -Wl,prefix dir//lib -Wl,--rpath -
Wl,prefix dir//lib
test-full.pb.o: In function `google::protobuf::GoogleOnceInit(int*,
void (*)())':
test-full.pb.cc:(.text._ZN6google8protobuf14GoogleOnceInitEPiPFvvE
[google::protobuf::GoogleOnceInit(int*, void (*)())]+0x14): undefined
reference to `pthread_once'
collect2: ld returned 1 exit status
make[2]: *** [cxx-generate-packed-data] Error 1

I'd appreciate any ideas. I'm at a loss here.

Some technical details:
protobuf  2.1.0
protobuf-c 0.11
OS RedHat 5.2 i386
gcc 4.3.1

Thank you very much,
Aviad




ProtoBuf.Net: Generic reader

2009-09-29 Thread test.f...@nomail.please

Is there a way to generically read a protoBuf byte[] and extract a
field tag/name to value mapping?
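
For context, the wire format itself stores only field numbers and wire types, not field names, so a schema-less reader can recover a field-number-to-raw-value mapping but needs the .proto (or a descriptor) to attach names and exact types. A minimal sketch of such a walk, written in Python purely for illustration (it is not protobuf-net code, and the function names are made up here):

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # upper bit clear marks the last byte
            return result, pos
        shift += 7


def read_fields(buf):
    """Walk a serialized message; return a list of (field_number, wire_type, raw_value)."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, returned as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, returned as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the string "testing".
sample = bytes([0x08, 0x96, 0x01, 0x12, 0x07]) + b"testing"
fields = read_fields(sample)
```

A length-delimited value may itself be a nested message, so a display tool would have to guess (or be told) when to recurse into it; that ambiguity is one reason a descriptor is usually needed for a faithful view.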



Can serialized messages be used reliably as keys?

2009-09-29 Thread alopecoid

Hi,

Can serialized messages be used reliably as keys?

In other words, is it guaranteed that...

- Two equal messages will always generate equal byte sequences?
(Are fields always written in the same order?)

- Two unequal messages will always generate unequal byte sequences?
(Are tag identifiers enough to delimit variable length fields from
accidentally producing equal byte sequences?)

I have a feeling that the answer is no. For example, given a proto
with two fields, both variable length int64 types, it seems that two
unequal messages could, by chance, generate the same byte sequence:

[1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
[1 byte tag] [2 byte value] [1 byte tag] [3 byte value] = 7 bytes
[1 byte tag] [6 byte value] = 7 bytes
... etc.

If those 7 bytes just happen to be equal, then the serialized messages
can NOT be used reliably as keys.

Thoughts?

Thank you.



Re: Can serialized messages be used reliably as keys?

2009-09-29 Thread Jon Skeet

On Sep 29, 8:22 pm, alopecoid alopec...@gmail.com wrote:
 Can serialized messages be used reliably as keys?

 In other words, is it guaranteed that...

 - Two equal messages will always generate equal byte sequences?
 (Are fields always written in the same order?)

 - Two unequal messages will always generate unequal byte sequences?
 (Are tag identifiers enough to delimit variable length fields from
 accidentally producing equal byte sequences?)

 I have a feeling that the answer is no. For example, given a proto
 with two fields, both variable length int64 types, it seems that two
 unequal messages could, by chance, generate the same byte sequence:

 [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
 [1 byte tag] [2 byte value] [1 byte tag] [3 byte value] = 7 bytes
 [1 byte tag] [6 byte value] = 7 bytes
 ... etc.

 If those 7 bytes just happen to be equal, then the serialized messages
 can NOT be used reliably as keys.

Given that the serialized bytes have to be able to *deserialize* back
to the original messages, surely if those original messages aren't
equal, the serialized forms would have to be different too - assuming
we're talking about the same message type. (Two messages of different
types could serialize to the same data, admittedly.)

The Java serialized form does serialize all the fields in order, I
believe.

Jon




Re: Can serialized messages be used reliably as keys?

2009-09-29 Thread Kenton Varda

On Tue, Sep 29, 2009 at 12:22 PM, alopecoid alopec...@gmail.com wrote:


 Hi,

 Can serialized messages be used reliably as keys?

 In other words, is it guaranteed that...

 - Two equal messages will always generate equal byte sequences?
 (Are fields always written in the same order?)


Only if:

1) The implementation you are using writes fields in canonical order.  All
official implementations do this, and probably most unofficial ones.
However, implementations are technically allowed to write fields in any
order.

2) There are no unknown fields in the message.  Your message may have
unknown fields if you originally parsed it off the wire and the sender is a
newer binary that knows about new fields recently added to the .proto file.
In C++ you can get rid of all unknown fields in a message by calling the
DiscardUnknownFields() method.

If possible, I would recommend designing your application such that it only
requires that equal messages have the same serialization *most* of the time.
For example, if you were designing a cache where the cache key is the hash
of a serialized message, then the worst that can happen if two equal
messages had different serializations is that you'd perform the same
operation twice rather than hitting cache.  As long as this is relatively
rare, it's no big deal.
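
A rough sketch of that cache design, assuming SHA-256 as the hash and hand-written byte strings in place of real serialized messages (the class and variable names are made up for the example):

```python
import hashlib


class ResultCache:
    """Cache keyed by the SHA-256 digest of a message's serialized bytes."""

    def __init__(self, compute):
        self._compute = compute  # function: serialized bytes -> result
        self._results = {}
        self.misses = 0

    def get(self, serialized):
        key = hashlib.sha256(serialized).digest()
        if key not in self._results:
            # Unseen byte sequence: do the work and remember the result.
            self.misses += 1
            self._results[key] = self._compute(serialized)
        return self._results[key]


# Two serializations of the same logical message (field 1 = 1, field 2 = 2),
# written in different field orders.
canonical = bytes([0x08, 0x01, 0x10, 0x02])
reordered = bytes([0x10, 0x02, 0x08, 0x01])

cache = ResultCache(compute=len)  # len stands in for an expensive operation
cache.get(canonical)
cache.get(canonical)  # hit: identical bytes
cache.get(reordered)  # miss: same content, different bytes
```

The reordered serialization only costs one redundant computation; nothing incorrect is returned, which is the property being recommended.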


 - Two unequal messages will always generate unequal byte sequences?


As Jon said, this clearly has to be true.  If two messages could have the
same serialization, then how would the parser know which one to produce when
parsing?




Re: Can serialized messages be used reliably as keys?

2009-09-29 Thread Kenton Varda

On Tue, Sep 29, 2009 at 12:41 PM, alopecoid alopec...@gmail.com wrote:

 But, as in my example, that doesn't seem to be the case (necessarily).
 Again, for example, let's say you have two messages, both of the same
 type. The proto defines two optional fields, both of type variable
 int64.

 Say message A populates both optional fields:
 [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes

 And message B populates only one optional field:
 [1 byte tag] [6 byte value] = 7 bytes

 Couldn't these generate, by chance, the same 7 bytes?


No.


 Yes, using
 deserialize will correctly parse two unequal messages, but if you look
 at the raw serialized byte sequences, they could actually be the same.


That makes no sense.  If the bytes were the same, how would deserializing
them be able to produce unequal messages?

If you must know the details, in the varint encoding, the upper bit of each
byte is used to indicate whether there are more bytes in the value.  So, in
a 3-byte varint, the first two bytes have the upper bit set, but the last
byte does not.  So obviously a 3-byte varint cannot start with the same
bytes as a 6-byte varint, because in a 6-byte varint the third byte would
have the upper bit set.

However, these details really aren't necessary to answer the question.  The
same bytes, passed to the same parsing function, will produce the same
output, regardless of how the encoding works.
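
The varint argument can be checked with a small encoder. This is an illustrative sketch for non-negative integers only; encode_varint is a name chosen here, not a library function:

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the upper bit
        else:
            out.append(low7)         # last byte: upper bit clear
            return bytes(out)


three = encode_varint(1 << 14)  # smallest value that needs 3 bytes
six = encode_varint(1 << 35)    # smallest value that needs 6 bytes

# The third byte ends the 3-byte varint but continues the 6-byte one.
third_byte_ends_short = (three[2] & 0x80) == 0
third_byte_continues_long = (six[2] & 0x80) != 0
```

Since the third byte has its upper bit clear in one encoding and set in the other, the two byte sequences differ at that position no matter what the payload bits are.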




Re: ProtoBuf.Net: Generic reader

2009-09-29 Thread Marc Gravell

Well... how complex is the data? Reflection seems the most obvious
choice if it is available - especially since that will work well with
things like PropertyGrid (if you are in winforms). If I understood the
scenario better I may have more ideas...

Marc

On Sep 29, 6:39 pm, test.f...@nomail.please test.f...@gmail.com
wrote:
 Bummer.. It would've been a great feature.  I'm faced with displaying
 several complex nested protos and the simplest way would've been a 2
 column list view that was populated by a generic proto reader.

 I'm looking at Jon's solution, but I really don't want to have to
 implement and maintain 2 protobufs for each object in the client, just
 for display purposes.

 Thanks

 On Sep 29, 10:58 am, Marc Gravell marc.grav...@gmail.com wrote:



  In protobuf-net? No. You could deserialize into the expected type and
  use reflection, though. It is perhaps something I could consider
  should I find time though - presumably in the non-type based branch
  (experimental; unstable; incomplete...).

  Jon's version may have other options here? dotnet-protobufs

  On Sep 29, 4:03 pm, test.f...@nomail.please test.f...@gmail.com
  wrote:

   Is there a way to generically read a protoBuf byte[] and extract a
   field tag/name to value mapping?



Re: Can serialized messages be used reliably as keys?

2009-09-29 Thread Henner Zeller

On Tue, Sep 29, 2009 at 12:41, alopecoid alopec...@gmail.com wrote:

 Given that the serialized bytes have to be able to *deserialize* back
 to the original messages, surely if those original messages aren't
 equal, the serialized forms would have to be different too - assuming
 we're talking about the same message type

 But, as in my example, that doesn't seem to be the case (necessarily).
 Again, for example, let's say you have two messages, both of the same
 type. The proto defines two optional fields, both of type variable
 int64.

 Say message A populates both optional fields:
 [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes

 And message B populates only one optional field:
 [1 byte tag] [6 byte value] = 7 bytes

The varints are self-synchronizing
  http://code.google.com/apis/protocolbuffers/docs/encoding.html#varints
i.e. the most significant bit is set in every byte except the last one.
So the 3-byte value will look something like 1xxxxxxx 1xxxxxxx 0xxxxxxx,
while the 6-byte value will have its msb set to 1 in the third byte. So
they will always be different.

So yes, they will be different. As Jon said: the protocol decoder
needs to be able to decode it properly - a confusion between (3-byte
varint + tag + 2-byte varint) and (6-byte varint) would not work. So two
different messages of the same message type always have different
encodings (however, two messages of different types could theoretically
encode to the same bytes).

The thing you have to worry about more is the _sequence_ in which the
tags are encoded. The decoder does not care in which sequence the
fields are encoded, so it could be that messages with the same content
can be encoded in different ways.

However the encoding in the Google implementation guarantees that the
fields are always in a consistent order (I guess too many people
relied on the fact that messages can be used as a key/can be hashed).
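
To make the ordering point concrete, here is a toy encoder for messages that contain only non-negative varint fields. It is a sketch, not the Google implementation; the function names are invented, and "canonical" here simply means sorted by field number:

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def encode_int_fields(fields):
    """Serialize (field_number, value) pairs of varint fields in the order given."""
    out = bytearray()
    for number, value in fields:
        out += encode_varint(number << 3)  # key: wire type 0 (varint) in the low 3 bits
        out += encode_varint(value)
    return bytes(out)


def encode_canonical(fields):
    """Serialize the same fields sorted by field number."""
    return encode_int_fields(sorted(fields))


a = [(1, 300), (2, 7)]
b = [(2, 7), (1, 300)]  # same content, different write order

order_changes_bytes = encode_int_fields(a) != encode_int_fields(b)
canonical_bytes_match = encode_canonical(a) == encode_canonical(b)
```

A decoder accepts either ordering, so only the writer's choice of order determines whether equal content yields equal bytes.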

-h


 Couldn't these generate, by chance, the same 7 bytes? Yes, using
 deserialize will correctly parse two unequal messages, but if you look
 at the raw serialized byte sequences, they could actually be the same.
 

