Re: XML to Protocol Buffers converter
Cool. New syntax I didn't know about! I think this would be useful for converting the other way around (Proto-to-XML).

On Sep 28, 6:05 pm, Kenton Varda ken...@google.com wrote:
> Interesting. Another way to do this would be to write code based on protobuf reflection and custom options, so you could have a proto like:
>
>     message Foo {
>       optional int32 i = 1 [(xml_disposition) = ATTRIBUTE];
>       optional Bar bar = 2 [(xml_disposition) = ELEMENT];
>     }
>
> On Mon, Sep 28, 2009 at 6:14 AM, sim simon.we...@gmail.com wrote:
> > Hi all.
> >
> > Would anybody be interested in an XML to Protocol Buffers converter if it were opened up to the community?
> >
> > I have an XML stylesheet that transforms an annotated XSD set into a Java class that uses JAXB and the Protocol Buffers Java API to convert XML documents into either text or binary mode Protocol Buffers messages. The XSD annotations define the mappings from XSD elements to Proto messages (although at present staying close to a 1:1 mapping is probably safest).
> >
> > Some fancy XSD features are not yet supported but the usual complexTypes, simpleTypes, elements, and enumerations work.
> >
> > Simon Weeks

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---
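A rough sketch of the per-field disposition idea from the quoted reply, written in Python with plain dicts standing in for protobuf reflection. Everything here is an assumption made for illustration: the SCHEMAS table plays the role of the (xml_disposition) custom option, the to_xml helper is hypothetical, and the "name" field on Bar is invented because the thread never defines Bar.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE, ELEMENT = "ATTRIBUTE", "ELEMENT"

# Hypothetical stand-in for descriptors carrying an (xml_disposition) option:
# message type -> {field name: (disposition, nested message type or None)}
SCHEMAS = {
    "Foo": {"i": (ATTRIBUTE, None), "bar": (ELEMENT, "Bar")},
    "Bar": {"name": (ATTRIBUTE, None)},  # "name" is invented for this sketch
}

def to_xml(type_name, tag, values):
    """Build an XML element from a dict of field values, driven by SCHEMAS."""
    elem = ET.Element(tag)
    for field, (disposition, nested_type) in SCHEMAS[type_name].items():
        if field not in values:
            continue  # unset optional field: emit nothing
        value = values[field]
        if disposition == ATTRIBUTE:
            elem.set(field, str(value))
        elif nested_type is not None:
            elem.append(to_xml(nested_type, field, value))
        else:
            ET.SubElement(elem, field).text = str(value)
    return elem

foo = {"i": 7, "bar": {"name": "x"}}
xml_text = ET.tostring(to_xml("Foo", "Foo", foo), encoding="unicode")
print(xml_text)
```

With real protobuf reflection the table would be read from the field descriptors' options instead of being written by hand; the traversal itself would look much the same.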
protobuf-c compilation errors
Hello,

I'm trying to compile protobuf-c and keep getting errors that I didn't get when compiling on other machines. Obviously the preliminary protobuf compilation succeeded. I should also mention that the previous successful installations were on a 64-bit machine, whereas the machine I'm installing on now is 32-bit.

I'm trying to install using a predefined prefix. The protobuf installation was performed to the same target directory.

When I run:

    ./configure --prefix=prefix dir

I get this error:

    checking google/protobuf/stubs/common.h usability... no
    checking google/protobuf/stubs/common.h presence... no
    checking for google/protobuf/stubs/common.h... no
    configure: error: ERROR: protobuf headers are required.
    You must either install protobuf from google, or if you have it installed in a custom location you must add '-Iincludedir' to CXXFLAGS and '-Llibdir' to LDFLAGS.

So I tried running this instead:

    ./configure CXXFLAGS=-Iprefix dir/include LDFLAGS=-L=prefix dir/lib --prefix==prefix dir

The operation succeeds, although with an alarming warning:

    configure: WARNING: google/protobuf/stubs/common.h: accepted by the compiler, rejected by the preprocessor!
    configure: WARNING: google/protobuf/stubs/common.h: proceeding with the compiler's result

When I now run make, it fails with this error:

    make[2]: Entering directory `install dir/new_installs/protobuf-c-0.11/src/test'
    /bin/sh ../../libtool --tag=CXX --mode=link g++ -Iprefix dir/include -Lprefix dir/lib -o cxx-generate-packed-data cxx-generate-packed-data.o test-full.pb.o -lprotobuf
    g++ -Iprefix dir/include -o cxx-generate-packed-data cxx-generate-packed-data.o test-full.pb.o -Lprefix dir/lib prefix dir//lib/libprotobuf.so -Wl,--rpath -Wl,prefix dir//lib -Wl,--rpath -Wl,prefix dir//lib
    test-full.pb.o: In function `google::protobuf::GoogleOnceInit(int*, void (*)())':
    test-full.pb.cc:(.text._ZN6google8protobuf14GoogleOnceInitEPiPFvvE[google::protobuf::GoogleOnceInit(int*, void (*)())]+0x14): undefined reference to `pthread_once'
    collect2: ld returned 1 exit status
    make[2]: *** [cxx-generate-packed-data] Error 1

I'd appreciate any ideas. I'm at a loss here.

Some technical details:
protobuf 2.1.0
protobuf-c 0.11
OS RedHat 5.2 i386
gcc 4.3.1

Thank you very much,
Aviad
ProtoBuf.Net: Generic reader
Is there a way to generically read a protoBuf byte[] and extract a field tag/name to value mapping?
Can serialized messages be used reliably as keys?
Hi,

Can serialized messages be used reliably as keys? In other words, is it guaranteed that...

- Two equal messages will always generate equal byte sequences? (Are fields always written in the same order?)
- Two unequal messages will always generate unequal byte sequences? (Are tag identifiers enough to delimit variable length fields from accidentally producing equal byte sequences?)

I have a feeling that the answer is no. For example, given a proto with two fields, both variable length int64 types, it seems that two unequal messages could, by chance, generate the same byte sequence:

    [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
    [1 byte tag] [2 byte value] [1 byte tag] [3 byte value] = 7 bytes
    [1 byte tag] [6 byte value] = 7 bytes
    ... etc.

If those 7 bytes just happen to be equal, then the serialized messages can NOT be used reliably as keys.

Thoughts?

Thank you.
Re: Can serialized messages be used reliably as keys?
On Sep 29, 8:22 pm, alopecoid alopec...@gmail.com wrote:
> Can serialized messages be used reliably as keys? In other words, is it guaranteed that...
>
> - Two equal messages will always generate equal byte sequences? (Are fields always written in the same order?)
> - Two unequal messages will always generate unequal byte sequences? (Are tag identifiers enough to delimit variable length fields from accidentally producing equal byte sequences?)
>
> I have a feeling that the answer is no. For example, given a proto with two fields, both variable length int64 types, it seems that two unequal messages could, by chance, generate the same byte sequence:
>
>     [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
>     [1 byte tag] [2 byte value] [1 byte tag] [3 byte value] = 7 bytes
>     [1 byte tag] [6 byte value] = 7 bytes
>     ... etc.
>
> If those 7 bytes just happen to be equal, then the serialized messages can NOT be used reliably as keys.

Given that the serialized bytes have to be able to *deserialize* back to the original messages, surely if those original messages aren't equal, the serialized forms would have to be different too - assuming we're talking about the same message type. (Two messages of different types could serialize to the same data, admittedly.)

The Java serialized form does serialize all the fields in order, I believe.

Jon
Re: Can serialized messages be used reliably as keys?
On Tue, Sep 29, 2009 at 12:22 PM, alopecoid alopec...@gmail.com wrote:
> Hi,
>
> Can serialized messages be used reliably as keys? In other words, is it guaranteed that...
>
> - Two equal messages will always generate equal byte sequences? (Are fields always written in the same order?)

Only if:

1) The implementation you are using writes fields in canonical order. All official implementations do this, and probably most unofficial ones. However, implementations are technically allowed to write fields in any order.
2) There are no unknown fields in the message. Your message may have unknown fields if you originally parsed it off the wire and the sender is a newer binary that knows about new fields recently added to the .proto file. In C++ you can get rid of all unknown fields in a message by calling the DiscardUnknownFields() method.

If possible, I would recommend designing your application such that it only requires that equal messages have the same serialization *most* of the time. For example, if you were designing a cache where the cache key is the hash of a serialized message, then the worst that can happen if two equal messages had different serializations is that you'd perform the same operation twice rather than hitting cache. As long as this is relatively rare, it's no big deal.

> - Two unequal messages will always generate unequal byte sequences?

As Jon said, this clearly has to be true. If two messages could have the same serialization, then how would the parser know which one to produce when parsing?
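The cache design described above can be sketched in a few lines of Python. This is an illustration under assumptions, not code from any protobuf library: SerializedKeyCache is a hypothetical name, the byte strings are hand-built encodings of two varint fields, and summing the bytes stands in for whatever expensive operation the cache protects.

```python
import hashlib

class SerializedKeyCache:
    """Cache keyed by a hash of the serialized message bytes."""

    def __init__(self, compute):
        self._compute = compute   # expensive function: bytes -> result
        self._results = {}
        self.misses = 0

    def get(self, serialized: bytes):
        key = hashlib.sha256(serialized).digest()
        if key not in self._results:
            self.misses += 1      # cache miss: do the work and remember it
            self._results[key] = self._compute(serialized)
        return self._results[key]

# Hand-built encodings of the same logical message: field 1 = 1, field 2 = 2.
canonical = bytes([0x08, 0x01, 0x10, 0x02])   # fields written in number order
reordered = bytes([0x10, 0x02, 0x08, 0x01])   # legal, but a different byte string

cache = SerializedKeyCache(lambda data: sum(data))
cache.get(canonical)   # miss
cache.get(canonical)   # hit
cache.get(reordered)   # miss: equal message, different bytes, work is repeated
print(cache.misses)
```

The non-canonical encoding costs one redundant computation and nothing else, which is the "no big deal" case: correctness never depends on equal messages producing equal bytes.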
Re: Can serialized messages be used reliably as keys?
On Tue, Sep 29, 2009 at 12:41 PM, alopecoid alopec...@gmail.com wrote:
> But, as in my example, that doesn't seem to be the case (necessarily). Again, for example, let's say you have two messages, both of the same type. The proto defines two optional fields, both of type variable int64.
>
> Say message A populates both optional fields:
>
>     [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
>
> And message B populates only one optional field:
>
>     [1 byte tag] [6 byte value] = 7 bytes
>
> Couldn't these generate, by chance, the same 7 bytes?

No.

> Yes, using deserialize will correctly parse two unequal messages, but if you look at the raw serialized byte sequences, they could actually be the same.

That makes no sense. If the bytes were the same, how would deserializing them be able to produce unequal messages?

If you must know the details, in the varint encoding, the upper bit of each byte is used to indicate whether there are more bytes in the value. So, in a 3-byte varint, the first two bytes have the upper bit set, but the last byte does not. So obviously a 3-byte varint cannot start with the same bytes as a 6-byte varint, because in a 6-byte varint the third byte would have the upper bit set.

However, these details really aren't necessary to answer the question. The same bytes, passed to the same parsing function, will produce the same output, regardless of how the encoding works.
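To make the upper-bit argument concrete, here is a small Python sketch of base-128 varint encoding. The function name encode_varint is my own and the sketch handles non-negative values only; it is not taken from any protobuf implementation.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: upper bit set
        else:
            out.append(low7)         # last byte: upper bit clear
            return bytes(out)

three = encode_varint(1 << 14)  # smallest value that needs 3 bytes
six = encode_varint(1 << 35)    # smallest value that needs 6 bytes

# The third byte ends the 3-byte varint but continues the 6-byte one,
# so the two encodings can never share their first three bytes.
print(three.hex(), six.hex())
print(bool(three[2] & 0x80), bool(six[2] & 0x80))
```

The same check holds for any pair of values of those lengths, because the upper bit of the third byte is fixed by the length alone.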
Re: ProtoBuf.Net: Generic reader
Well... how complex is the data? Reflection seems the most obvious choice if it is available - especially since that will work well with things like PropertyGrid (if you are in winforms). If I understood the scenario better I may have more ideas...

Marc

On Sep 29, 6:39 pm, test.f...@nomail.please test.f...@gmail.com wrote:
> Bummer.. It would've been a great feature. I'm faced with displaying several complex nested protos and the simplest way would've been a 2 column list view that was populated by a generic proto reader. I'm looking at Jon's solution, but I really don't want to have to implement and maintain 2 protobufs for each object in the client, just for display purposes.
>
> Thanks
>
> On Sep 29, 10:58 am, Marc Gravell marc.grav...@gmail.com wrote:
> > In protobuf-net? No. You could deserialize into the expected type and use reflection, though. It is perhaps something I could consider should I find time though - presumably in the non-type based branch (experimental; unstable; incomplete...). Jon's version may have other options here? (dotnet-protobufs)
> >
> > On Sep 29, 4:03 pm, test.f...@nomail.please test.f...@gmail.com wrote:
> > > Is there a way to generically read a protoBuf byte[] and extract a field tag/name to value mapping?
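For reference, the wire format itself can be walked without any schema. The sketch below is plain Python, not protobuf-net or dotnet-protobufs API, and the names read_varint and read_fields are hypothetical. It recovers field numbers and raw values only: field names are not on the wire, and a length-delimited value could be a string, bytes, or a nested message, which cannot be told apart without the .proto. It also does no validation of truncated input.

```python
def read_varint(data: bytes, pos: int):
    """Read one varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # upper bit clear: last byte of the varint
            return result, pos
        shift += 7

def read_fields(data: bytes):
    """Return a list of (field_number, wire_type, raw_value) tuples."""
    fields, pos = [], 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:           # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 1:         # fixed 64-bit, returned as raw bytes
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:         # length-delimited, returned as raw bytes
            length, pos = read_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:         # fixed 32-bit, returned as raw bytes
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields

# Field 1 = varint 150, field 2 = the bytes "testing".
sample = bytes([0x08, 0x96, 0x01, 0x12, 0x07]) + b"testing"
for number, wire_type, value in read_fields(sample):
    print(number, wire_type, value)
```

A two-column display of tag number and raw value is possible this way, but showing names or properly typed nested values still needs the message definition, which is why reflection over the expected type was suggested.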
Re: Can serialized messages be used reliably as keys?
On Tue, Sep 29, 2009 at 12:41, alopecoid alopec...@gmail.com wrote:
> > Given that the serialized bytes have to be able to *deserialize* back to the original messages, surely if those original messages aren't equal, the serialized forms would have to be different too - assuming we're talking about the same message type
>
> But, as in my example, that doesn't seem to be the case (necessarily). Again, for example, let's say you have two messages, both of the same type. The proto defines two optional fields, both of type variable int64.
>
> Say message A populates both optional fields:
>
>     [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
>
> And message B populates only one optional field:
>
>     [1 byte tag] [6 byte value] = 7 bytes

Varints are self-synchronizing (http://code.google.com/apis/protocolbuffers/docs/encoding.html#varints), i.e. the most significant bit is set in every byte except the last one. So the 3-byte value will look something like 1xxxxxxx 1xxxxxxx 0xxxxxxx, while the 6-byte value will have its msb set to 1 in the third byte. So yes, they will always be different.

As Jon said: the protocol decoder needs to be able to decode it properly - a confusion between a (3 byte + tag + 2 byte varint) vs. (6 byte varint) would not work. So two different messages of the same message type always encode differently (however, two messages of different types could theoretically encode to the same bytes).

The thing you have to worry about more is the _sequence_ in which the tags are encoded. The decoder does not care in which sequence the fields are encoded, so messages with the same content can be encoded in different ways. However, the encoding in the Google implementation guarantees that the fields are always in a consistent order (I guess too many people relied on the fact that messages can be used as a key/can be hashed).

-h

> Couldn't these generate, by chance, the same 7 bytes?
>
> Yes, using deserialize will correctly parse two unequal messages, but if you look at the raw serialized byte sequences, they could actually be the same.
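The ordering concern can be shown with a toy encoder and decoder in Python. This is a sketch under assumptions: it supports varint fields only, the helper names are my own, and a repeated field number simply overwrites the earlier value. It is not how any official implementation is written.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)

def encode_fields(pairs) -> bytes:
    """Encode (field_number, value) pairs as varint fields, in the order given."""
    return b"".join(
        encode_varint(number << 3) + encode_varint(value)  # wire type 0
        for number, value in pairs
    )

def decode_fields(data: bytes) -> dict:
    """Decode varint-only fields into {field_number: value}."""
    result, pos = {}, 0

    def varint():
        nonlocal pos
        value = shift = 0
        while True:
            byte = data[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            if byte < 0x80:
                return value
            shift += 7

    while pos < len(data):
        number = varint() >> 3
        result[number] = varint()
    return result

in_order = encode_fields([(1, 300), (2, 5)])
swapped = encode_fields([(2, 5), (1, 300)])

print(in_order != swapped)                                # different bytes
print(decode_fields(in_order) == decode_fields(swapped))  # same message

# Re-encoding in field-number order gives one canonical byte string per message.
canonical = encode_fields(sorted(decode_fields(swapped).items()))
print(canonical == in_order)
```

If bytes from an encoder with unknown ordering must serve as keys, decoding and re-encoding in a fixed order, as in the last step, is one way to normalize them.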