Message forwarding and partial parsing
Hi,

I am wondering about the best way of forwarding received protocol buffer messages from one entity to another without having to parse the entire message just to serialize it again.

My scenario is the following: I have a process A connected to process B using local IPC. B is in turn connected to process C on another machine using TCP, and C is connected to process D using local IPC, i.e. A-B-C-D. Process A wants to send messages to processes B, C and D to control their operations. Process A has no concept of TCP/IP and uses process B to forward messages to the 'C' processes running on other machines (each machine has a unique id). Each machine might have several 'D' processes running (each has a unique id).

The basic message is similar to this:

    message MyMessage {
      extensions 100 to max;
    }

and several messages that would make sense to B, C and D are declared similar to this:

    message MyExtension {
      extend MyMessage {
        optional MyExtension my_extension = 100;
      }
      ...
    }

In a naive implementation, a message sent from A to D would be serialized by A, deserialized by B, serialized by B, deserialized by C, serialized by C and then finally deserialized by D. This seems a bit too much to me, so I am hoping that someone would be willing to comment on the possible solutions for routing messages while minimizing unnecessary serialization/deserialization overhead. I have several options:

Option 1: Extend the MyMessage message with destination information like this:

    message MyMessage {
      optional MyId destination_id = 1;
      ...
    }

When process B deserializes the message, it can look at the destination_id to decide where to forward the message. The problem with this is that some extensions would be recognized by process B even though they are aimed at process C, which I'm _guessing_ means that the extension would be parsed and then encoded again when the message is forwarded. So I'm thinking this approach is out.
Option 2: Extend MyMessage with an internal message:

    message MyMessage {
      optional MyId destination_id = 1;
      optional bytes internal_message = 2;
      ...
    }

Now process B would not have to parse the internal message. However, process A would have to first serialize the message to a byte sequence, then insert that into another message and serialize that. This seems awkward to me.

Option 3: Extend the header sent on the channel with more information. Right now I am sending the length of the message first, then the actual serialized message. This could be extended into more of a header that carries the destination id as well. It sounds like a protocol buffer message would be suitable for use as a header, something like this:

    message MyHeader {
      optional MyId destination_id = 1;
      required uint32 message_length = 2;
    }

On the wire, I would still need to first send the length of the header (or possibly make sure that the header has a fixed length), then the serialized header followed by the serialized message. Process B could then simply forward the bytes of the message without having to parse the contents.

Of these three options, I'm thinking that option 3 is the correct way to go. Am I missing some functionality provided by protocol buffers (such as the ability to skip parsing extensions even if they are recognized, or to parse only as much as needed)? Am I missing any problems?

On a somewhat related note, is it possible to parse a partially transmitted message and continue parsing at a later time when more data is available? That is, since I cannot guarantee that all data for a message is available at once, do I need to buffer data until I know that I have the entire message (which is what I do today) before allowing protocol buffers to parse it?

Example: the message X, consisting of a number of fields, is sent on the wire. It is delivered on the other side of the connection as a series of chunks.
For instance, in a theoretical scenario the first chunk could contain the first field descriptor, the first data value and half of the second field descriptor. The next chunk could contain the second half of the second field descriptor and half of the second data value, and the last chunk could contain the rest of the message. Can I allow protocol buffers to parse the chunks of data as they come in without having to worry about half field descriptors, half data values and so on?

I see that there are ParsePartialFrom... functions for messages, but the documentation states that the difference between these and the regular ParseFrom... functions is that they allow required fields to be missing. I assume that this means that there is no partial parse functionality in the sense that partial field descriptors or partial values can be continued at a later time?

Sorry for the lengthy post... Any comments on either problem are appreciated!

Cheers,
V
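Option 3 from the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: the header is a fixed 8 bytes (a 4-byte destination id followed by a 4-byte payload length, both big-endian) rather than a serialized MyHeader, and the names RoutingHeader, encode_frame and read_header are hypothetical. The payload stands in for an already-serialized MyMessage and is treated as opaque bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical fixed-size routing header: 4-byte destination id followed by
// a 4-byte payload length, both big-endian. The payload is never inspected.
constexpr std::size_t kHeaderSize = 8;

struct RoutingHeader {
    std::uint32_t destination_id;
    std::uint32_t payload_length;
};

void put_u32_be(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

std::uint32_t get_u32_be(const std::string& in, std::size_t pos) {
    assert(pos + 4 <= in.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}

// Sender side (process A): wrap an opaque, already-serialized payload.
std::string encode_frame(std::uint32_t destination_id, const std::string& payload) {
    std::string frame;
    put_u32_be(frame, destination_id);
    put_u32_be(frame, static_cast<std::uint32_t>(payload.size()));
    frame += payload;
    return frame;
}

// Forwarder side (process B or C): decode only the header. Returns false if
// fewer than kHeaderSize bytes are available.
bool read_header(const std::string& frame, RoutingHeader* header) {
    if (frame.size() < kHeaderSize) return false;
    header->destination_id = get_u32_be(frame, 0);
    header->payload_length = get_u32_be(frame, 4);
    return true;
}
```

A forwarder built this way would look up the outgoing connection from destination_id and relay the following payload_length bytes unchanged, so only the final recipient ever parses the message.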
Protocol Buffers: Protocol message end-group tag did not match expected tag.
Hi there,

I recently started working with Protocol Buffers. I used the address book example to become acquainted with them (http://code.google.com/intl/de-DE/apis/protocolbuffers/docs/javatutorial.html). The only difference is that I use an OutputStream to write the address book instance (the example uses a FileOutputStream).

Everything seems to work: I compiled the proto file and imported it into my Java project, and that compiles without errors. But when my client code tries to parse the address book instance from the server, the following error message appears:

    com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
      at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
      at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:105)
      at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:202)
      at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:664)
      at protoc.Example$AddressBook.parseFrom(Example.java:929)
      at protoc.Proto.createConnection(Proto.java:33)
      at protoc.Proto.main(Proto.java:24)

That's the code:

(Server)

    ...
    Person.Builder person = Person.newBuilder();
    person.setName("Peter");
    person.setId(5);
    AddressBook.Builder addressbook = AddressBook.newBuilder();
    addressbook.addPerson(person.build());
    addressbook.build().writeTo(client.getOutputStream()); // client is a Socket object
    client.close();

(Client)

    ...
    public static void createConnection() {
      server = null;
      try {
        server = new Socket("192.168.1.30", 4141);
        System.out.println("Connected to server");
        AddressBook mission2 = AddressBook.parseFrom(server.getInputStream());
      } catch (UnknownHostException e) {
        e.printStackTrace();
      } catch (IOException e) {
        e.printStackTrace();
      }
      closeConnection();
    }

The proto file ...
    package protoc;

    option java_package = "protoc";
    option java_outer_classname = "Example";

    message Person {
      required string name = 1;
      required int32 id = 2;
      optional string email = 3;

      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }

      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }

      repeated PhoneNumber phone = 4;
    }

    message AddressBook {
      repeated Person person = 1;
    }

Can anyone tell me what is wrong? I can't find my mistake ... :(

Thank you very much in advance!
Ramon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---
Re: Save multiple messages to a single file C++
Thanks a lot Kenton. I went for the second option. If you are storing a lot of data in pb format, wrapping it in an outer message forces you to parse everything at once. It is like DOM vs SAX for XML, I guess.

Thanks again!
p.

On Oct 5, 10:17 pm, Kenton Varda ken...@google.com wrote:
> The easiest solution is to create an outer message like:
>
>     message NodeList {
>       repeated Node node = 1;
>     }
>
> then write a single NodeList to the file, and read it back again as a single NodeList.
>
> If you really need to read/write individual messages separately (because the file is too big to read/write all at once), see:
> http://code.google.com/apis/protocolbuffers/docs/techniques.html#stre...
>
> On Mon, Oct 5, 2009 at 9:08 PM, Petko petko.bogda...@gmail.com wrote:
> > Hello,
> >
> > Is there a way to save multiple messages to a single file and then read them back? I want to be able to do something like this:
> >
> >     message Node {
> >       required int64 id = 1;
> >     }
> >
> > main.cc:
> >
> >     fstream out("test", ios::out | ios::binary);
> >     for (int i = 0; i < 20; i++) {
> >       Node n;
> >       string s;
> >       n.set_id(i);
> >       // SAVE n TO out
> >     }
> >     out.close();
> >
> >     fstream in("test", ios::in | ios::binary);
> >     while (!in.eof()) {
> >       Node n;
> >       // READ ONE NODE FROM in TO n
> >     }
> >     in.close();
> >
> > What should I use in the positions "// SAVE n TO out" and "// READ ONE NODE FROM in TO n"?
> >
> > Any comments and ideas welcome.
> >
> > --petko
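For the case where messages do need to be written and read one at a time, a common approach is to put a length in front of each serialized message. The sketch below assumes a 4-byte little-endian length prefix; write_record, read_record and read_all are hypothetical helper names, and plain strings stand in for serialized messages. In a real program the bytes would come from the message's SerializeToString() and be handed to ParseFromString() after reading, and the length would be checked against a sane upper bound before allocating.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes one record: a 4-byte little-endian length, then the raw bytes.
void write_record(std::ostream& out, const std::string& bytes) {
    const std::uint32_t n = static_cast<std::uint32_t>(bytes.size());
    char prefix[4];
    for (int i = 0; i < 4; ++i)
        prefix[i] = static_cast<char>((n >> (8 * i)) & 0xFF);
    out.write(prefix, 4);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Reads one record into *bytes. Returns false at end of input or on a
// truncated record, so the caller loops on the return value instead of
// testing in.eof() before the read.
bool read_record(std::istream& in, std::string* bytes) {
    assert(bytes != nullptr);
    char prefix[4];
    if (!in.read(prefix, 4)) return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(static_cast<unsigned char>(prefix[i])) << (8 * i);
    bytes->resize(n);
    if (n == 0) return true;
    return static_cast<bool>(in.read(&(*bytes)[0], static_cast<std::streamsize>(n)));
}

// Reads every complete record from an in-memory copy of the file.
std::vector<std::string> read_all(const std::string& file_contents) {
    std::istringstream in(file_contents);
    std::vector<std::string> records;
    std::string record;
    while (read_record(in, &record)) records.push_back(record);
    return records;
}
```

The same two functions work on an fstream opened with ios::binary, which is where "// SAVE n TO out" and "// READ ONE NODE FROM in TO n" would go in the original loop.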
Re: Save multiple messages to a single file C++
This whole topic - how to save multiple messages to a single stream - comes up frequently enough that I'm starting to think there should be a more flexible answer than what's in the FAQ.

Declaring a one-byte End of Object marker seems like it would be one way to handle it. Whatever the answer is, it should keep in mind that protocol buffers may not be coming from files but streaming from sockets (e.g. TCP).

If nothing else, I think this should be addressed for the sake of consistency. I've been encoding a 32-bit length before my protocol buffers, which works just fine, but like I said, consistency would be helpful.

Just my $0.02.
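One alternative to a fixed 32-bit length is to encode the length as a base-128 varint, the same integer encoding the protocol buffer wire format uses internally, so short messages pay only one byte of overhead. The following is a minimal sketch of that encoding; append_varint and read_varint are hypothetical helper names, not part of the protobuf library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends `value` as a base-128 varint: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Decodes a varint starting at *pos. On success stores the result, advances
// *pos past it and returns true. Returns false if the input ends early or
// the varint is longer than 10 bytes.
bool read_varint(const std::string& in, std::size_t* pos, std::uint64_t* value) {
    assert(pos != nullptr && value != nullptr);
    std::uint64_t result = 0;
    std::size_t p = *pos;
    for (int shift = 0; shift < 70; shift += 7) {
        if (p >= in.size()) return false;
        const unsigned char byte = static_cast<unsigned char>(in[p++]);
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            *value = result;
            *pos = p;
            return true;
        }
    }
    return false;
}
```

A length prefix of either kind also sidesteps the main difficulty with a one-byte end marker: a serialized message is arbitrary binary data, so any marker byte can also occur inside the payload.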
Re: Message forwarding and partial parsing
On Wed, Oct 7, 2009 at 5:46 AM, villintehaspam villintehas...@gmail.com wrote:
> I am wondering about the best way of forwarding received protocol buffer messages from one entity to another without having to parse the entire message just to serialize it again.

It looks like you've figured out all the major options. One thing I'd encourage you to do, if you haven't already, is actually profile your system to find out if repeated parsing and serialization is a real problem for you. It may not be a real problem in practice even if it feels wrong.

> Of these three options, I'm thinking that option 3 is the correct way to go.

All three options are reasonable. Option 3 is the most complicated solution, but probably the most performant.

> Am I missing some functionality provided by protocol buffers (such as the ability to skip parsing extensions even if they are recognized or similar or only parse as much as needed)? Am I missing any problems?

If you are using C++, then all compiled-in extensions will be eagerly parsed. If you only compile in the extensions that each process actually cares about, that solves your problem. In Java you provide an ExtensionRegistry listing the extensions you care about, so it's trivial to include only the ones you want. I'm guessing you aren't using Java.

> On a somewhat related note, is it possible to parse a partially transmitted message and continue parsing at a later time when more data is available?

Not without blocking. The library is designed to parse an entire message at once. Allowing partial parsing (without blocking) would be quite complicated.
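Since the library parses a whole message at once, the buffering has to happen on the application side. A minimal sketch of such a buffer follows; FrameBuffer is a hypothetical class, and the 4-byte big-endian length prefix is an assumption (the original post only says the length is sent first). Chunks are appended as they arrive, and a message is released only once all of its bytes are present, at which point it could be passed to ParseFromString().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Accumulates bytes that arrive in arbitrary chunks and hands out complete
// messages only. Each message is assumed to be preceded by a 4-byte
// big-endian length.
class FrameBuffer {
public:
    // Appends a chunk exactly as it came off the socket.
    void feed(const char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        if (size == 0) return;
        buffer_.append(data, size);
    }

    // If a whole message is buffered, moves it into *message and returns
    // true. Otherwise leaves the buffer untouched and returns false.
    bool next(std::string* message) {
        if (buffer_.size() < 4) return false;
        std::uint32_t length = 0;
        for (std::size_t i = 0; i < 4; ++i)
            length = (length << 8) | static_cast<unsigned char>(buffer_[i]);
        if (buffer_.size() - 4 < length) return false;
        message->assign(buffer_, 4, length);
        buffer_.erase(0, 4 + static_cast<std::size_t>(length));
        return true;
    }

private:
    std::string buffer_;
};
```

The caller would call feed() after every read from the connection and then call next() in a loop until it returns false, because one chunk may complete several messages.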
Message::ByteSize() wrong
The value returned by Message::ByteSize() does not match the actual number of bytes that are consumed after writing a message to a stream.

Example:

    some_message m;
    /* ... populate m ... */
    size_t len = m.ByteSize();
    /* save the current position of the stream */
    int pos = boost::iostreams::position_to_offset(stream.tellp());
    m.SerializeToOstream(stream);
    /* and now this assert will fail */
    assert(pos + len == boost::iostreams::position_to_offset(stream.tellp()));

Maybe I'm missing something here, but shouldn't the value returned by ByteSize() be the same as the actual number of bytes written to the stream?
Re: Message::ByteSize() wrong
Here you go... attached is an example that fails quite reliably (for me). Compile like so:

    mkdir build
    cd build
    cmake ../
    make
    ./test-pb

On Wed, Oct 7, 2009 at 4:49 PM, Kenton Varda ken...@google.com wrote:
> ByteSize() definitely returns the right value -- if it didn't, tons of stuff would be broken. Can you provide a complete example program that demonstrates your problem?
>
> On Wed, Oct 7, 2009 at 4:01 PM, Brenden Matthews bren...@diddyinc.com wrote:
> > The value returned by Message::ByteSize() does not match the actual number of bytes that are consumed after writing a message to a stream.
> >
> > Example:
> >
> >     some_message m;
> >     /* ... populate m ... */
> >     size_t len = m.ByteSize();
> >     /* save the current position of the stream */
> >     int pos = boost::iostreams::position_to_offset(stream.tellp());
> >     m.SerializeToOstream(stream);
> >     /* and now this assert will fail */
> >     assert(pos + len == boost::iostreams::position_to_offset(stream.tellp()));
> >
> > Maybe I'm missing something here, but shouldn't the value returned by ByteSize() be the same as the actual number of bytes written to the stream?

pbtest.tar.bz2
Description: BZip2 compressed data
Re: Message::ByteSize() wrong
This example is too big for me to debug. Can't you reproduce this with a 10-line program + proto file?

On Wed, Oct 7, 2009 at 6:11 PM, Brenden Matthews bren...@diddyinc.com wrote:
> Oops, had some bad math in that last sample. This is more correct (but still fails).
>
> On Wed, Oct 7, 2009 at 5:33 PM, Brenden Matthews bren...@diddyinc.com wrote:
> > Here you go... attached is an example that fails quite reliably (for me).
Re: Message::ByteSize() wrong
On Wed, Oct 7, 2009 at 6:48 PM, Henner Zeller h.zel...@acm.org wrote:
> Still haven't run it (I only seem to have a cmake that is too old; a simple Makefile with a .proto and .cc file showing the problem would be better. Please strip down the example if someone is to help debug it ;) )
>
> Anyway, it seems that in write_message() you write a header with some magic number and the size:
>
>     stream << magic8;
>     stream << (uint8_t)data_types::PB_DATA;
>     stream << _htole32(message.GetCachedSize());
>
> ... but wouldn't this write the output in decimal instead of the binary that you intend? So it will not be exactly 4 bytes. Better to write this in binary ;)
>
> (And BTW, it is not a good idea to use system macros such as _htole32(); better use htonl().)

Indeed, except I need it to work on Windows with MSVC and MinGW. I decided after much fussing around to just roll my own.
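The difference between formatted and binary output discussed above can be shown with a small sketch. It is not the poster's actual helper: formatted, write_u32_le and binary are hypothetical names, and little-endian order is assumed only because the original code used _htole32(). Building the bytes by shifting avoids platform-specific byte-order macros entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// operator<< formats integers as decimal text, so the number of bytes
// written depends on the value.
std::string formatted(std::uint32_t value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Writes exactly four bytes, least significant first, on any platform.
// Shifting avoids both htole32() and any dependence on host byte order.
void write_u32_le(std::ostream& out, std::uint32_t value) {
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
    out.write(bytes, 4);
}

// Convenience wrapper that returns the four bytes as a string.
std::string binary(std::uint32_t value) {
    std::ostringstream out;
    write_u32_le(out, value);
    assert(out.good());
    return out.str();
}
```

With the formatted version a size of 7 occupies one byte on the stream and a size of 123456 occupies six, which is enough to make a position check based on a fixed 4-byte header fail.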