Hi Kenton,

Let me start off by describing my usage scenario.
I'm interested in using protobuf to implement the messaging protocol between the clients and servers of a distributed messaging system. For simplicity, let's pretend that the protocol is similar to XMPP and that there are servers which handle delivering messages to and from clients. In this case, the server is clearly not interested in the meat of the messages being sent around; it is typically only interested in the routing data. Here, deferred decoding provides a substantial win. Furthermore, when the server passes the message on to the consumer, it does not need to encode the message again. For important messages, the server may be configured to persist them as they come in, so it would once again benefit from not having to re-encode the message. I don't think users could implement those optimizations on their own without support from the protobuf implementation, at least not as efficiently and elegantly.

You have to realize that the 'free encoding' holds true even for nested message structures. So let's say the user is aggregating data from multiple source protobuf messages, picking data out of them and placing it into a new protobuf message that then gets encoded. Only the outer message would need encoding; the inner nested elements which were picked from the other buffers would benefit from the 'free encoding'.

The overhead of the lazy decoding is exactly one extra "if (bean == null)" check, which is probably cheaper than most virtual dispatch invocations. But if you're really trying to milk the performance out of your app, you can just call buffer.copy() to get the bean backing the buffer. Get operations on the bean have none of that overhead.

Regarding threading: since the buffer is immutable and decoding is idempotent, you don't really need to worry about thread safety. The worst-case scenario is that two threads decode the same buffer concurrently and then both set the bean field of the buffer.
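To make the buffer/bean arrangement concrete, here is a minimal sketch in Java. The names (PersonBuffer, PersonBean) and the decode stub are purely illustrative, not the actual ActiveMQ-protobuf classes; a real buffer would hold protobuf wire bytes and decode() would do real tag/varint parsing:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the lazy buffer/bean idea -- names are
// illustrative, not the actual ActiveMQ-protobuf API.
final class PersonBean {
    final String name;
    PersonBean(String name) { this.name = name; }
}

final class PersonBuffer {
    private final byte[] encoded;  // immutable wire bytes, kept as received
    private PersonBean bean;       // decoded lazily; a race here is benign

    PersonBuffer(byte[] encoded) { this.encoded = encoded; }

    // The only per-access overhead is this null check. Decoding is
    // idempotent, so two racing threads just install equal beans.
    private PersonBean bean() {
        if (bean == null) {
            bean = decode(encoded);
        }
        return bean;
    }

    String getName() { return bean().name; }

    // 'Free encoding': an unmodified buffer hands back its original
    // bytes, so forwarding or persisting costs no re-serialization.
    byte[] toByteArray() { return encoded; }

    // Stand-in for real protobuf varint/tag parsing.
    private static PersonBean decode(byte[] data) {
        return new PersonBean(new String(data, StandardCharsets.UTF_8));
    }
}
```

A routing server would only ever call toByteArray(), paying no decode cost at all; only a consumer that actually reads a field triggers the one-time decode.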
Since the resulting beans are equal, in most cases it would not really matter which thread wins when they overwrite the bean field.

As for up-front validation: in my use case, deferring validation is a feature. The less work the server has to do the better, since it helps it scale vertically. I do agree that in some use cases it would be desirable to fully validate up front. I think it should be up to the application to decide whether it wants up-front validation or deferred decoding. For example, it is likely that the client of the messaging protocol would opt for up-front validation, while the server would use deferred decoding. It's definitely a performance versus consistency trade-off. I think that once you make 'free encoding' and deferred decoding an option, users with high-performance use cases will design their applications so that they can exploit those features as much as possible.

--
Regards,
Hiram

Blog: http://hiramchirino.com
Open Source SOA
http://fusesource.com/

On Sep 18, 6:43 pm, Kenton Varda <ken...@google.com> wrote:
> Hmm, your bean and buffer classes sound conceptually equivalent to my
> builder and message classes.
>
> Regarding lazy parsing, this is certainly something we've considered before,
> but it introduces a lot of problems:
>
> 1) Every getter method must now first check whether the message is parsed,
> and parse it if not. Worse, for proper thread safety it really needs to
> lock a mutex while performing this check. For a fair comparison of parsing
> speed, you really need another benchmark which measures the speed of
> accessing all the fields of the message. I think you'll find that parsing a
> message *and* accessing all its fields is significantly slower with the lazy
> approach. Your approach might be faster in the case of a very deep message
> in which the user only wants to access a few shallow fields, but I think
> this case is relatively uncommon.
>
> 2) What happens if the message is invalid?
> The user will probably expect that calling simple getter methods will not
> throw parse exceptions, and probably isn't in a good position to handle
> these exceptions. You really want to detect parse errors at parse time,
> not later on down the road.
>
> We might add lazy parsing to the official implementation at some point.
> However, the approach we'd probably take is to use it only on fields which
> are explicitly marked with a "[lazy=true]" option. Developers would use
> this to indicate fields for which the performance trade-offs favor lazy
> parsing, and they are willing to deal with delayed error-checking.
>
> In your blog post you also mention that encoding the same message object
> multiple times without modifying it in between, or parsing a message and
> then serializing it without modification, is "free"... but how often does
> this happen in practice? These seem like unlikely cases, and easy for the
> user to optimize on their own without support from the protobuf
> implementation.
>
> On Fri, Sep 18, 2009 at 3:15 PM, hi...@hiramchirino.com
> <chir...@gmail.com> wrote:
>
> > Hi Kenton,
> >
> > You're right, the reason that one benchmark has those results is because
> > the implementation does lazy decoding. While lazy decoding is nice, I
> > think that implementation has a couple of other features which are
> > equally as nice. See more details about them here:
> >
> > http://hiramchirino.com/blog/2009/09/activemq-protobuf-implemtation-f...
> >
> > It would have been hard to impossible to implement some of this stuff
> > without the completely different class structure it uses. I'd be
> > happy if its features could be absorbed into the official
> > implementation. I'm just not sure how you could do that and maintain
> > compatibility with your existing users.
> >
> > If you have any suggestions of how we can integrate better, please
> > advise.
> > Regards,
> > Hiram
> >
> > On Sep 18, 12:34 pm, Kenton Varda <ken...@google.com> wrote:
> > > So, his implementation is a little bit faster in two of the benchmarks,
> > > and impossibly faster in the other one. I don't really believe that it's
> > > possible to improve parsing time by as much as he claims, except by
> > > doing something like lazy parsing, which would just be deferring the
> > > work to later on. Would have been nice if he'd contributed his
> > > optimizations back to the official implementation rather than write a
> > > whole new one...
> > >
> > > On Fri, Sep 18, 2009 at 1:38 AM, ijuma <ism...@juma.me.uk> wrote:
> > >
> > > > Hey all,
> > > >
> > > > I ran across the following and thought it may be of interest to this
> > > > list:
> > > >
> > > > http://hiramchirino.com/blog/2009/09/activemq-protobuf-implementation...
> > > >
> > > > Best,
> > > > Ismael

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en