On Sat, Dec 6, 2008 at 12:42 AM, Kenton Varda <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 5, 2008 at 10:59 PM, Alek Storm <[EMAIL PROTECTED]> wrote:
>
>> On Wed, Dec 3, 2008 at 5:32 AM, Kenton Varda <[EMAIL PROTECTED]> wrote:
>>
>>> Sorry, I think you misunderstood. The C++ parsers generated by protoc
>>> (with optimize_for = SPEED) are an order of magnitude faster than the
>>> dynamic *C++* parser (used with optimize_for = CODE_SIZE and
>>> DynamicMessage). The Python parser is considerably slower than either of
>>> them, but that's beside the point. Your "decoupled" parser which produces a
>>> tag/value tree will be at least as slow as the existing C++ dynamic parser,
>>> probably slower (since it sounds like it would use some sort of dictionary
>>> structure rather than flat classes/structs).
>>
>> Oh, I forgot we have two C++ parsers. The method I described uses the
>> generated (SPEED) parser, so it should be a great deal quicker. It just
>> outputs a tree instead of a message, leaving the smart object creation to
>> Python.
>
> No, the static (SPEED) parser parses to generated C++ objects. It doesn't
> make sense to say that we'll use the static parser to parse to this abstract
> "tree" structure, because the whole point of the static parser is that it
> parses to concrete objects. If it didn't, it wouldn't be so fast. (In
> fact, the biggest bottleneck in protobuf parsing is memory bandwidth, and I
> can't see how your "tree" structure would be anywhere near as compact as a
> generated message class.)

Gah, you're right. I was thinking of it the wrong way. I still kinda like it, but since apparently the abstraction required would negate the speed increase, I guess it's time to drop it.

>> Honestly, I think using reflection for something as basic as changing the
>> output format is hackish and could get ugly.
>
> I think you're thinking of a different kind of reflection. I'm talking
> about the google::protobuf::Reflection interface. The whole point of this
> interface is to allow you to do things like write custom codecs for things
> like JSON or XML. Take a look at text_format.cc for an example usage.

Ah. I wasn't as familiar with the C++ version as I thought. Still, I thought it would be cool to have PB/XML/JSON/etc outputters operate at the same level.

> If a message Foo has a repeated field of type Bar, then the Bar objects in
> that field are owned by Foo. When you delete Foo, all the Bars are
> deleted. Leaving it up to the user to delete the Bar objects themselves is
> way too much of a burden.

But it does give us a lot of cool functionality, like adding the same message to two parents, and (yes!) slicing support. I thought this was common practice in C++, but it's been quite a while since I've coded it.

>> Is there anything wrong with having a list of parents? I'm guessing I'm
>> being naive - would speed be affected too much by that?
>
> Way too complicated, probably a lot of overhead, and not very useful in
> practice.

Is it really that useful to have ByteSize() cached for repeated fields? If it's not, we get everything I mentioned above for free. I'm genuinely not sure - it only comes up when serializing the message in wire_format.py. What do you think?

Cheers,
Alek Storm
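
To make the ByteSize() question concrete, here is a minimal Python sketch of why a cached size ties each message to a single parent. It is only an illustration under my own assumptions: the class `CachedSizeNode` and its methods `add_child`, `set_payload`, and `byte_size` are hypothetical names and are not the actual protobuf Python implementation. The "size" is just the payload length plus the children's sizes, not real wire-format encoding.

```python
class CachedSizeNode:
    """Hypothetical stand-in for a message with a cached byte size."""

    def __init__(self, payload=b""):
        self._payload = payload
        self._children = []
        self._parent = None        # single owner, so invalidation has one path up
        self._cached_size = None   # None means "dirty, must recompute"
        self.recomputations = 0    # counts how often the size was recomputed

    def add_child(self, child):
        # With one parent pointer, a child cannot belong to two parents.
        if child._parent is not None:
            raise ValueError("child already has a parent")
        child._parent = self
        self._children.append(child)
        self._mark_dirty()

    def set_payload(self, payload):
        self._payload = payload
        self._mark_dirty()

    def _mark_dirty(self):
        # Walk up the single parent chain and drop every cached size.
        node = self
        while node is not None:
            node._cached_size = None
            node = node._parent

    def byte_size(self):
        if self._cached_size is None:
            self.recomputations += 1
            self._cached_size = len(self._payload) + sum(
                child.byte_size() for child in self._children
            )
        return self._cached_size


root = CachedSizeNode(b"ab")
leaf = CachedSizeNode(b"cde")
root.add_child(leaf)
print(root.byte_size())      # 5: computed once
print(root.byte_size())      # 5: served from the cache
leaf.set_payload(b"c")       # invalidates leaf and root
print(root.byte_size())      # 3: recomputed after the change
print(root.recomputations)   # 2
```

In this sketch, supporting a list of parents would mean `_mark_dirty` has to fan out over every parent chain, which is the overhead being weighed above. Dropping the cache removes the need for the parent pointer, at the cost of recomputing sizes on every serialization.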
