Re: Slicing support in Python

Alek Storm Sat, 06 Dec 2008 01:03:48 -0800

On Sat, Dec 6, 2008 at 12:42 AM, Kenton Varda <[EMAIL PROTECTED]> wrote:


> On Fri, Dec 5, 2008 at 10:59 PM, Alek Storm <[EMAIL PROTECTED]> wrote:
>
>> On Wed, Dec 3, 2008 at 5:32 AM, Kenton Varda <[EMAIL PROTECTED]> wrote:
>>
>>>  Sorry, I think you misunderstood.  The C++ parsers generated by protoc
>>> (with optimize_for = SPEED) are an order of magnitude faster than the
>>> dynamic *C++* parser (used with optimize_for = CODE_SIZE and
>>> DynamicMessage).  The Python parser is considerably slower than either of
>>> them, but that's beside the point.  Your "decoupled" parser which produces a
>>> tag/value tree will be at least as slow as the existing C++ dynamic parser,
>>> probably slower (since it sounds like it would use some sort of dictionary
>>> structure rather than flat classes/structs).
>>>
>>
>> Oh, I forgot we have two C++ parsers.  The method I described uses the
>> generated (SPEED) parser, so it should be a great deal quicker.  It just
>> outputs a tree instead of a message, leaving the smart object creation to
>> Python.
>>
>
> No, the static (SPEED) parser parses to generated C++ objects.  It doesn't
> make sense to say that we'll use the static parser to parse to this abstract
> "tree" structure, because the whole point of the static parser is that it
> parses to concrete objects.  If it didn't, it wouldn't be so fast.  (In
> fact, the biggest bottleneck in protobuf parsing is memory bandwidth, and I
> can't see how your "tree" structure would be anywhere near as compact as a
> generated message class.)
>

Gah, you're right.  I was thinking of it the wrong way.  I still kinda like
it, but since apparently the abstraction required would negate the speed
increase, I guess it's time to drop it.


>> Honestly, I think using reflection for something as basic as changing the
>> output format is hackish and could get ugly.
>>
>
> I think you're thinking of a different kind of reflection.  I'm talking
> about the google::protobuf::Reflection interface.  The whole point of this
> interface is to allow you to do things like write custom codecs for things
> like JSON or XML.  Take a look at text_format.cc for an example usage.
>

Ah.  I wasn't as familiar with the C++ version as I thought.  Still, I
thought it would be cool to have PB/XML/JSON/etc outputters operate at the
same level.

> If a message Foo has a repeated field of type Bar, then the Bar objects in
> that field are owned by Foo.  When you delete Foo, all the Bars are
> deleted.  Leaving it up to the user to delete the Bar objects themselves is
> way too much of a burden.
>

But it does give us a lot of cool functionality, like adding the same
message to two parents, and (yes!) slicing support.  I thought this was
common practice in C++, but it's been quite a while since I've coded it.
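
To make the "same message in two parents" idea concrete, here is a minimal
Python sketch of a repeated field that holds references without owning them.
Item and SharedRepeated are hypothetical names made up for illustration; they
are not part of the protobuf Python API, and in C++ the lifetime question
Kenton raises would still have to be answered by whoever holds the objects.

```python
class Item:
    """Hypothetical element type; stands in for a message such as Bar."""

    def __init__(self, value):
        self.value = value


class SharedRepeated:
    """Hypothetical repeated field that references elements without owning them."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice is a new container over the same element objects.
            return SharedRepeated(self._items[index])
        return self._items[index]

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            # Slice assignment stores references; nothing is copied or re-parented.
            self._items[index] = list(value)
        else:
            self._items[index] = value


foo = SharedRepeated([Item(1), Item(2), Item(3)])
other = SharedRepeated()
other[0:0] = foo[1:]      # the last two items now sit in both containers
other[0].value = 20       # the change is visible through foo as well
```

Because no container owns its elements, slicing needs no copying and no
re-parenting, which is the functionality being argued for here.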


>> Is there anything wrong with having a list of parents?  I'm guessing I'm
>> being naive - would speed be affected too much by that?
>
>
> Way too complicated, probably a lot of overhead, and not very useful in
> practice.
>

Is it really that useful to have ByteSize() cached for repeated fields?  If
it's not, we get everything I mentioned above for free.  I'm genuinely not
sure - it only comes up when serializing the message in wire_format.py.
What do you think?
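
For reference, a rough sketch of the trade-off being discussed: a cached size
has to be invalidated in every ancestor whenever a child changes, so a child
with several parents must notify all of them. The Node class and its methods
are hypothetical and only illustrate the idea; this is not how the protobuf
Python library implements ByteSize().

```python
class Node:
    """Hypothetical stand-in for a message; not the protobuf API."""

    def __init__(self, payload=b""):
        self.payload = payload
        self.children = []
        self.parents = []          # the "list of parents" floated above
        self._cached_size = None   # None means the cache is stale

    def add(self, child):
        child.parents.append(self)
        self.children.append(child)
        self._invalidate()

    def set_payload(self, payload):
        self.payload = payload
        self._invalidate()

    def _invalidate(self):
        # Clear this node's cache, then walk up through every parent so that
        # no ancestor keeps a stale size.
        self._cached_size = None
        for parent in self.parents:
            parent._invalidate()

    def byte_size(self):
        # Recompute only when the cache was invalidated.
        if self._cached_size is None:
            self._cached_size = len(self.payload) + sum(
                child.byte_size() for child in self.children
            )
        return self._cached_size


shared = Node(b"abc")
a = Node(b"x")
b = Node(b"yy")
a.add(shared)
b.add(shared)
sizes_before = (a.byte_size(), b.byte_size())
shared.set_payload(b"abcdef")   # invalidates the caches of both parents
sizes_after = (a.byte_size(), b.byte_size())
```

Dropping the cache removes the need for parent pointers entirely, at the cost
of recomputing the size on every call.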

Cheers,
Alek Storm

