Re: Slicing support in Python

Kenton Varda Wed, 03 Dec 2008 05:32:48 -0800

On Wed, Dec 3, 2008 at 12:02 AM, Alek Storm <[EMAIL PROTECTED]> wrote:

> On Tue, Dec 2, 2008 at 11:17 PM, Kenton Varda <[EMAIL PROTECTED]> wrote:
>
>> Well, the generated serializing and parsing code in C++ is an order of
>> magnitude faster than the dynamic (reflection-based) code.  But to use
>> generated code you need to be using C++ object handling.
>>
>
> Not if you decouple them.  Abstractly, the C++ parser receives a serialized
> message and descriptor and returns a tree of the form [(tag_num, value)]
> where tag_num is an integer and value is either a scalar or a subtree (for
> submessages).  The Python reflection code takes the tree and fills the
> message object with its values.  It's simple, fast, and the C++ parser can
> be easily swapped out for a pure-Python one on systems that don't support
> the C++ version.
>

Sorry, I think you misunderstood.  The C++ parsers generated by protoc (with
optimize_for = SPEED) are an order of magnitude faster than the dynamic
*C++* parser (used with optimize_for = CODE_SIZE and DynamicMessage).  The
Python parser is considerably slower than either of them, but that's beside
the point.  Your "decoupled" parser which produces a tag/value tree will be
at least as slow as the existing C++ dynamic parser, probably slower (since
it sounds like it would use some sort of dictionary structure rather than
flat classes/structs).
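
For illustration only, the tag/value tree under discussion can be sketched in
pure Python as follows.  This is a minimal sketch, not either poster's actual
design and not the protobuf library's API: read_varint, parse_tree, and the
schema dict (a stand-in for a descriptor that only records which field numbers
hold submessages) are hypothetical names, and only wire types 0, 1, 2, and 5
are handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_tree(buf, schema):
    """Return a list of (tag_num, value) pairs for the serialized message.

    schema is a hypothetical descriptor stand-in: it maps the field numbers
    of submessages to their own schema dict.  Any other length-delimited
    field is returned as raw bytes.
    """
    tree = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        tag_num, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # 64-bit fixed width, kept as an unsigned int
            value = int.from_bytes(buf[pos:pos + 8], "little")
            pos += 8
        elif wire_type == 5:  # 32-bit fixed width, kept as an unsigned int
            value = int.from_bytes(buf[pos:pos + 4], "little")
            pos += 4
        elif wire_type == 2:  # length-delimited: bytes or a submessage
            length, pos = read_varint(buf, pos)
            payload = buf[pos:pos + length]
            pos += length
            if tag_num in schema:
                value = parse_tree(payload, schema[tag_num])
            else:
                value = payload
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        tree.append((tag_num, value))
    return tree


# Field 1 = 150 (varint), field 2 = "testing", field 3 = submessage {1: 150}.
data = b"\x08\x96\x01" + b"\x12\x07testing" + b"\x1a\x03\x08\x96\x01"
tree = parse_tree(data, {3: {}})
```

Note that every field becomes a freshly allocated tuple inside a list, which
is the kind of generic structure the paragraph above contrasts with flat
classes/structs.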

> Run this backwards when serializing, and you get another advantage: you can
> easily swap out the function that converts the tree into serialized protobuf
> for one that outputs XML, JSON, etc.
>

You can already easily write encoders and decoders for alternative formats
using reflection.

>
>
>
>>> You're right.  If it's a waste of time for them, most people won't use
>>> it.  But if there's no point to it, why do normal Python lists have it?
>>> It's useful enough to be included there.  And since repeated fields act just
>>> like lists, it should be included here too.
>>
>>
>> I think Python object lists are probably used in a much wider variety of
>> ways than protocol buffer repeated fields generally are.
>>
>
> Let's include it - it gives us a more complete list interface, there's no
> downside, and the users can decide whether they want to use it.  We can't
> predict all possible use cases.
>

Ah, yes, the old "Why not?" argument.  :)  Actually, I far prefer the
opposite argument:  If you aren't sure if someone will want a feature, don't
include it.  There is always a downside to including a feature.  Even if
people choose not to use it, it increases code size, maintenance burden,
memory usage, and interface complexity.  Worse yet, if people do use it,
then we're permanently stuck with it, whether we like it or not.  We can't
change it later, even if we decide it's wrong.  For example, we may decide
later -- based on an actual use case, perhaps -- that it would really have
been better if remove() compared elements by content rather than by
identity, so that you could remove a message from a repeated field by
constructing an identical message and then calling remove().  But we
wouldn't be able to change it.  We'd have to instead add a different method
like removeByValue(), which would be ugly and add even more complexity.
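
To make the remove() example concrete, here is a minimal sketch using plain
Python lists rather than real repeated fields.  Plain and ByContent are
hypothetical stand-ins for message classes, and try_remove is a helper invented
for this illustration.  list.remove() uses ==, so a class without __eq__ is
effectively compared by identity, while a class that defines __eq__ is compared
by content.

```python
class Plain:
    """Hypothetical message stand-in with no __eq__, so == means identity."""

    def __init__(self, value):
        self.value = value


class ByContent(Plain):
    """Same stand-in, but two instances are equal when their contents match."""

    def __eq__(self, other):
        return isinstance(other, ByContent) and self.value == other.value


def try_remove(items, probe):
    """Call items.remove(probe); report whether an element was removed."""
    try:
        items.remove(probe)
        return True
    except ValueError:
        return False


# Identity semantics: an identical-looking copy does not match anything.
by_identity = [Plain(1), Plain(2)]
removed_copy = try_remove(by_identity, Plain(1))
removed_same = try_remove(by_identity, by_identity[0])

# Content semantics: a freshly constructed equal message is enough.
by_content = [ByContent(1), ByContent(2)]
removed_equal = try_remove(by_content, ByContent(1))
```

Code written against one behavior would break if the other were substituted
later, which is why the choice cannot be changed once people depend on it.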

Protocol Buffers got where they are by stubbornly refusing the vast majority
of feature suggestions.  :)

That said, you do have a good point that the interface should be similar to
standard Python lists if possible.  But given the other problems that
prevent this, it seems like a moot point.

