On Wed, Dec 3, 2008 at 4:30 AM, Alek Storm <[EMAIL PROTECTED]> wrote:

> (Okay, back on track)
>
> On Tue, Dec 2, 2008 at 11:17 PM, Kenton Varda <[EMAIL PROTECTED]> wrote:
>
>> On Tue, Dec 2, 2008 at 11:08 PM, Alek Storm <[EMAIL PROTECTED]> wrote:
>>
>>> I would think encoding and decoding would be the main bottlenecks, so
>>> can't those be wrappers around C++, while object handling (reflection.py
>>> and friends) stays pure Python?  It seems like the best of both worlds.
>>>
>>
>> Well, the generated serializing and parsing code in C++ is an order of
>> magnitude faster than the dynamic (reflection-based) code.  But to use
>> generated code you need to be using C++ object handling.
>>
>
> Not if you decouple them.  Abstractly, the C++ parser receives a serialized
> message and descriptor and returns a tree of the form [(tag_num, value)]
> where tag_num is an integer and value is either a scalar or a subtree (for
> submessages).  The Python reflection code takes the tree and fills the
> message object with its values.  It's simple, fast, and the C++ parser can
> be easily swapped out for a pure-Python one on systems that don't support
> the C++ version.
>
> Run this backwards when serializing, and you get another advantage: you can
> easily swap out the function that converts the tree into serialized protobuf
> for one that outputs XML, JSON, etc.
>
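
For reference, the decoupling proposed above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the names
read_varint and parse_to_tree and the dict-based "descriptor" are made up
for this example and are not part of the protobuf library, and only the
varint and length-delimited wire types are handled.

```python
# Hypothetical sketch: function names and the descriptor format are
# illustrative assumptions, not the real protobuf API.

def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_to_tree(data, descriptor):
    """Return [(tag_num, value)], where value is a scalar or a subtree.

    descriptor maps tag_num -> None for scalar fields, or to a nested
    descriptor dict for submessage fields (an assumed, simplified format).
    """
    buf = bytearray(data)
    tree = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        tag_num, wire_type = key >> 3, key & 0x7
        if wire_type == 0:      # varint scalar
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:    # length-delimited: bytes or submessage
            length, pos = read_varint(buf, pos)
            chunk = buf[pos:pos + length]
            pos += length
            sub = descriptor.get(tag_num)
            if sub is not None:
                value = parse_to_tree(chunk, sub)
            else:
                value = bytes(chunk)
        else:
            raise ValueError("wire type %d not handled in this sketch"
                             % wire_type)
        tree.append((tag_num, value))
    return tree


# M1 { optional int32 i = 1; } nested in M2 { repeated M1 m1 = 1; }
M1_DESC = {1: None}
M2_DESC = {1: M1_DESC}
tree = parse_to_tree(b"\x0a\x02\x08\x01", M2_DESC)
```

The Python side would then walk the returned tree and fill in the message
object, and the same tree shape could be fed to a different serializer.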

It's not that simple. Parsing and serializing aren't the only hot spots: we
would also like to improve the performance of at least MergeFrom, CopyFrom,
ParseASCII, and IsInitialized.


>
>
>>> You're right.  If it's a waste of time for them, most people won't use
>>> it.  But if there's no point to it, why do normal Python lists have it?
>>> It's useful enough to be included there.  And since repeated fields act just
>>> like lists, it should be included here too.
>>
>>
>> I think Python object lists are probably used in a much wider variety of
>> ways than protocol buffer repeated fields generally are.
>>
>
> Let's include it - it gives us a more complete list interface, there's no
> downside, and the users can decide whether they want to use it.  We can't
> predict all possible use cases.
>

The thing is, once they start to use it, you can't remove it later if it
turns out to be a problem ...


>
>>> In fact, it doesn't even have to be useful for repeated composites.  The
>>> fact that repeated scalars have it means it's automatically included for
>>> repeated composites, because they should have the exact same interface.
>>> Polymorphism is what we want here.
>>
>>
>> But they already can't have the same interface because append() doesn't
>> work.  :)
>>
>
> We don't have confirmation on that yet ;).  Having the same interface is
> what we should be shooting for.
>

Currently each composite field has a reference to its parent. This makes it
impossible to add the same composite to two different repeated composite
fields. The .add() method guarantees that this never happens.

Take a look at this example:

.proto:
message M1 {
  optional int32 i = 1;
}

message M2 {
  repeated M1 m1 = 1;
}

message M3 {
  repeated M1 m1 = 1;
}

usage:
m2 = M2()
m3 = M3()
m1 = M1()
m1.i = 1

m2.m1.append(m1)
m3.m1.append(m1)
print m2.ByteSize() # Correct
print m3.ByteSize() # Correct

m1.i = 11111111 # This should mark m2.ByteSize() and m3.ByteSize() dirty.
# Incorrect: m1 only references its newest parent, m3, so when m1 is
# updated it notifies only m3 and m2 keeps its stale cached size.
print m2.ByteSize()
print m3.ByteSize() # Correct
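
For anyone who wants to reproduce this without generated classes, here is a
minimal self-contained sketch of the same failure. Message, Leaf, Container,
and the digit-count "size" are hypothetical stand-ins made up for
illustration, not the real reflection.py code; the only assumption carried
over is that each composite keeps a single parent reference and caches its
byte size.

```python
class Message(object):
    """Hypothetical stand-in for a message that caches its byte size."""

    def __init__(self):
        self._parent = None        # at most one parent reference
        self._cached_size = None   # None means "dirty, recompute"

    def _mark_dirty(self):
        # Invalidate our cache and propagate up through the single parent.
        self._cached_size = None
        if self._parent is not None:
            self._parent._mark_dirty()


class Leaf(Message):
    """Plays the role of M1: one integer field named i."""

    def __init__(self):
        super(Leaf, self).__init__()
        self._i = 0

    @property
    def i(self):
        return self._i

    @i.setter
    def i(self, value):
        self._i = value
        self._mark_dirty()

    def ByteSize(self):
        if self._cached_size is None:
            # Toy size: number of decimal digits, not the real encoding.
            self._cached_size = len(str(self._i))
        return self._cached_size


class Container(Message):
    """Plays the role of M2/M3: a repeated composite field."""

    def __init__(self):
        super(Container, self).__init__()
        self.children = []

    def append(self, child):
        child._parent = self       # silently overwrites any earlier parent
        self.children.append(child)
        self._mark_dirty()

    def add(self):
        # Creating the child here means it can never be shared.
        child = Leaf()
        self.append(child)
        return child

    def ByteSize(self):
        if self._cached_size is None:
            self._cached_size = sum(c.ByteSize() for c in self.children)
        return self._cached_size


m2, m3, m1 = Container(), Container(), Leaf()
m1.i = 1
m2.append(m1)
m3.append(m1)
before = (m2.ByteSize(), m3.ByteSize())   # both caches are now filled

m1.i = 11111111                           # only m3 is notified
stale_m2 = m2.ByteSize()                  # still the old cached value
fresh_m3 = m3.ByteSize()                  # recomputed
```

In this sketch add() avoids the problem because the container creates and
owns the child, so no second parent can ever be attached.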

I think protobuf's repeated composite fields aren't and shouldn't be
equivalent to python lists.


>
> Thanks,
> Alek Storm
