Re: [Python-3000] PEP 3137: Immutable Bytes and Mutable Buffer

Nick Coghlan Thu, 27 Sep 2007 08:38:23 -0700

Guido van Rossum wrote:
> [PEP 3137]
>>> **Open Issue:** I'm undecided on whether indexing bytes and buffer
>>> objects should return small ints (like the bytes type in 3.0a1, and
>>> like lists or array.array('B')), or bytes/buffer objects of length 1
>>> (like the str type).  The latter (str-like) approach will ease porting
>>> code from Python 2.x; but it makes it harder to extract values from a
>>> bytes array.
> 
> On 9/26/07, Brett Cannon <[EMAIL PROTECTED]> wrote:
>> How much do you care about making the 2 -> 3 transition easy?  If you
>> don't go the str way then comparisons like ``bytes_[0] == b"A"`` won't
>> work unless you allow comparisons between ints and length 1
>> bytes/buffers.  Extracting a single item is not horrendous if you pass
>> it to int().
>>
>> Personally I say go with the list-like semantics.  Having the
>> following code return false seems odd (but not ridiculous) to me::
>>
>>   stuff = bytes([0, 1])
>>   stuff[1] = 42
>>   stuff[1] == 42
>>
>> So unless int comparisons are allowed I am -0 on the str-like semantics.
> 
> int comparisons would stick out like a sore thumb, especially since
> they can only be reasonably made to work on 1-byte strings.
> 
> I'm still undecided (despite Marcin's eloquent argument for ints as
> bytes) but I'm open for votes for this case.


Accepting an iterable of integers in the constructor strongly suggests 
that a byte sequence contains integers between 0 and 255 inclusive, not 
length 1 byte sequences.

And I think that's the cleanest conceptual model for them as well. A 
byte sequence doesn't contain length 1 byte sequences, it contains bytes 
(i.e. numbers between 0 and 255 inclusive).
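
A minimal sketch of that model, assuming the integer-returning semantics
(which is how released Python 3 behaves); the variable names are purely
illustrative:

```python
# Sketch only: assumes indexing and iteration yield ints, not bytes.
data = bytes([120, 121, 122])  # built from integers in range(256)

first = data[0]      # a plain int, not a length 1 bytes object
items = list(data)   # iteration yields the same integers back

# Round trip: the integers from the sequence rebuild an equal sequence.
rebuilt = bytes(items)
```

Under this model the constructor and iteration are inverses of each
other, which is the symmetry the paragraph above relies on.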

For direct comparison, a slice works fine:

   if data[0:1] == b'x':
       print("Starts with x!")

The only problematic cases are those such as iterating over a byte 
sequence, where we have an integer and want to compare it to a length 
1 byte string. With just the simple conceptual model, we would have to 
write one of:

   if val == b'x'[0]:
   if bytes([val]) == b'x':
   if val == ord(b'x'):
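
As a hedged illustration, the three spellings above (plus the slice form)
can be checked side by side. This assumes integer indexing, and the
names `via_index`, `via_bytes`, `via_ord` and `via_slice` are made up
for the example:

```python
# Sketch only: assumes data[0] is an int under the simple model.
data = b"xyz"
val = data[0]

via_index = val == b"x"[0]         # compare int to int
via_bytes = bytes([val]) == b"x"   # wrap the int back into bytes
via_ord = val == ord(b"x")         # ord() accepts a length 1 bytes object
via_slice = data[0:1] == b"x"      # slicing keeps the bytes type

# Comparing the int directly to bytes is simply unequal.
direct = val == b"x"
```

All four spellings agree, while the direct comparison of an int with a
length 1 byte string is false, which is the trap being discussed.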

I don't think it's worth breaking the conceptual model of the data type 
just to shorten the simplest spelling of that comparison by 3 characters.

However, I do think it may be worth having an additional iterator on 
bytes and buffer objects:

   def fragments(self, size=1): # Could do with a better name
       # Step by size so that fragments do not overlap when size > 1
       for i in range(0, len(self), size):
           yield self[i:i+size]

Then the problematic example could be written:

   for val in data.fragments():
       if val == b'x':
           print("Found an x!")
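
A runnable sketch of the same idea, written as a free function because
no such method exists on bytes; the name `fragments` and the choice to
step by `size` (so chunks do not overlap) are assumptions for
illustration, not an agreed API:

```python
def fragments(data, size=1):
    """Yield consecutive, non-overlapping slices of data.

    Hypothetical helper; the last slice may be shorter than size.
    """
    for i in range(0, len(data), size):
        yield data[i:i + size]


# Each fragment is a bytes object, so it compares directly with b"x".
found = [val for val in fragments(b"axbx") if val == b"x"]

# With size > 1 the data is cut into chunks, with a short tail.
pairs = list(fragments(b"abcde", 2))
```

Because slicing preserves the type, the same function would work
unchanged on a mutable buffer (bytearray in released Python 3).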

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org