On 9/26/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> Constructors
> ------------
>
> There are four forms of constructors, applicable to both bytes and
> buffer:
>
> - ``bytes(<bytes>)``, ``bytes(<buffer>)``, ``buffer(<bytes>)``,
>   ``buffer(<buffer>)``: simple copying constructors, with the note
>   that ``bytes(<bytes>)`` might return its (immutable) argument.
>
> - ``bytes(<str>, <encoding>[, <errors>])``, ``buffer(<str>,
>   <encoding>[, <errors>])``: encode a text string.  Note that the
>   ``str.encode()`` method returns an *immutable* bytes object.
>   The <encoding> argument is mandatory; <errors> is optional.
>
> - ``bytes(<memory view>)``, ``buffer(<memory view>)``: construct a
>   bytes or buffer object from anything that supports the PEP 3118
>   buffer API.
>
> - ``bytes(<iterable of ints>)``, ``buffer(<iterable of ints>)``:
>   construct an immutable bytes or mutable buffer object from a
>   stream of integers in range(256).
>
> - ``buffer(<int>)``: construct a zero-initialized buffer of a given
>   length.
>
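To make the list above concrete, here is a rough sketch of each form. It
assumes a released Python 3 interpreter, where the mutable type the draft
calls ``buffer`` is spelled ``bytearray``; the variable names are only
illustrative and are not part of the proposal.

```python
# Sketch only: released Python 3 names are used, so the draft's mutable
# "buffer" type appears here as bytearray (an assumption about the mapping).

# 1. Copying constructors.
b = bytes(b"abc")
m = bytearray(b)           # mutable copy of an immutable bytes object
m[0] = ord("x")            # changing the copy leaves b untouched

# 2. Encoding a text string; the encoding argument is required.
e = bytes("caf\u00e9", "utf-8")
em = bytearray("abc", "ascii")

# 3. From an object that exports the PEP 3118 buffer API.
v = bytes(memoryview(b"xyz"))

# 4. From an iterable of ints in range(256).
i = bytes([104, 105])
im = bytearray(range(3))

# 5. Zero-initialized, from an int length.
z = bytearray(4)
```

Laid out this way, the only form that is specific to the mutable type in
the draft is the last one.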
I think this section could be better organized. I had to read it a few
times to fully understand it. Maybe a table would better emphasize the
differences between the two constructors.

> Indexing
> --------
>
> **Open Issue:** I'm undecided on whether indexing bytes and buffer
> objects should return small ints (like the bytes type in 3.0a1, and
> like lists or array.array('B')), or bytes/buffer objects of length 1
> (like the str type).  The latter (str-like) approach will ease porting
> code from Python 2.x; but it makes it harder to extract values from a
> bytes array.

I think indexing a bytes/buffer object should return an int. I find
this behavior more natural than using an ord()-like function to extract
values. In fact, I have noticed that the use of ord() is a good
indicator that bytes should be used instead of str (see for yourself:
grep -R --include='*.py' 'ord(' python25/Lib).

> Str() and Repr()
> ----------------
>
> The str() and repr() functions return the same thing for these
> objects.  The repr() of a bytes object returns a b'...' style literal.
> The repr() of a buffer returns a string of the form "buffer(b'...')".

Does that mean calling str() on a bytes/buffer object -- e.g.,
str(b"abc") -- wouldn't decode the content of the object (as is the
case for array objects)?

> Bytes and the Str Type
> ----------------------
>
> Like the bytes type in Python 3.0a1, and unlike the relationship
> between str and unicode in Python 2.x, any attempt to mix bytes (or
> buffer) objects and str objects without specifying an encoding will
> raise a TypeError exception.  This is the case even for simply
> comparing a bytes or buffer object to a str object (even violating the
> general rule that comparing objects of different types for equality
> should just return False).
>
> Conversions between bytes or buffer objects and str objects must
> always be explicit, using an encoding.
> There are two equivalent APIs:
> ``str(b, <encoding>[, <errors>])`` is equivalent to
> ``b.decode(<encoding>[, <errors>])``, and
> ``bytes(s, <encoding>[, <errors>])`` is equivalent to
> ``s.encode(<encoding>[, <errors>])``.
>
> There is one exception: we can convert from bytes (or buffer) to str
> without specifying an encoding by writing ``str(b)``.  This produces
> the same result as ``repr(b)``.  This exception is necessary because
> of the general promise that *any* object can be printed, and printing
> is just a special case of conversion to str.  There is however no
> promise that printing a bytes object interprets the individual bytes
> as characters (unlike in Python 2.x).

Ah! That answers my last question. :)

-- Alexandre