On Thu, Jan 5, 2017 at 5:54 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
> On 05.01.17 22:37, Alexander Belopolsky wrote:
>
>> I propose the following:
>>
>> 1. For 3.6, restore and document 3.5 behavior. Recommend that 3rd party
>> types that are both integer-like and buffer-like implement their own
>> __bytes__ method to resolve the bytes(x) ambiguity.
>
> The __bytes__ method is used only by the bytes constructor, not by the
> bytearray constructor. I am not sure this is deliberate. See
> <https://bugs.python.org/issue2415#msg71660>.
>
>> 2. For 3.7, I would like to see a drastically simplified bytes(x):
>> 2.1. Accept only objects with a __bytes__ method or a sequence of ints
>> in range(256).
>> 2.2. Expand __bytes__ definition to accept optional encoding and errors
>> parameters. Implement str.__bytes__(self, [encoding[, errors]]).
>
> I think it is better to use the encode() method if you want to encode from
> non-strings.

Possibly, but the goal of my proposal is to lighten the logic in the
bytes(x, [encoding[, errors]]) constructor. If it detects x.__bytes__, it
should just call it with whatever arguments are given.

>> 2.3. Implement new specialized bytes.fromsize and bytes.frombuffer
>> constructors as per PEP 467 and Inada Naoki proposals.
>
> bytes.fromsize(n) is just b'\0'*n. I don't think this method is needed.

I don't care much about this. If it helps remove the bytes(int) case, I am
for it, otherwise ±0.

> bytes.frombuffer(x) is bytes(memoryview(x)) or memoryview(x).tobytes().

I've just tried Inada's patch <http://bugs.python.org/issue29178>:

    $ ./python.exe -m timeit -s "from array import array; x=array('f', [0])" "bytes.frombuffer(x)"
    2000000 loops, best of 5: 134 nsec per loop

    $ ./python.exe -m timeit -s "from array import array; x=array('f', [0])" "with memoryview(x) as m: bytes(m)"
    500000 loops, best of 5: 436 nsec per loop

A 3x speed-up seems to be worth it.

>> 2.4. Implement memoryview.__bytes__ method so that bytes(memoryview(x))
>> works as before.
>> 2.5. Implement a fast bytearray.__bytes__ method.
>
> This wouldn't help for the bytearray constructor. And wouldn't allow to
> avoid double copying in the constructor of bytes subclass.

I don't see why the bytearray constructor should behave differently from
bytes.

>> 3. Consider promoting __bytes__ to a tp_bytes type slot.
>
> The buffer protocol is more general than the __bytes__ method. It allows
> to avoid redundant memory copying in constructors of many types (bytes,
> bytearray, array.array, etc), not just bytes.

It looks like there are two different views on what the bytes type
represents. Is it a sequence of small integers or a blob of binary data?

Compare these two calls:

    >>> from array import array
    >>> bytes(array('h', [1, 2, 3]))
    b'\x01\x00\x02\x00\x03\x00'

and

    >>> bytes(array('f', [1, 2, 3]))
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'

For me the __bytes__ method is a way for types to specify their bytes
representation, which may or may not be the same as
memoryview(x).tobytes().
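
To make that last point concrete, here is a minimal sketch of a buffer-like
type whose __bytes__ result differs from its raw memory. The Samples class
and its comma-separated text format are hypothetical, made up purely for
illustration; only array, bytes, bytearray and memoryview are real. It also
shows that the bytearray constructor does not consult __bytes__, and that
bytes(n) matches b'\0'*n.

```python
from array import array


class Samples(array):
    """Hypothetical buffer-like type with its own bytes representation."""

    def __bytes__(self):
        # Illustrative choice only: a text form rather than the raw memory.
        return ",".join(str(v) for v in self).encode("ascii")


raw = array("h", [1, 2, 3])
custom = Samples("h", [1, 2, 3])

# Without __bytes__, bytes(x) copies the exported buffer.
raw_bytes = bytes(raw)
same_as_memoryview = raw_bytes == memoryview(raw).tobytes()

# With __bytes__, bytes(x) calls the method, while bytearray(x) still
# goes through the buffer protocol and sees the raw memory.
custom_bytes = bytes(custom)
custom_bytearray = bytearray(custom)

# bytes(n) and b"\0" * n produce equal zero-filled objects.
zeros_equal = bytes(4) == b"\0" * 4

print(same_as_memoryview, custom_bytes, custom_bytearray == raw_bytes, zeros_equal)
```

The raw byte values depend on the platform's byte order and item size, so
the sketch compares representations against each other instead of against
hard-coded literals.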