Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

Glenn Linderman Sat, 11 Jan 2014 17:19:14 -0800

On 1/11/2014 1:50 PM, Ethan Furman wrote:

Perhaps that's the problem.  According to the docs:
========================================================================
 object.__bytes__(self)
Called by bytes() to compute a byte-string representation of an object. This should return a bytes object.
========================================================================
Obviously, with the plethora of different binary possibilities for representing a number (how many bytes? endianness? which complement?), we would be well within our rights to decide that the "byte-string representation" of the numeric types is the ASCII equivalent of their __repr__ or __str__, and implement __bytes__ appropriately for them. Any other object that wants to be represented easily in a byte stream would also have to implement __bytes__. If necessary we could add __bytes__ to str for /strict/ ASCII conversion (even latin-1 would have to be explicitly encoded)[1].

In spite of Victor's explanation of internals, which I didn't understand, this sounds like a very interesting idea, conceptually, that any object could implement its __bytes__ representation.
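
As a minimal sketch of that idea (not anything proposed in PEP 460): the hook itself already exists, so any class can define __bytes__ and have bytes() call it. AsciiNumber below is a hypothetical wrapper invented for illustration. The built-in numeric types do not behave this way today; bytes(3), for example, yields three zero bytes.

```python
class AsciiNumber:
    """Hypothetical wrapper, for illustration only."""

    def __init__(self, value):
        self.value = value

    def __bytes__(self):
        # bytes() calls this hook. Here the "byte-string representation"
        # is taken to be the ASCII encoding of str(value).
        return str(self.value).encode("ascii")


print(bytes(AsciiNumber(42)))   # b'42'
print(bytes(AsciiNumber(3.5)))  # b'3.5'
```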

On the other hand, it would probably have to be parameterized in the general case: for binary data values, one protocol or format may wish the data to be big-endian, and another may wish the data to be little-endian; for str, one protocol or format may require one encoding and another may require a different encoding, even (as for email) for different parts of the message. So it could be somewhat complex, yet would be very powerful in allowing complex objects, made up of other objects, some of which might have a variety of potential bytes formats (think TIFF files, for example) to convert themselves into a stream of bytes that fits the standard. On the flip side, one would want to convert the stream of bytes into the set of objects, which is a parsing problem.
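
A rough sketch of what such parameterization might look like, using only the standard struct module and str.encode. The Record class, its to_bytes/from_bytes methods, and the wire layout (a 16-bit count, a 16-bit payload length, then the encoded label) are all assumptions made up for this example, not an existing or proposed API.

```python
import struct


class Record:
    """Hypothetical composite object: a 16-bit count plus a text label."""

    def __init__(self, count, label):
        self.count = count
        self.label = label

    def to_bytes(self, byteorder="big", encoding="ascii"):
        # struct prefix: ">" means big-endian, "<" means little-endian.
        prefix = ">" if byteorder == "big" else "<"
        payload = self.label.encode(encoding)
        # Assumed layout: count, payload length, payload.
        return struct.pack(prefix + "HH", self.count, len(payload)) + payload

    @classmethod
    def from_bytes(cls, data, byteorder="big", encoding="ascii"):
        # The reverse direction is the parsing problem.
        prefix = ">" if byteorder == "big" else "<"
        count, length = struct.unpack_from(prefix + "HH", data, 0)
        return cls(count, data[4:4 + length].decode(encoding))

    def __bytes__(self):
        # No parameters can be passed here, so one format is hard-wired.
        return self.to_bytes()


rec = Record(258, "\u00e9")
print(rec.to_bytes("big", "latin-1"))   # b'\x01\x02\x00\x01\xe9'
print(rec.to_bytes("little", "utf-8"))  # b'\x02\x01\x02\x00\xc3\xa9'
```

The same object yields different byte streams depending on the byte order and encoding the target format demands, which is exactly the information a bare bytes(rec) call has no way to supply.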

This is a bit beyond what can be done automatically, just by calling __bytes__ with no parameters, though.

What it may be, though, is a meta-operation from which the needed bytes operations can be determined. It may also not be an easy "compatible with existing Python 2 code with minor tweaks" solution, either. It would be more like a pickle protocol, but pickle defines its own formats, and thus is useless for creating standard formats.


I guess it would belong on python-ideas.

