Greg Ewing wrote:
> Stephen J. Turnbull wrote:
>> This discussion isn't about whether it could be done or not, it's
>> about where people expect to find such functionality.  Personally, if
>> I can find .encode('euc-jp') on a string object, I would expect to
>> find .encode('gzip') on a bytes object, too.
>
> What I'm not seeing is a clear rationale on where you
> draw the line. Out of all the possible transformations
> between a string and some other kind of data, which
> ones deserve to be available via this rather strange
> and special interface, and why?


Where this kind of unified interface to binary and character transforms is incredibly handy is in a stacking IO model like the one used in Py3k. For example, suppose you're using a compressed XML stream to communicate over a network socket. What this approach allows you to do is have generic 'transformation' layers in your IO stack, so you can just build up your IO stack as something like:

XMLParserIO('myschema')
BufferedTextIO('utf-8')
BytesTransform('gzip')
RawSocketIO

To change to a different compression mechanism (e.g. bz2), you just change the codec used by the BytesTransform layer from 'gzip' to 'bz2'.
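
As a rough illustration of the stacking idea, here is a minimal sketch using only modules that exist in the Python 3 standard library. The layer names above (BytesTransform, BufferedTextIO and so on) are illustrative and are not used here; build_reader() is a hypothetical helper, and an io.BytesIO object stands in for the raw socket.

```python
import bz2
import gzip
import io

# Hypothetical mapping from a compression name to a function that wraps
# a binary stream in a decompressing binary stream.
_DECOMPRESSORS = {
    "gzip": lambda raw: gzip.open(raw, "rb"),
    "bz2": lambda raw: bz2.open(raw, "rb"),
}


def build_reader(raw, compression="gzip", encoding="utf-8"):
    """Stack a decompression layer and a text decoding layer on top of raw."""
    binary_layer = _DECOMPRESSORS[compression](raw)
    return io.TextIOWrapper(binary_layer, encoding=encoding)


document = "<doc>h\u00e9llo</doc>"

# io.BytesIO plays the role of the network socket in this sketch.
gzip_stream = io.BytesIO(gzip.compress(document.encode("utf-8")))
gzip_text = build_reader(gzip_stream, "gzip").read()

# Switching compression only changes one argument; the other layers stay put.
bz2_stream = io.BytesIO(bz2.compress(document.encode("utf-8")))
bz2_text = build_reader(bz2_stream, "bz2").read()
```

An XML parser would sit on top of the text layer returned by build_reader(); it is left out to keep the sketch short.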

As for how you choose what to provide as codecs... well, that's a major reason why the codec registry is extensible. The answer is that any binary or character transform which is useful to the application programmer can be accessed via the codec API - the only question will be whether the application programmer will have to write the codec themselves, or will find it already provided in the standard library.
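
To make the extensibility point concrete, here is a small sketch of registering an application-defined binary transform with the existing codecs module. The codec name "x_reverse" and the helper functions are made up for illustration; only codecs.register(), codecs.CodecInfo, codecs.encode() and codecs.decode() are real standard library calls.

```python
import codecs


def _reverse_encode(data, errors="strict"):
    # A codec function returns (output, number_of_input_items_consumed).
    return bytes(data)[::-1], len(data)


def _reverse_decode(data, errors="strict"):
    # Reversing twice restores the original bytes.
    return bytes(data)[::-1], len(data)


def _search(name):
    # The registry calls this with a normalised codec name; return None
    # for names this search function does not handle.
    if name == "x_reverse":
        return codecs.CodecInfo(
            encode=_reverse_encode,
            decode=_reverse_decode,
            name="x_reverse",
        )
    return None


codecs.register(_search)

scrambled = codecs.encode(b"abc", "x_reverse")
restored = codecs.decode(scrambled, "x_reverse")
```

Once registered, the transform is looked up by name exactly like a codec shipped with the standard library.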

Cheers,
Nick.

P.S. My original tangential response that didn't actually answer your question, but may still be useful to some folks:

An actual codec that encodes a character string to a byte sequence, and decodes a byte sequence back to a character string would be invoked via the str.encode() and bytes.decode() methods. For example, mystr.encode('utf-8') to serialise a string using UTF-8, mybytes.decode('utf-8') to read it back.

A text transform that converts a character string to a different character string would be invoked via the str.transform() and str.untransform() methods. For example, mystr.transform('unicode-escape') to convert unicode characters to their \u or \U equivalents, mystr.untransform('unicode-escape') to convert them back to the actual unicode characters.

A binary transform that converts a byte sequence to a different byte sequence would be invoked via the bytes.transform() and bytes.untransform() methods. For example, mybytes.transform('gzip') to compress a byte sequence, mybytes.untransform('gzip') to decompress it.
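
For comparison, the three cases can be approximated with what the Python 3 standard library provides today. This is a sketch under assumptions: the transform()/untransform() methods described above are not used, and codecs.encode()/codecs.decode() stand in for them. The standard library has no 'gzip' codec, so 'zlib_codec' is used for the binary case, and 'rot_13' is used for the text case because 'unicode_escape' produces bytes rather than a string.

```python
import codecs

text = "caf\u00e9"

# 1. Character string <-> byte sequence: a real codec.
data = text.encode("utf-8")
round_tripped_text = data.decode("utf-8")

# 2. Character string <-> character string: a text transform.
rotated = codecs.encode("Hello", "rot_13")
unrotated = codecs.decode(rotated, "rot_13")

# 3. Byte sequence <-> byte sequence: a binary transform.
compressed = codecs.encode(data, "zlib_codec")
decompressed = codecs.decode(compressed, "zlib_codec")
```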



--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---------------------------------------------------------------
            http://www.boredomandlaziness.org