New submission from Nick Coghlan:

Issue 20404 points out that io.TextIOWrapper can't be used with binary 
transform codecs like bz2 because the types are wrong: TextIOWrapper expects 
decoding to produce str, while these codecs produce bytes.

By contrast, codecs.open() still defaults to working in binary mode, and just 
switches to returning a different type based on the specified encoding (exactly 
the kind of value-driven output type changes we're trying to eliminate from the 
core text model):

>>> import codecs
>>> print(codecs.open('hex.txt').read())
b'aabbccddeeff'
>>> print(codecs.open('hex.txt', encoding='hex').read())
b'\xaa\xbb\xcc\xdd\xee\xff'
>>> print(codecs.open('hex.txt', encoding='utf-8').read())
aabbccddeeff

For 3.4, I plan to just extend the issue 19619 blacklist to also cover 
TextIOWrapper (and hence open()). However, it seems to me that there is a valid 
use case for bytes-to-bytes transform support directly in the IO stack.
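
As a rough illustration of what covering TextIOWrapper with the blacklist means 
in practice: on interpreters that include such a check (assumed here to be 3.4 
and later), constructing the wrapper with a bytes-to-bytes codec should fail up 
front with LookupError instead of at the first read. The variable name 
"rejection" is just for this sketch.

```python
import io

# Try to stack TextIOWrapper on top of a binary transform codec.
try:
    io.TextIOWrapper(io.BytesIO(b"aabbccddeeff"), encoding="hex_codec")
except LookupError as exc:
    # The codec is flagged as "not a text encoding" at construction time.
    rejection = str(exc)
else:
    # Assumption: only reached on interpreters without the blacklist check.
    rejection = None

print(rejection)
```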

A PEP for 3.5 could propose:

- providing a public API that allows codecs to be classified into at least the 
  following groups ("binary" = memoryview compatible data exporters, including 
  both bytes and bytearray):
  - text encodings (decodes binary to str, encodes str to bytes)
  - binary transforms (decodes *and* encodes binary to bytes)
  - text transforms (decodes and encodes str to str)
  - hybrid transforms (acts as both a binary transform *and* as a text 
    transform)
  - hybrid encodings (decodes binary and potentially str to str, encodes binary 
    and str to bytes)
  - arbitrary encodings (decodes and encodes object to object, without fitting 
    any of the above categories)

- adding io.BinaryTransformWrapper that applies binary transforms when reading 
and writing data (similar to the way TextIOWrapper applies text encodings)

- adding a "transform" parameter to open() that inserts BinaryTransformWrapper 
into the stack at the appropriate place (the PEP process would need to decide 
between supporting just a single transform per stream or multiple). In text 
mode, TextIOWrapper would be added to the stack after any binary transforms.
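
A minimal sketch of how the wrapper and the stacking order described above 
might fit together. Everything here is hypothetical: the class body, the helper 
read_text_via_transform() (standing in for open() with a "transform" argument), 
and the whole-buffer approach. A real implementation would presumably use the 
incremental codec interfaces and implement the io.BufferedIOBase API.

```python
import codecs
import io


class BinaryTransformWrapper:
    """Hypothetical sketch: apply a bytes-to-bytes codec around a binary stream."""

    def __init__(self, raw, transform):
        self._raw = raw
        self._transform = transform

    def read(self):
        # Decode everything the underlying stream has left (no streaming).
        return codecs.decode(self._raw.read(), self._transform)

    def write(self, data):
        # Encode first, then hand the transformed bytes to the stream.
        self._raw.write(codecs.encode(data, self._transform))
        return len(data)


def read_text_via_transform(raw, transform, encoding):
    """Hypothetical stand-in for open(..., transform=..., encoding=...)."""
    decoded = BinaryTransformWrapper(raw, transform).read()
    # In text mode, TextIOWrapper sits on top of the transformed bytes.
    return io.TextIOWrapper(io.BytesIO(decoded), encoding=encoding)


storage = io.BytesIO()
BinaryTransformWrapper(storage, "hex_codec").write("hello".encode("utf-8"))
stored = storage.getvalue()   # hex digits, as they would appear on disk
storage.seek(0)
text = read_text_via_transform(storage, "hex_codec", "utf-8").read()
print(stored, text)
```

The binary transform is undone first and the text encoding second, which is the 
order the proposal gives for text mode.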

Optionally, the idea could also be extended to adding io.TextTransformWrapper 
and a "text_transform" parameter, but those seem somewhat less useful.
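
To make the codec groups from the first item more concrete, here is a rough 
sketch that tells three of them apart by probing what a codec's encoder accepts 
and returns. CodecKind and classify_codec() are invented for illustration and 
are not an existing or proposed API. Probing with sample data is only a 
heuristic, and the hybrid and arbitrary groups are lumped together as OTHER.

```python
import codecs
from enum import Enum


class CodecKind(Enum):
    # Hypothetical labels mirroring some of the groups listed above.
    TEXT_ENCODING = "text encoding"
    BINARY_TRANSFORM = "binary transform"
    TEXT_TRANSFORM = "text transform"
    OTHER = "other"


def _encoded_type(sample, name):
    """Return the type the encoder produces for sample, or None if it refuses."""
    try:
        return type(codecs.encode(sample, name))
    except Exception:
        return None


def classify_codec(name):
    from_str = _encoded_type("ab", name)
    from_bytes = _encoded_type(b"ab", name)
    if from_str is bytes and from_bytes is None:
        return CodecKind.TEXT_ENCODING      # str -> bytes only
    if from_bytes is bytes and from_str is None:
        return CodecKind.BINARY_TRANSFORM   # bytes -> bytes only
    if from_str is str and from_bytes is None:
        return CodecKind.TEXT_TRANSFORM     # str -> str only
    return CodecKind.OTHER                  # hybrid or arbitrary codecs


for codec_name in ("utf-8", "hex_codec", "rot_13"):
    print(codec_name, classify_codec(codec_name).value)
```

A public API would more likely rely on metadata declared by each codec than on 
probing, since probing cannot distinguish a refusal from a failure on the 
particular sample.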

----------
components: IO, Interpreter Core, Library (Lib)
messages: 209398
nosy: benjamin.peterson, ezio.melotti, haypo, hynek, lemburg, ncoghlan, pitrou, 
serhiy.storchaka, stutzbach
priority: normal
severity: normal
stage: needs patch
status: open
title: Add io.BinaryTransformWrapper and a "transform" parameter to open()
type: enhancement
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20405>
_______________________________________