Nick Coghlan added the comment:
To bring back Victor's comments from the list:
- stdout/stderr are fairly easy to handle, since the underlying buffers can be
flushed before switching the encoding and error settings. Yes, there's a risk
of creating mojibake, but that's unavoidable and, for this use case, trumped by
the pragmatic need to support overriding the output encoding in a robust
fashion (i.e. not breaking sys.__stdout__ or sys.__stderr__, and not crashing
if something else displays output during startup, for example, when running
under "python -v").
- stdin is more challenging, since it isn't entirely clear yet how to handle
the case where data is already buffered internally. Victor proposes that it's
acceptable to simply disallow changing the encoding of a stream that isn't
seekable. My feeling is that such a restriction would largely miss the point,
since the original use case that prompted the creation of this was shell
pipeline processing, where stdin will often be a pipe.
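
To make the output case concrete, here is a minimal sketch of "flush, then switch". It assumes an interpreter where TextIOWrapper.reconfigure() is available (Python 3.7 and later), and it uses an in-memory BytesIO as a stand-in for the real stdout buffer. The mixed-encoding result is exactly the mojibake risk described above.

```python
import io

# Stand-in for the binary buffer underneath sys.stdout (an assumption
# made so the sketch is self-contained).
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="utf-8")

out.write("\u00e9")   # encoded as UTF-8 once flushed
out.flush()           # push pending text out under the old encoding

# Switch the encoding and error handler in place; the wrapper object
# itself is kept, so other references to it stay valid.
out.reconfigure(encoding="latin-1", errors="replace")

out.write("\u00e9")   # encoded as Latin-1 from here on
out.flush()

data = raw.getvalue()  # the same character, encoded two different ways
```

Because the wrapper is modified rather than replaced, anything holding a reference to the original stream object keeps working.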
I think the guiding use case here really needs to be this one: "How do I
implement the equivalent of 'iconv' as a Python 3 script, without breaking
internal interpreter state invariants?"
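
For comparison, a rough sketch of the iconv-style conversion done entirely at the bytes layer. This sidesteps the text-stream problem rather than solving it, and the name transcode and its parameters are made up for illustration only.

```python
import codecs
import io


def transcode(src, dst, from_enc, to_enc, chunk_size=4096):
    """Copy bytes from src to dst, converting from_enc to to_enc.

    src and dst are binary streams (hypothetical helper, not a stdlib API).
    """
    decoder = codecs.getincrementaldecoder(from_enc)()
    encoder = codecs.getincrementalencoder(to_enc)()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # The incremental decoder keeps any partial multi-byte sequence
        # until the next chunk arrives.
        dst.write(encoder.encode(decoder.decode(chunk)))
    # final=True raises if the input ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    dst.write(encoder.encode(tail, final=True))


# Demonstration on in-memory streams with a deliberately tiny chunk size.
src = io.BytesIO("h\u00e9llo w\u00f6rld\n".encode("latin-1"))
dst = io.BytesIO()
transcode(src, dst, "latin-1", "utf-8", chunk_size=3)
result = dst.getvalue()
```

In a script this would be called as transcode(sys.stdin.buffer, sys.stdout.buffer, ...), which never touches the text wrappers and so leaves the question of changing their encoding in place open.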
My current thought is that, instead of seeking, the input case can better be
handled by manipulating the read ahead buffer directly. Something like (for the
pure Python version):
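    old_encoding = self._encoding
    self._encoding = new_encoding
    if self._decoder is not None:
        # Recover the bytes behind the characters decoded but not yet read
        old_data = self._get_decoded_chars().encode(old_encoding)
        # Add any bytes still buffered inside the old decoder
        old_data += self._decoder.getstate()[0]
        # Build a decoder for the new encoding and re-decode the pending bytes
        decoder = self._get_decoder()
        new_chars = ''
        if old_data:
            new_chars = decoder.decode(old_data)
        self._set_decoded_chars(new_chars)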
(A similar mechanism could actually be used to support an "initial_data"
parameter to TextIOWrapper, which would help in general encoding detection
situations where changing encoding *in-place* isn't needed, but the application
would like an easy way to "put back" the initial data for inclusion in the text
stream without making assumptions about the underlying buffer implementation)
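
The same idea can be tried out without touching TextIOWrapper internals. The sketch below uses only the public incremental decoders in codecs; redecode_pending is a hypothetical helper name, and the split of the input into "already decoded" and "not yet seen" parts is an assumption made for the demonstration.

```python
import codecs


def redecode_pending(decoded_unread, undecoded_bytes,
                     old_encoding, new_encoding):
    """Re-decode read-ahead data under a new encoding (illustrative only).

    decoded_unread: characters decoded with old_encoding but not yet consumed
    undecoded_bytes: bytes still held inside the old incremental decoder
    Returns the re-decoded text and the new decoder, which may itself be
    holding an incomplete multi-byte sequence.
    """
    old_data = decoded_unread.encode(old_encoding) + undecoded_bytes
    decoder = codecs.getincrementaldecoder(new_encoding)()
    return decoder.decode(old_data), decoder


raw = "\u00e9\u20ac".encode("utf-8")   # 5 bytes: 2 for e-acute, 3 for euro

# The first 3 bytes were read ahead and decoded with the wrong encoding.
old = codecs.getincrementaldecoder("latin-1")()
stale = old.decode(raw[:3])
pending = old.getstate()[0]            # always empty for latin-1

chars, new = redecode_pending(stale, pending, "latin-1", "utf-8")
# The new decoder is still holding the first byte of the euro sign;
# feeding it the remaining bytes completes the character.
chars += new.decode(raw[3:], final=True)
```

Note that the round trip through encode(old_encoding) only recovers the original bytes when the old decoding was lossless. That holds for latin-1, but not if the old decoder had already replaced or dropped invalid bytes.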
Also, StringIO should implement this new API as a no-op.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue15216>
_______________________________________