New submission from Joel Barry:

The openhook for fileinput is currently not called when the input comes from sys.stdin. As a result, if the input contains invalid UTF-8 sequences, a program whose hook specifies errors='replace' will not behave as expected:

$ cat x.py
import fileinput
import sys

def hook(filename, mode):
    print('hook called')
    return open(filename, mode, errors='replace')

for line in fileinput.input(openhook=hook):
    sys.stdout.write(line)

$ echo -e "foo\x80bar" >in.txt

$ python3 x.py in.txt
hook called
foo�bar

Good. Hook is called, and replacement character is observed.

$ python3 x.py <in.txt
Traceback (most recent call last):
  File "x.py", line 8, in <module>
    for line in fileinput.input(openhook=hook):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 263, in __next__
    line = self.readline()
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 363, in readline
    self._buffer = self._file.readlines(self._bufsize)
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3: invalid start byte

Hook was not called, and so we get the UnicodeDecodeError.

Should fileinput attempt to apply the hook code to stdin?

----------
messages: 263409
nosy: jmb236
priority: normal
severity: normal
status: open
title: fileinput handling of unicode errors from standard input
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26756>
_______________________________________
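
A possible workaround, offered only as a sketch and not as anything proposed in the report: because fileinput reads from sys.stdin directly when no file names are given, a script can rewrap stdin's underlying byte stream with errors='replace' before iterating. The helper name wrap_stdin_with_replace and the main() wrapper are hypothetical, and the sketch assumes sys.stdin exposes a .buffer attribute, which holds for the default interpreter stdin but not for every replacement object.

```python
import fileinput
import io
import sys


def wrap_stdin_with_replace(stdin=None):
    """Return a text stream over stdin's bytes that decodes with errors='replace'.

    Hypothetical helper, not part of fileinput. Assumes the stream has a
    .buffer attribute giving access to the raw bytes.
    """
    stdin = sys.stdin if stdin is None else stdin
    return io.TextIOWrapper(
        stdin.buffer,
        encoding=stdin.encoding,
        errors='replace',
    )


def hook(filename, mode):
    # Same hook as in x.py: used only for named files, not for stdin.
    return open(filename, mode, errors='replace')


def main():
    # Swap sys.stdin first so the stdin path also decodes leniently.
    sys.stdin = wrap_stdin_with_replace()
    for line in fileinput.input(openhook=hook):
        sys.stdout.write(line)
```

Calling main() under the usual `if __name__ == '__main__':` guard would turn this into a script that could be run as `python3 x.py <in.txt`. On Python 3.7 and later, `sys.stdin.reconfigure(errors='replace')` is a shorter alternative, but it is not available on the 3.4 version named in the report.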