Re: [Python-ideas] Adding an 'errors' argument to print
On Mon, Mar 27, 2017 at 8:52 PM, Barry wrote:
> I took to using
>
>     chcp 65001
>
> This puts cmd.exe into unicode mode.

conhost.exe hosts the console, and chcp.com is a console app that calls GetConsoleCP, SetConsoleCP and SetConsoleOutputCP to show or modify the console's input and output codepages. It doesn't support changing them separately. cmd.exe is just another console client, no different from python.exe or powershell.exe in this regard.

Also, it's unrelated to how Python uses the console, but for the record, cmd has used the console's wide-character API since it was ported from OS/2 in the early 90s. Back then the console was hosted using threads in the csrss.exe system process, which made sense because the windowing system was hosted there. When they moved most of the window manager to kernel mode in NT 4 (1996), the console was mostly left behind in csrss.exe. It wasn't until Windows 7 that it found a new home in conhost.exe. In Windows 8 it got a real device driver instead of using fake file handles. In Windows 10 it was updated to be less of a franken-window -- e.g. now it has line-wrapped selection and text reflowing.

Using codepage 65001 (UTF-8) in a console app has a couple of annoying bugs in the console itself, and another due to flushing of C FILE streams. For example, reading text that has even a single non-ASCII character will fail because conhost's encoding buffer is too small. It handles the error by returning a read of 0 bytes. That's EOF, so Python's REPL quits; input() raises EOFError; and stdin.read() returns an empty string.

Microsoft should fix this in Windows 10, and probably will eventually. The Linux subsystem needs UTF-8, and it's silly that the console doesn't allow entering non-ASCII text in Linux programs.

As was already recommended, I suggest using the wide-character API via win_unicode_console in 2.7 and 3.5. In 3.6 we get the wide-character API automatically thanks to Steve Dower's io._WindowsConsoleIO class.
___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Adding an 'errors' argument to print
I took to using

    chcp 65001

This puts cmd.exe into Unicode mode.

Of course, Python 3.6 makes this unnecessary, as I understand it.

Barry

> On 24 Mar 2017, at 15:41, Ryan Gonzalez wrote:
>
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the file
> name had a Unicode character in it (it's a Sawano song), and when I tried to
> print it to the console, it crashed during the encode/decode. (The encoding
> used in cmd doesn't support Unicode characters.)
>
> The workaround was to do:
>
>     print(mystring.encode(sys.stdout.encoding,
>           errors='replace').decode(sys.stdout.encoding))
>
> Not fun, especially since this was *just* a debug print.
>
> The proposal: why not add an 'errors' argument to print? That way, I could've
> just done:
>
>     print(mystring, errors='replace')
>
> without having to worry about it crashing.
>
> --
> Ryan (ライアン)
> Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
> http://refi64.com
Re: [Python-ideas] Adding an 'errors' argument to print
On 27 March 2017 at 13:10, Steve Dower wrote:
> On 26Mar2017 0707, Nick Coghlan wrote:
>>
>> Perhaps it would be worth noting in the table of error handlers at
>> https://docs.python.org/3/library/codecs.html#error-handlers that
>> backslashreplace is used by the `ascii()` builtin and the associated
>> format specifiers
>
> backslashreplace is also the default errors for stderr, which is arguably
> the right target for debugging output. Perhaps what we really want is a
> shorter way to send output to stderr? Though I guess it's an easy to invent
> one-liner, once you know about the difference:
>
>     printe = partial(print, file=sys.stderr)

If there was a printerror builtin that used sys.stderr as its default output stream, it could also special case BaseException instances to show their traceback.

At the moment, we do force people to learn a few additional concepts in order to do error display "right":

- processes have two standard output streams, stdout and stderr
- Python makes those available in the sys module
- the print() builtin function lets you specify a stream with "file"
- so errors should be printed with "print(arg, file=sys.stderr)"
- to get exception tracebacks like those at the interactive prompt, look at the traceback module

As opposed to "for normal output, use 'print', for error output, use 'printerror', for temporary debugging output also use 'printerror', otherwise use the logging module".

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
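
The printerror idea described above might look roughly like the following. This is only a sketch: `printerror` is a hypothetical name taken from the message, no such builtin exists, and printing each argument on its own line is an assumption made here to keep the traceback output readable.

```python
import sys
import traceback


def printerror(*args, file=None, **kwargs):
    """Hypothetical helper: print to stderr, expanding exceptions to tracebacks."""
    if file is None:
        # Looked up at call time so that a redirected sys.stderr is honoured.
        file = sys.stderr
    for arg in args:
        if isinstance(arg, BaseException):
            # Same formatting the interactive prompt uses for uncaught errors.
            traceback.print_exception(type(arg), arg, arg.__traceback__, file=file)
        else:
            print(arg, file=file, **kwargs)
```

A call such as `printerror("conversion failed:", exc)` inside an `except` block would then write the message followed by the full traceback to stderr.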
Re: [Python-ideas] Adding an 'errors' argument to print
On 26Mar2017 0707, Nick Coghlan wrote:
> Perhaps it would be worth noting in the table of error handlers at
> https://docs.python.org/3/library/codecs.html#error-handlers that
> backslashreplace is used by the `ascii()` builtin and the associated
> format specifiers

backslashreplace is also the default errors for stderr, which is arguably the right target for debugging output. Perhaps what we really want is a shorter way to send output to stderr? Though I guess it's an easy to invent one-liner, once you know about the difference:

    >>> printe = partial(print, file=sys.stderr)

Also worth noting that Python 3.6 supports Unicode characters on the console by default on Windows. So unless sys.stdout was manually constructed (a possibility, given this was a GUI app, though I designed the change such that `open("CON", "w")` would get it right), there wouldn't have been an encoding issue in the first place.

Cheers,
Steve
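
To illustrate the difference between the two handlers without depending on the real console, the sketch below builds in-memory ASCII streams. `make_ascii_stream` is a hypothetical helper introduced only for this demonstration; `printe` is the one-liner from the message with its import added.

```python
import functools
import io
import sys

# The one-liner from the message: print() bound to stderr.
printe = functools.partial(print, file=sys.stderr)


def make_ascii_stream(errors):
    """Hypothetical helper: an in-memory text stream limited to ASCII."""
    return io.TextIOWrapper(
        io.BytesIO(), encoding="ascii", errors=errors, newline="\n"
    )


def written_bytes(stream):
    """Flush the text layer and return what reached the byte buffer."""
    stream.flush()
    return stream.buffer.getvalue()
```

With `errors="backslashreplace"`, printing `"caf\u00e9"` to such a stream yields the bytes `caf\xe9` (the escape spelled out in ASCII), whereas `errors="strict"` raises UnicodeEncodeError, which is the failure the original post ran into on stdout.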
Re: [Python-ideas] Adding an 'errors' argument to print
Yes, Python is Turing complete; there is always a solution to everything. You can also do decorators with func = wrapper(func) instead of @wrapper, no need for a new syntax.

On 26/03/2017 at 20:42, Chris Angelico wrote:
> On Mon, Mar 27, 2017 at 5:22 AM, Michel Desmoulin wrote:
>>
>> On 26/03/2017 at 10:31, Victor Stinner wrote:
>>> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.
>>
>> What you are saying right now is that the API is not granular enough to
>> just add a parameter. Not that it can't be done. It just means we need to
>> expose stdout.write() encoding behavior.
>>
>>> I dislike the idea of putting encoding options in print. It's too
>>> specific. What if tomorrow you replace print() with file.write()? Do you
>>> want to add errors there too?
>>
>> You would have to rewrite all your calls anyway, because print() calls
>> str() on things and already accepts many parameters while file.write()
>> doesn't.
>
> You can easily make a wrapper around print(), though. For example,
> suppose you want a timestamped log file as well as the console:
>
>     import functools
>     import logging
>     from builtins import print as pront  # mess with people
>
>     @functools.wraps(pront)
>     def print(*a, **kw):
>         if "file" not in kw:
>             # str() each argument so non-string values can be joined
>             logging.info(kw.get("sep", " ").join(map(str, a)))
>         return pront(*a, **kw)
>
> Now what happens if you add the errors handler? Does this function
> need to handle that somehow?
>
> ChrisA
Re: [Python-ideas] Adding an 'errors' argument to print
On Mon, Mar 27, 2017 at 5:22 AM, Michel Desmoulin wrote:
>
> On 26/03/2017 at 10:31, Victor Stinner wrote:
>> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.
>
> What you are saying right now is that the API is not granular enough to
> just add a parameter. Not that it can't be done. It just means we need to
> expose stdout.write() encoding behavior.
>
>> I dislike the idea of putting encoding options in print. It's too
>> specific. What if tomorrow you replace print() with file.write()? Do you
>> want to add errors there too?
>
> You would have to rewrite all your calls anyway, because print() calls
> str() on things and already accepts many parameters while file.write()
> doesn't.

You can easily make a wrapper around print(), though. For example, suppose you want a timestamped log file as well as the console:

    import functools
    import logging
    from builtins import print as pront  # mess with people

    @functools.wraps(pront)
    def print(*a, **kw):
        if "file" not in kw:
            # str() each argument so non-string values can be joined
            logging.info(kw.get("sep", " ").join(map(str, a)))
        return pront(*a, **kw)

Now what happens if you add the errors handler? Does this function need to handle that somehow?

ChrisA
Re: [Python-ideas] Adding an 'errors' argument to print
On 26/03/2017 at 10:31, Victor Stinner wrote:
> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.

What you are saying right now is that the API is not granular enough to just add a parameter. Not that it can't be done. It just means we need to expose stdout.write() encoding behavior.

> I dislike the idea of putting encoding options in print. It's too
> specific. What if tomorrow you replace print() with file.write()? Do you
> want to add errors there too?

You would have to rewrite all your calls anyway, because print() calls str() on things and already accepts many parameters while file.write() doesn't.

> No, it's better to write your own formatter function as shown in a previous
> email.

print(encoding) is short, easy to use, unobtrusive and will be used occasionally. How is using your own formatter function better?
Re: [Python-ideas] Adding an 'errors' argument to print
FWIW, using the ascii function does have the problem that Unicode characters will be escaped, even if the terminal could have handled them perfectly fine.

--
Ryan (ライアン)
Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
http://refi64.com

On Mar 26, 2017 9:07 AM, "Nick Coghlan" wrote:
> On 26 March 2017 at 18:31, Victor Stinner wrote:
> > print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I
> > dislike the idea of putting encoding options in print. It's too specific.
> > What if tomorrow you replace print() with file.write()? Do you want to add
> > errors there too?
> >
> > No, it's better to write your own formatter function as shown in a previous
> > email.
>
> While I agree with that, folks that are thinking in terms of error
> handlers for str.encode may not immediately jump to using the
> `ascii()` builtin or the "%a" or "!a" format specifiers, and if you
> don't use those existing tools, you have the hassle of deciding where
> to put your custom helper function.
>
> Perhaps it would be worth noting in the table of error handlers at
> https://docs.python.org/3/library/codecs.html#error-handlers that
> backslashreplace is used by the `ascii()` builtin and the associated
> format specifiers, as well as noting the format specifiers in the
> documentation of the builtin function?
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-ideas] Adding an 'errors' argument to print
On 26 March 2017 at 18:31, Victor Stinner wrote:
> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I
> dislike the idea of putting encoding options in print. It's too specific.
> What if tomorrow you replace print() with file.write()? Do you want to add
> errors there too?
>
> No, it's better to write your own formatter function as shown in a previous
> email.

While I agree with that, folks that are thinking in terms of error handlers for str.encode may not immediately jump to using the `ascii()` builtin or the "%a" or "!a" format specifiers, and if you don't use those existing tools, you have the hassle of deciding where to put your custom helper function.

Perhaps it would be worth noting in the table of error handlers at https://docs.python.org/3/library/codecs.html#error-handlers that backslashreplace is used by the `ascii()` builtin and the associated format specifiers, as well as noting the format specifiers in the documentation of the builtin function?

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
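
For readers who have not met these tools, the following sketch puts the three spellings mentioned above next to an explicit backslashreplace round trip. The sample string and the variable names are made up for illustration.

```python
# Sample text with two non-ASCII characters (an assumption for the demo).
title = "Sawano \u6fa4\u91ce"

# Three equivalent spellings of the ASCII-only repr.
via_builtin = ascii(title)
via_percent = "%a" % title
via_format = "{!a}".format(title)

# The codec-level equivalent; note that it does not add the repr quotes.
via_codec = title.encode("ascii", errors="backslashreplace").decode("ascii")
```

All three `ascii()`-style results carry the surrounding quotes of a repr, while the explicit codec round trip escapes the same characters but leaves the text unquoted.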
Re: [Python-ideas] Adding an 'errors' argument to print
print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I dislike the idea of putting encoding options in print. It's too specific. What if tomorrow you replace print() with file.write()? Do you want to add errors there too?

No, it's better to write your own formatter function as shown in a previous email.

Victor

On 25 March 2017 at 8:50 PM, "Michel Desmoulin" wrote:
> On 24/03/2017 at 17:37, Victor Stinner wrote:
> > *If* we change something, I would prefer to modify sys.stdout. The
> > following issue proposes to add
> > sys.stdout.set_encoding(errors='replace'):
> > http://bugs.python.org/issue15216
> >
> > You can already set the PYTHONIOENCODING environment variable to
> > ":replace" to use "replace" on sys.stdout (and sys.stderr).
> >
> > Victor
>
> This is not the same. You may want to locally apply "errors=replace" and
> not to the whole program.
>
> Indeed, this can silence encoding problems. So I would probably never set
> it to replace errors at dev time, except for the few places where I know I
> can explicitly silence errors.
>
> I quite like this print(errors="replace|ignore"). This is not going to
> cause any trouble, and can only help.
Re: [Python-ideas] Adding an 'errors' argument to print
On 24/03/2017 at 17:37, Victor Stinner wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216
>
> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).
>
> Victor

This is not the same. You may want to locally apply "errors=replace" and not to the whole program.

Indeed, this can silence encoding problems. So I would probably never set it to replace errors at dev time, except for the few places where I know I can explicitly silence errors.

I quite like this print(errors="replace|ignore"). This is not going to cause any trouble, and can only help.
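
One way to get the "local, not whole program" behaviour asked for above is a context manager that switches a stream's error handler temporarily. This is a sketch under an assumption that postdates the thread: it relies on `io.TextIOWrapper.reconfigure()`, available in Python 3.7 and later. `local_errors` is a hypothetical name.

```python
import contextlib
import io


@contextlib.contextmanager
def local_errors(stream, errors):
    """Temporarily switch a text stream's error handler (Python 3.7+)."""
    previous = stream.errors
    stream.reconfigure(errors=errors)
    try:
        yield stream
    finally:
        # Restore the original handler so later failures stay visible.
        stream.reconfigure(errors=previous)
```

Used as `with local_errors(sys.stdout, "replace"): print(mystring)`, only the prints inside the block are lossy; everything outside keeps the stream's normal error handling, provided sys.stdout is a regular TextIOWrapper.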
Re: [Python-ideas] Adding an 'errors' argument to print
On Fri, Mar 24, 2017 at 10:41:58AM -0500, Ryan Gonzalez wrote:
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)

*Crash* crash, or just an exception?

If it crashed the interpreter, you ought to report that as a bug.

> The workaround was to do:
>
>     print(mystring.encode(sys.stdout.encoding,
>           errors='replace').decode(sys.stdout.encoding))

I think that this would be both simpler and more informative:

    print(ascii(mystring))

--
Steve
Re: [Python-ideas] Adding an 'errors' argument to print
On 24 March 2017 at 16:37, Victor Stinner wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216

I thought I recalled seeing something like that discussed somewhere. I agree that this is a better approach (even though it's not as granular as being able to specify on an individual print statement).

> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).

That's something I didn't know. Thanks for the pointer!

Paul
Re: [Python-ideas] Adding an 'errors' argument to print
On Fri, Mar 24, 2017 at 9:37 AM, Victor Stinner wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216

I like that.

> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).

Great tip, I've needed this!

--
--Guido van Rossum (python.org/~guido)
Re: [Python-ideas] Adding an 'errors' argument to print
*If* we change something, I would prefer to modify sys.stdout. The following issue proposes to add sys.stdout.set_encoding(errors='replace'):
http://bugs.python.org/issue15216

You can already set the PYTHONIOENCODING environment variable to ":replace" to use "replace" on sys.stdout (and sys.stderr).

Victor

2017-03-24 16:41 GMT+01:00 Ryan Gonzalez:
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)
>
> The workaround was to do:
>
>     print(mystring.encode(sys.stdout.encoding,
>           errors='replace').decode(sys.stdout.encoding))
>
> Not fun, especially since this was *just* a debug print.
>
> The proposal: why not add an 'errors' argument to print? That way, I
> could've just done:
>
>     print(mystring, errors='replace')
>
> without having to worry about it crashing.
>
> --
> Ryan (ライアン)
> Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
> http://refi64.com
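
The effect of PYTHONIOENCODING can be observed by running a child interpreter with the variable set. The sketch below is illustrative only: `run_child` is a hypothetical helper, and it passes "ascii:replace" rather than the bare ":replace" from the message so that the substitution shows up regardless of the platform's default encoding.

```python
import os
import subprocess
import sys


def run_child(code, ioencoding):
    """Run `code` in a fresh interpreter with PYTHONIOENCODING set."""
    env = dict(os.environ, PYTHONIOENCODING=ioencoding)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
```

Running `run_child("print('caf\\u00e9')", "ascii:replace")` should exit cleanly with `caf?` on stdout, while the same call with "ascii:strict" should fail with a UnicodeEncodeError traceback on stderr. The variable applies to the whole child process, which is the granularity concern raised elsewhere in the thread.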
Re: [Python-ideas] Adding an 'errors' argument to print
On 24 March 2017 at 15:41, Ryan Gonzalez wrote:
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)
>
> The workaround was to do:
>
>     print(mystring.encode(sys.stdout.encoding,
>           errors='replace').decode(sys.stdout.encoding))
>
> Not fun, especially since this was *just* a debug print.
>
> The proposal: why not add an 'errors' argument to print? That way, I
> could've just done:
>
>     print(mystring, errors='replace')
>
> without having to worry about it crashing.

When I've hit issues like this before, I've written a helper function:

    def sanitise(str, enc):
        """Ensure that str can be encoded in encoding enc"""
        return str.encode(enc, errors='replace').decode(enc)

An errors argument to print would be very similar, but would only apply to the print function, whereas I've used my sanitise function in other situations as well.

I understand the attraction of a dedicated "just print the best representation you can" argument to print, but I'm not sure it's a common enough need to be worth adding like this.

Paul
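
As a quick illustration of how such a helper behaves, here is the same function with the parameter renamed to `text` (a small liberty taken here so the builtin `str` is not shadowed), applied to the legacy console code page cp437. The sample strings are made up.

```python
def sanitise(text, enc):
    """Return text with characters that enc cannot represent replaced by '?'."""
    return text.encode(enc, errors="replace").decode(enc)


# Characters outside cp437 become '?'; representable ones pass through.
kanji_title = sanitise("Sawano \u6fa4\u91ce", "cp437")
accented = sanitise("caf\u00e9", "cp437")
```

The first call substitutes each of the two unrepresentable characters with a question mark, while the accented string survives unchanged because cp437 includes that letter. Passing `sys.stdout.encoding` as `enc` reproduces the workaround from the original post.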
[Python-ideas] Adding an 'errors' argument to print
Recently, I was working on a Windows GUI application that ends up running ffmpeg, and I wanted to see the command that was being run. However, the file name had a Unicode character in it (it's a Sawano song), and when I tried to print it to the console, it crashed during the encode/decode. (The encoding used in cmd doesn't support Unicode characters.)

The workaround was to do:

    print(mystring.encode(sys.stdout.encoding,
          errors='replace').decode(sys.stdout.encoding))

Not fun, especially since this was *just* a debug print.

The proposal: why not add an 'errors' argument to print? That way, I could've just done:

    print(mystring, errors='replace')

without having to worry about it crashing.

--
Ryan (ライアン)
Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
http://refi64.com
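
Until something like this exists in print() itself, the proposed behaviour can be approximated with a small wrapper. The following is only a sketch of one possible semantics, not the proposal's implementation: `print_errors` is a hypothetical name, and falling back to UTF-8 when the target stream has no `encoding` attribute is an assumption made here.

```python
import io
import sys


def print_errors(*args, errors="replace", sep=" ", end="\n",
                 file=None, flush=False):
    """Hypothetical print() variant that substitutes unencodable characters."""
    if file is None:
        file = sys.stdout
    # Let the real print() handle str() conversion, sep and end.
    buffer = io.StringIO()
    print(*args, sep=sep, end=end, file=buffer)
    # Round-trip through the stream's encoding so that problem characters
    # are replaced before the text reaches file.write().
    encoding = getattr(file, "encoding", None) or "utf-8"
    text = buffer.getvalue().encode(encoding, errors=errors).decode(encoding)
    file.write(text)
    if flush:
        file.flush()
```

With this in place, `print_errors(mystring)` plays the role of the proposed `print(mystring, errors='replace')`, at the cost of encoding the text twice.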