Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-27 Thread eryk sun
On Mon, Mar 27, 2017 at 8:52 PM, Barry  wrote:
> I took to using
>
>  chcp 65001
>
> This puts cmd.exe into unicode mode.

conhost.exe hosts the console, and chcp.com is a console app that
calls GetConsoleCP, SetConsoleCP and SetConsoleOutputCP to show or
modify the console's input and output codepages. It doesn't support
changing them separately.

cmd.exe is just another console client, no different from python.exe
or powershell.exe in this regard. Also, it's unrelated to how Python
uses the console, but for the record, cmd has used the console's
wide-character API since it was ported from OS/2 in the early 90s.

Back then the console was hosted using threads in the csrss.exe system
process, which made sense because the windowing system was hosted
there. When they moved most of the window manager to kernel mode in NT
4 (1996), the console was mostly left behind in csrss.exe. It wasn't
until Windows 7 that it found a new home in conhost.exe. In Windows 8
it got a real device driver instead of using fake file handles. In
Windows 10 it was updated to be less of a franken-window -- e.g. now
it has line-wrapped selection and text reflowing.

Using codepage 65001 (UTF-8) in a console app has a couple of annoying
bugs in the console itself, and another due to flushing of C FILE
streams. For example, reading text that has even a single non-ASCII
character will fail because conhost's encoding buffer is too small. It
handles the error by returning a read of 0 bytes. That's EOF, so
Python's REPL quits; input() raises EOFError; and stdin.read() returns
an empty string. Microsoft should fix this in Windows 10, and probably
will eventually. The Linux subsystem needs UTF-8, and it's silly that
the console doesn't allow entering non-ASCII text in Linux programs.

As was already recommended, I suggest using the wide-character API via
win_unicode_console in 2.7 and 3.5. In 3.6 we get the wide-character
API automatically thanks to Steve Dower's io._WindowsConsoleIO class.
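
A rough way to check which path a given stream is on, assuming CPython
3.6 or later. io._WindowsConsoleIO is a private class that only exists
on Windows builds, so treat this as a diagnostic sketch rather than a
supported API; uses_windows_console is a made-up helper name.

```python
import io
import sys


def uses_windows_console(stream):
    """Return True if stream sits directly on io._WindowsConsoleIO."""
    # The class is private and absent on non-Windows builds.
    console_cls = getattr(io, "_WindowsConsoleIO", None)
    if console_cls is None:
        return False
    # TextIOWrapper -> .buffer (buffered layer) -> .raw (raw layer).
    raw = getattr(getattr(stream, "buffer", None), "raw", None)
    return isinstance(raw, console_cls)


if __name__ == "__main__":
    print("stdout on the wide-character console API:",
          uses_windows_console(sys.stdout))
```

When stdout is redirected to a file or pipe this reports False, since
the wide-character path only applies to a real console.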
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-27 Thread Barry
I took to using

 chcp 65001

This puts cmd.exe into unicode mode.

Of course, Python 3.6 makes this unnecessary, as I understand it.

Barry


> On 24 Mar 2017, at 15:41, Ryan Gonzalez  wrote:
> 
> Recently, I was working on a Windows GUI application that ends up running 
> ffmpeg, and I wanted to see the command that was being run. However, the file 
> name had a Unicode character in it (it's a Sawano song), and when I tried to 
> print it to the console, it crashed during the encode/decode. (The encoding 
> used in cmd doesn't support Unicode characters.)
> 
> The workaround was to do:
> 
> 
> print(mystring.encode(sys.stdout.encoding,
>                       errors='replace').decode(sys.stdout.encoding))
> 
> 
> Not fun, especially since this was *just* a debug print.
> 
> The proposal: why not add an 'errors' argument to print? That way, I could've 
> just done:
> 
> 
> print(mystring, errors='replace')
> 
> 
> without having to worry about it crashing.
> 
> --
> Ryan (ライアン)
> Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
> http://refi64.com


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Nick Coghlan
On 27 March 2017 at 13:10, Steve Dower  wrote:
> On 26Mar2017 0707, Nick Coghlan wrote:
>>
>> Perhaps it would be worth noting in the table of error handlers at
>> https://docs.python.org/3/library/codecs.html#error-handlers that
>> backslashreplace is used by the `ascii()` builtin and the associated
>> format specifiers
>
> backslashreplace is also the default errors for stderr, which is arguably
> the right target for debugging output. Perhaps what we really want is a
> shorter way to send output to stderr? Though I guess it's an easy to invent
> one-liner, once you know about the difference:
>
>     printe = partial(print, file=sys.stderr)

If there was a printerror builtin that used sys.stderr as its default
output stream, it could also special case BaseException instances to
show their traceback.

At the moment, we do force people to learn a few additional concepts
in order to do error display "right":

- processes have two standard output streams, stdout and stderr
- Python makes those available in the sys module
- the print() builtin function lets you specify a stream with "file"
- so errors should be printed with "print(arg, file=sys.stderr)"
- to get exception tracebacks like those at the interactive prompt,
look at the traceback module

As opposed to "for normal output, use 'print', for error output, use
'printerror', for temporary debugging output also use 'printerror',
otherwise use the logging module".
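
For illustration only, such a helper might look something like the
sketch below. There is no printerror builtin; the name and the exact
behaviour (one item per line, exceptions expanded into tracebacks) are
assumptions made up for this example.

```python
import io
import sys
import traceback


def printerror(*args, file=None, **kwargs):
    """Hypothetical helper: print to stderr, expanding exceptions."""
    if file is None:
        # Look up sys.stderr at call time so redirection is honoured.
        file = sys.stderr
    for arg in args:
        if isinstance(arg, BaseException):
            # Show the traceback, as the interactive prompt would.
            traceback.print_exception(type(arg), arg, arg.__traceback__,
                                      file=file)
        else:
            print(arg, file=file, **kwargs)


if __name__ == "__main__":
    try:
        1 / 0
    except ZeroDivisionError as exc:
        printerror("division failed:", exc)
```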

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Steve Dower

On 26Mar2017 0707, Nick Coghlan wrote:

Perhaps it would be worth noting in the table of error handlers at
https://docs.python.org/3/library/codecs.html#error-handlers that
backslashreplace is used by the `ascii()` builtin and the associated
format specifiers


backslashreplace is also the default errors for stderr, which is 
arguably the right target for debugging output. Perhaps what we really 
want is a shorter way to send output to stderr? Though I guess it's an 
easy to invent one-liner, once you know about the difference:


>>> printe = partial(print, file=sys.stderr)
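
Spelled out with its imports, plus a small demonstration of what
backslashreplace does on a stream that cannot encode a character. The
in-memory ASCII stream is only a stand-in for a limited console, and
write_like_stderr is a hypothetical name.

```python
import io
import sys
from functools import partial

# print() variant that writes to stderr unless file= is given.
printe = partial(print, file=sys.stderr)


def write_like_stderr(text):
    """Print text to an ASCII stream that uses backslashreplace."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii",
                              errors="backslashreplace", newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    printe("debug:", "caf\u00e9")
    print(write_like_stderr("caf\u00e9"))
```

Unencodable characters come out as escapes instead of raising
UnicodeEncodeError, which is why stderr is forgiving for debug output.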

Also worth noting that Python 3.6 supports Unicode characters on the 
console by default on Windows. So unless sys.stdout was manually 
constructed (a possibility, given this was a GUI app, though I designed 
the change such that `open("CON", "w")` would get it right), there 
wouldn't have been an encoding issue in the first place.


Cheers,
Steve


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Michel Desmoulin
Yes, Python is Turing complete; there is always a solution to everything.
You can also do decorators with func = wrapper(func) instead of
@wrapper, so there was no need for new syntax either.

On 26/03/2017 at 20:42, Chris Angelico wrote:
> On Mon, Mar 27, 2017 at 5:22 AM, Michel Desmoulin
>  wrote:
>>
>>
>> On 26/03/2017 at 10:31, Victor Stinner wrote:
>>> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.
>>
>> What you are saying right now is that the API is not granular enough to
>> just add a parameter. Not that it can't be done. It just means we need to
>> expose stdout.write() encoding behavior.
>>
>>> I dislike the idea of putting encoding options in print. It's too
>>> specific. What if tomorrow you replace print() with file.write()? Do you
>>> want to add errors there too?
>>
>> You would have to rewrite all your calls anyway, because print() calls
>> str() on things and already accepts many parameters while file.write()
>> doesn't.
> 
> You can easily make a wrapper around print(), though. For example,
> suppose you want a timestamped log file as well as the console:
> 
> import functools
> import logging
> from builtins import print as pront  # mess with people
>
> @functools.wraps(pront)
> def print(*a, **kw):
>     if "file" not in kw:
>         logging.info(kw.get("sep", " ").join(map(str, a)))
>     return pront(*a, **kw)
> 
> Now what happens if you add the errors handler? Does this function
> need to handle that somehow?
> 
> ChrisA


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Chris Angelico
On Mon, Mar 27, 2017 at 5:22 AM, Michel Desmoulin
 wrote:
>
>
> On 26/03/2017 at 10:31, Victor Stinner wrote:
>> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.
>
> What you are saying right now is that the API is not granular enough to
> just add a parameter. Not that it can't be done. It just means we need to
> expose stdout.write() encoding behavior.
>
>> I dislike the idea of putting encoding options in print. It's too
>> specific. What if tomorrow you replace print() with file.write()? Do you
>> want to add errors there too?
>
> You would have to rewrite all your calls anyway, because print() calls
> str() on things and already accepts many parameters while file.write()
> doesn't.

You can easily make a wrapper around print(), though. For example,
suppose you want a timestamped log file as well as the console:

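import functools
import logging
from builtins import print as pront  # mess with people

@functools.wraps(pront)
def print(*a, **kw):
    # Log only output bound for the default stream (no explicit file=).
    if "file" not in kw:
        # print() accepts any objects, so convert them before joining.
        logging.info(kw.get("sep", " ").join(map(str, a)))
    return pront(*a, **kw)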

Now what happens if you add the errors handler? Does this function
need to handle that somehow?

ChrisA


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Michel Desmoulin


On 26/03/2017 at 10:31, Victor Stinner wrote:
> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes.

What you are saying right now is that the API is not granular enough to
just add a parameter. Not that it can't be done. It just means we need to
expose stdout.write() encoding behavior.

> I dislike the idea of putting encoding options in print. It's too
> specific. What if tomorrow you replace print() with file.write()? Do you
> want to add errors there too?

You would have to rewrite all your calls anyway, because print() calls
str() on things and already accepts many parameters while file.write()
doesn't.

> No, it's better to write your own formatter function as shown in a
> previous email.

An encoding option on print() is short, easy to use, unobtrusive and
will only be used occasionally.

How is using your own formatter function better?


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Ryan Gonzalez
FWIW, using the ascii function does have the problem that Unicode
characters will be escaped, even if the terminal could have handled them
perfectly fine.

--
Ryan (ライアン)
Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
http://refi64.com

On Mar 26, 2017 9:07 AM, "Nick Coghlan"  wrote:

> On 26 March 2017 at 18:31, Victor Stinner wrote:
> > print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I
> > dislike the idea of putting encoding options in print. It's too specific.
> > What if tomorrow you replace print() with file.write()? Do you want to add
> > errors there too?
> >
> > No, it's better to write your own formatter function as shown in a previous
> > email.
>
> While I agree with that, folks that are thinking in terms of errors
> handlers for str.encode may not immediately jump to using the
> `ascii()` builtin or the "%a" or "!a" format specifiers, and if you
> don't use those existing tools, you have the hassle of deciding where
> to put your custom helper function.
>
> Perhaps it would be worth noting in the table of error handlers at
> https://docs.python.org/3/library/codecs.html#error-handlers that
> backslashreplace is used by the `ascii()` builtin and the associated
> format specifiers, as well as noting the format specifiers in the
> documentation of the builtin function?
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Nick Coghlan
On 26 March 2017 at 18:31, Victor Stinner  wrote:
> print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I
> dislike the idea of putting encoding options in print. It's too specific.
> What if tomorrow you replace print() with file.write()? Do you want to add
> errors there too?
>
> No, it's better to write your own formatter function as shown in a previous
> email.

While I agree with that, folks that are thinking in terms of errors
handlers for str.encode may not immediately jump to using the
`ascii()` builtin or the "%a" or "!a" format specifiers, and if you
don't use those existing tools, you have the hassle of deciding where
to put your custom helper function.

Perhaps it would be worth noting in the table of error handlers at
https://docs.python.org/3/library/codecs.html#error-handlers that
backslashreplace is used by the `ascii()` builtin and the associated
format specifiers, as well as noting the format specifiers in the
documentation of the builtin function?
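
As a quick illustration of the equivalence (the sample string is made
up for this example):

```python
# Hypothetical sample: ASCII text followed by two CJK characters.
title = "Sawano \u6fa4\u91ce"

escaped = ascii(title)             # the builtin
via_percent = "%a" % title         # printf-style conversion
via_format = "{!a}".format(title)  # str.format / f-string conversion

# All three spellings produce the same pure-ASCII, repr-like text.
print(escaped)
```

All three give a quoted, repr-style string with the non-ASCII
characters written as \u escapes, so the result is safe to print on
any console.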

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Victor Stinner
print(msg) calls sys.stdout.write(msg): write() expects text, not bytes. I
dislike the idea of putting encoding options in print. It's too specific.
What if tomorrow you replace print() with file.write()? Do you want to add
errors there too?

No, it's better to write your own formatter function as shown in a previous
email.

Victor

On 25 March 2017 at 8:50 PM, "Michel Desmoulin" wrote:



On 24/03/2017 at 17:37, Victor Stinner wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216
>
> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).
>
> Victor

This is not the same. You may want to apply "errors=replace" locally
rather than to the whole program.

Indeed, a global setting can silence encoding problems. So I would
probably never set it at dev time, except in the few places where I know
I can safely silence errors.

I quite like this print(errors="replace|ignore"). This is not going to
cause any trouble, and can only help.



Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-25 Thread Michel Desmoulin


On 24/03/2017 at 17:37, Victor Stinner wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216
> 
> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).
> 
> Victor

This is not the same. You may want to apply "errors=replace" locally
rather than to the whole program.

Indeed, a global setting can silence encoding problems. So I would
probably never set it at dev time, except in the few places where I know
I can safely silence errors.

I quite like this print(errors="replace|ignore"). This is not going to
cause any trouble, and can only help.



Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Steven D'Aprano
On Fri, Mar 24, 2017 at 10:41:58AM -0500, Ryan Gonzalez wrote:
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)

*Crash* crash, or just an exception? If it crashed the interpreter, you 
ought to report that as a bug.

> The workaround was to do:
> 
> 
> print(mystring.encode(sys.stdout.encoding,
>                       errors='replace').decode(sys.stdout.encoding))

I think that this would be both simpler and more informative:

print(ascii(mystring))


-- 
Steve


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Paul Moore
On 24 March 2017 at 16:37, Victor Stinner  wrote:
> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216

I thought I recalled seeing something like that discussed somewhere. I
agree that this is a better approach (even though it's not as granular
as being able to specify on an individual print statement).
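
A stream-level setting can still be scoped fairly tightly. The sketch
below assumes Python 3.7 or later, where text streams gained a
reconfigure() method (not available when this thread was written);
stream_errors is a hypothetical helper name, not a standard library
function.

```python
import contextlib
import io


@contextlib.contextmanager
def stream_errors(stream, errors):
    """Temporarily switch the error handler of a text stream."""
    previous = stream.errors
    stream.reconfigure(errors=errors)
    try:
        yield stream
    finally:
        # reconfigure() flushes pending output before switching back.
        stream.reconfigure(errors=previous)


if __name__ == "__main__":
    raw = io.BytesIO()
    out = io.TextIOWrapper(raw, encoding="ascii", newline="\n")
    with stream_errors(out, "replace"):
        print("caf\u00e9", file=out)
    out.flush()
    print(raw.getvalue())
```

The same context manager could be pointed at sys.stdout around a
single debug print, leaving strict error handling in place elsewhere.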

> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).

That's something I didn't know. Thanks for the pointer!

Paul


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Guido van Rossum
On Fri, Mar 24, 2017 at 9:37 AM, Victor Stinner 
wrote:

> *If* we change something, I would prefer to modify sys.stdout. The
> following issue proposes to add
> sys.stdout.set_encoding(errors='replace'):
> http://bugs.python.org/issue15216
>

I like that.


> You can already set the PYTHONIOENCODING environment variable to
> ":replace" to use "replace" on sys.stdout (and sys.stderr).
>

Great tip, I've needed this!

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Victor Stinner
*If* we change something, I would prefer to modify sys.stdout. The
following issue proposes to add
sys.stdout.set_encoding(errors='replace'):
http://bugs.python.org/issue15216

You can already set the PYTHONIOENCODING environment variable to
":replace" to use "replace" on sys.stdout (and sys.stderr).
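
To see the effect without touching the current process, one can run a
child interpreter with the variable set. In this sketch the encoding is
pinned to ASCII ("ascii:replace") purely so the replacement is visible
on any machine; with a bare ":replace" the console's own encoding is
kept. run_with_ioencoding is a made-up helper name.

```python
import os
import subprocess
import sys


def run_with_ioencoding(code, ioencoding):
    """Run code in a child Python with PYTHONIOENCODING set."""
    env = dict(os.environ, PYTHONIOENCODING=ioencoding)
    result = subprocess.run([sys.executable, "-c", code], env=env,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # The child prints a non-ASCII character to an ASCII stdout.
    print(run_with_ioencoding("print('caf\\xe9')", "ascii:replace"))
```

With the "replace" handler the child prints a question mark in place
of the unencodable character instead of raising UnicodeEncodeError.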

Victor

2017-03-24 16:41 GMT+01:00 Ryan Gonzalez :
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)
>
> The workaround was to do:
>
>
> print(mystring.encode(sys.stdout.encoding,
>                       errors='replace').decode(sys.stdout.encoding))
>
>
> Not fun, especially since this was *just* a debug print.
>
> The proposal: why not add an 'errors' argument to print? That way, I
> could've just done:
>
>
> print(mystring, errors='replace')
>
>
> without having to worry about it crashing.
>
> --
> Ryan (ライアン)
> Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
> http://refi64.com
>


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Paul Moore
On 24 March 2017 at 15:41, Ryan Gonzalez  wrote:
> Recently, I was working on a Windows GUI application that ends up running
> ffmpeg, and I wanted to see the command that was being run. However, the
> file name had a Unicode character in it (it's a Sawano song), and when I
> tried to print it to the console, it crashed during the encode/decode. (The
> encoding used in cmd doesn't support Unicode characters.)
>
> The workaround was to do:
>
>
> print(mystring.encode(sys.stdout.encoding,
>                       errors='replace').decode(sys.stdout.encoding))
>
>
> Not fun, especially since this was *just* a debug print.
>
> The proposal: why not add an 'errors' argument to print? That way, I
> could've just done:
>
>
> print(mystring, errors='replace')
>
>
> without having to worry about it crashing.

When I've hit issues like this before, I've written a helper function:

def sanitise(str, enc):
    """Ensure that str can be encoded in encoding enc"""
    return str.encode(enc, errors='replace').decode(enc)

An errors argument to print would be very similar, but would only
apply to the print function, whereas I've used my sanitise function in
other situations as well.
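
A short usage example of that helper, with the parameter renamed to
avoid shadowing the str builtin and a made-up sample string. Only the
characters the target encoding cannot represent are replaced:

```python
def sanitise(text, enc):
    """Ensure that text can be encoded in encoding enc."""
    return text.encode(enc, errors="replace").decode(enc)


# Hypothetical sample: an accented letter plus a snowman character.
sample = "caf\u00e9 \u2603"

# Latin-1 can represent the accented letter but not the snowman.
latin1_safe = sanitise(sample, "latin-1")
# ASCII can represent neither.
ascii_safe = sanitise(sample, "ascii")

print(repr(latin1_safe), repr(ascii_safe))
```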

I understand the attraction of a dedicated "just print the best
representation you can" argument to print, but I'm not sure it's a
common enough need to be worth adding like this.

Paul


[Python-ideas] Adding an 'errors' argument to print

2017-03-24 Thread Ryan Gonzalez
Recently, I was working on a Windows GUI application that ends up running
ffmpeg, and I wanted to see the command that was being run. However, the
file name had a Unicode character in it (it's a Sawano song), and when I
tried to print it to the console, it crashed during the encode/decode. (The
encoding used in cmd doesn't support Unicode characters.)

The workaround was to do:


print(mystring.encode(sys.stdout.encoding,
                      errors='replace').decode(sys.stdout.encoding))


Not fun, especially since this was *just* a debug print.

The proposal: why not add an 'errors' argument to print? That way, I
could've just done:


print(mystring, errors='replace')


without having to worry about it crashing.
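
To make the idea concrete, here is a rough emulation as a plain
function. print() has no 'errors' argument; print_with_errors and its
UTF-8 fallback for streams without an encoding attribute are
assumptions made for this sketch.

```python
import io
import sys


def print_with_errors(*args, errors="strict", sep=" ", end="\n",
                      file=None, flush=False):
    """Hypothetical print() variant taking a codec error handler."""
    if file is None:
        file = sys.stdout
    # Streams such as io.StringIO report no encoding; assume UTF-8.
    encoding = getattr(file, "encoding", None) or "utf-8"
    text = sep.join(map(str, args)) + end
    # Round-trip through the stream's encoding so write() cannot fail.
    file.write(text.encode(encoding, errors=errors).decode(encoding))
    if flush:
        file.flush()


if __name__ == "__main__":
    print_with_errors("caf\u00e9 \u2603", errors="replace")
```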

--
Ryan (ライアン)
Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
http://refi64.com