[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-11-06 Thread Roundup Robot

Roundup Robot added the comment:

New changeset b26c8104e54f by Steve Dower in branch '3.6':
Closes #27781: Removes special cases for the experimental aspect of PEP 529
https://hg.python.org/cpython/rev/b26c8104e54f

New changeset b8233c779ff7 by Steve Dower in branch 'default':
Closes #27781: Removes special cases for the experimental aspect of PEP 529
https://hg.python.org/cpython/rev/b8233c779ff7

--
resolution:  -> fixed
stage: needs patch -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-10-31 Thread Steve Dower

Steve Dower added the comment:

Before 3.6.0 beta 4 I need to make this change permanent. From memory, it's 
just an exception message that needs changing (and PEP 529 becomes final), but 
I'll review the changeset to be sure.

--
nosy: +ned.deily
priority: normal -> release blocker
stage: commit review -> needs patch
versions: +Python 3.7




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-09 Thread Mark Dickinson

Mark Dickinson added the comment:

That seems to have done the trick. Thanks!

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-09 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 801634d3c105 by Steve Dower in branch 'default':
Issue #27781: Fixes uninitialized fd when !MS_WINDOWS and !HAVE_OPENAT
https://hg.python.org/cpython/rev/801634d3c105

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-09 Thread Mark Dickinson

Mark Dickinson added the comment:

It looks as though this change in posixmodule.c is the cause:

 #ifdef MS_WINDOWS
-    if (path->wide)
-        fd = _wopen(path->wide, flags, mode);
-    else
+    fd = _wopen(path->wide, flags, mode);
 #endif
 #ifdef HAVE_OPENAT
     if (dir_fd != DEFAULT_DIR_FD)
         fd = openat(dir_fd, path->narrow, flags, mode);
     else
-#endif
         fd = open(path->narrow, flags, mode);
+#endif


The move of the final #endif means that `fd` is not defined on OS X. If I move 
the #endif back again, the compile succeeds.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-09 Thread Mark Dickinson

Mark Dickinson added the comment:

It looks as though this change might have broken the compile on OS X. On my OS 
X 10.9 machine, building from a clean Git checkout of the master branch fails; 
the tail of the failed build looks like this:

./python.exe -E -S -m sysconfig --generate-posix-vars ;\
if test $? -ne 0 ; then \
echo "generate-posix-vars failed" ; \
rm -f ./pybuilddir.txt ; \
exit 1 ; \
fi
Fatal Python error: Py_Initialize: unable to load the file system codec
Traceback (most recent call last):
  File "", line 962, in _find_and_load
  File "", line 951, in _find_and_load_unlocked
  File "", line 656, in _load_unlocked
  File "", line 668, in exec_module
  File "", line 782, in get_code
  File "", line 842, in _cache_bytecode
  File "", line 867, in set_data
  File "", line 117, in _write_atomic
ValueError: negative file descriptor
/bin/sh: line 1: 35829 Abort trap: 6   ./python.exe -E -S -m sysconfig 
--generate-posix-vars
generate-posix-vars failed
make: *** [pybuilddir.txt] Error 1

Full build output attached.

--
nosy: +mark.dickinson
Added file: http://bugs.python.org/file44491/osx_failed_compile.txt




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset faca0730270b by Steve Dower in branch 'default':
Fixes tests broken by issue #27781.
https://hg.python.org/cpython/rev/faca0730270b

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset e20c7d8a8187 by Steve Dower in branch 'default':
Issue #27781: Change file system encoding on Windows to UTF-8 (PEP 529)
https://hg.python.org/cpython/rev/e20c7d8a8187

--
nosy: +python-dev




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-08 Thread Steve Dower

Steve Dower added the comment:

This is pushed now - let the bug fixing begin :)

--
stage: patch review -> commit review




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-08 Thread Steve Dower

Steve Dower added the comment:

Thanks for that review, Eryk, but I'm going to defer those to other issues 
(specifically issue27998 for scandir and we should file a new issue for the 
symlink concerns).

I've got some more doc updates to do though, and then I'll check in if there 
are no other concerns.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-07 Thread Steve Dower

Steve Dower added the comment:

One minor change - I removed the unused definition of 
Py_FileSystemDefaultDecodeErrors.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-07 Thread Steve Dower

Steve Dower added the comment:

PEP 529 has been accepted, so this really needs a review now. But since it's 
experimental and all the tests pass, I'll be committing it shortly anyway and 
will be tidying up issues during beta.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-06 Thread Steve Dower

Steve Dower added the comment:

Also see PEP 529 for the latest updates there.

This is likely to be accepted as experimental for 3.6.0b1-3, and we'll commit 
to either the new default or a compatible default for b4.

--
Added file: http://bugs.python.org/file44414/27781_1.patch




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-09-05 Thread Nick Coghlan

Nick Coghlan added the comment:

I belatedly remembered I've had this new test case hanging around for a while, 
and never got around to getting it into shape for inclusion in the standard 
library.

With the prospect of reasonable cross-platform consistency in this area, it 
could be a good thing to add as part of this PEP.

--
nosy: +ncoghlan
Added file: http://bugs.python.org/file44369/test_cmd_line_unicode.py




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Brett Cannon

Changes by Brett Cannon :


--
nosy: +brett.cannon




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Steve Dower

Steve Dower added the comment:

Ah, I see: you mean the case where we end up sticking with MBCS and offering a 
switch to enable UTF-8. In that case, we'll definitely ensure the flag is the 
same (but I'm hopeful we will just make the reliable behavior on Windows the 
default, so it won't matter).

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread STINNER Victor

STINNER Victor added the comment:

Steve Dower added the comment:
> By portable, do you mean not using an environment variable?

I mean that "python3 -X utf8" should force sys.getfilesystemencoding()
to UTF-8 on UNIX/BSD, ignoring the current locale setting.
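
A minimal sketch of what the proposed switch would look like from the
outside. In this thread "-X utf8" is only a proposal; the sketch assumes
it is run on a later interpreter (Python 3.7 or newer) where an option of
that name is implemented, and it simply asks a child process for its file
system encoding.

```python
import subprocess
import sys

# Start a child interpreter with -X utf8 and ask for its file system encoding.
# Assumption: the running interpreter is Python 3.7+, where this option
# exists; the thread itself only proposes it.
result = subprocess.run(
    [sys.executable, "-X", "utf8", "-c",
     "import sys; print(sys.getfilesystemencoding())"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
child_encoding = result.stdout.strip()
print(child_encoding)
```

The child is used so that the result does not depend on how the parent
interpreter or its locale happens to be configured.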

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Steve Dower

Steve Dower added the comment:

By portable, do you mean not using an environment variable?

Command line parsing is potentially affected by this on Windows - I'd have to 
look deeper - as command lines are provided as UTF-16. But we may not ever 
expose them as bytes.

I don't even know that this matters on the UNIX/BSD side as the file system 
encoding provided there is correct, no? It's just Windows where the file system 
encoding used for bytes doesn't match what the file system actually uses.
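
To make the "file system encoding used for bytes" concrete, here is a
small sketch using os.fsdecode() and os.fsencode(), which convert paths
with whatever sys.getfilesystemencoding() reports. The sample bytes are
an assumption chosen for illustration (valid UTF-8); the printed encoding
depends on the platform and Python version.

```python
import os
import sys

# Bytes of a path as a program might receive them; valid UTF-8 by assumption.
raw = b"caf\xc3\xa9.txt"

# bytes -> str using the file system encoding and its error handler.
as_text = os.fsdecode(raw)

# str -> bytes using the same encoding and error handler.
back = os.fsencode(as_text)

print(sys.getfilesystemencoding())
print(back == raw)
```

Whether the intermediate str shows the intended characters depends on the
encoding in use, which is the mismatch described above for Windows.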

I was afraid a PEP would be necessary out of this, but I want to see how the 
python-dev discussion goes first.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread STINNER Victor

STINNER Victor added the comment:

> I suspect we'll have to go to Guido to get a ruling on the default, but I'll 
> add an environment variable to switch.

If you go in this direction, I would like to follow you for the
UNIX/BSD side to make the switch portable. I was thinking about "-X
utf8", which avoids changing the command-line parser.

If we agree on a plan, I would like to write it down as a PEP, since I
expect a lot of complaints and questions which I would prefer to
answer only once (see for example the length of your thread on python-ideas,
where each person repeated the same things multiple times ;-))

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread STINNER Victor

STINNER Victor added the comment:

> Is there a surrogatepass option?

I'm talking about error handlers of Python codecs: text.encode('utf8',
'surrogatepass')
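
A short sketch of the difference between the default 'strict' handler and
'surrogatepass' for the UTF-8 codec. The sample string with a lone
surrogate is made up for illustration.

```python
# A string containing a lone (unpaired) surrogate, U+D800.
text = "file_\ud800.txt"

# The default 'strict' handler refuses to encode a lone surrogate.
try:
    text.encode("utf8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# 'surrogatepass' encodes the surrogate as a three-byte sequence instead.
encoded = text.encode("utf8", "surrogatepass")

# Decoding with the same handler restores the original string.
decoded = encoded.decode("utf8", "surrogatepass")

print(strict_ok)        # False
print(encoded)          # b'file_\xed\xa0\x80.txt'
print(decoded == text)  # True
```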

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Steve Dower

Steve Dower added the comment:

Thanks for the regen. I don't think git format is the problem as most of my 
patches are fine, it's probably because it was in a patch queue and so the 
parent isn't actually a known commit. I haven't tested whether this works 
without my other console patches but I think it should.

Is there a surrogatepass option? If so, I'll definitely use that, as that'll 
fix the one remaining edge case.

I suspect we'll have to go to Guido to get a ruling on the default, but I'll 
add an environment variable to switch.

--




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Chi Hsuan Yen

Changes by Chi Hsuan Yen :


--
nosy: +Chi Hsuan Yen




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread STINNER Victor

STINNER Victor added the comment:

Steve Dower: Please don't use git format for diff, or the bug tracker is unable 
to create reviews. I regenerated the patch.

By the way, you introduced a bug in posix_do_stat(): you added a new "else" in 
the !MS_WINDOWS path which leads to a compilation error. I fixed it.

--
Added file: http://bugs.python.org/file44131/fsencoding.diff




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread STINNER Victor

STINNER Victor added the comment:

Would it be acceptable for you to add a new option to switch to UTF-8 in Python 
3.6, and discuss later if it's ok to enable it by default?

In the python-ideas thread, you wrote that Windows allows surrogate characters 
in filenames, but the UTF-8/strict Python codec does not. Would it make sense to 
use the UTF-8/surrogatepass codec to avoid any Unicode error?

--
nosy: +haypo




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-17 Thread Jeremy Kloth

Changes by Jeremy Kloth :


--
nosy: +jkloth




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-16 Thread Decorater

Decorater added the comment:

I personally hate ANSI, so +1 to UTF-8/UTF-16.

--
nosy: +Decorater




[issue27781] Change sys.getfilesystemencoding() on Windows to UTF-8

2016-08-16 Thread Steve Dower

New submission from Steve Dower:

I've attached my first pass at a patch to change the file system encoding on 
Windows to UTF-8 and remove use of the *A APIs.

It would be trivial to change the encoding from UTF-8 back to CP_ACP and change 
the error mode if that's what we decide is better, but my vote is strongly for 
an encoding that never drops characters when converted from UTF-16.
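
As an illustration of "dropping characters", the sketch below converts a
name from UTF-16 to an ANSI-style code page and to UTF-8. It uses cp1252
with the 'replace' handler as a stand-in for CP_ACP; that choice is an
assumption, since the real ANSI code page and its conversion behavior
depend on the machine. The file name is made up.

```python
# A made-up name mixing Latin-1 and CJK characters, written with escapes:
# "r\u00e9sum\u00e9" is "resume" with accents, followed by three CJK characters.
name = "r\u00e9sum\u00e9_\u65e5\u672c\u8a9e.txt"

# The name as UTF-16 code units (little-endian), as Windows stores it.
utf16 = name.encode("utf-16-le")

# Stand-in for CP_ACP (assumption): cp1252, replacing what it cannot represent.
ansi = utf16.decode("utf-16-le").encode("cp1252", "replace")
ansi_roundtrip = ansi.decode("cp1252")

# UTF-8 can represent every character that came from the UTF-16 name.
utf8 = utf16.decode("utf-16-le").encode("utf-8")
utf8_roundtrip = utf8.decode("utf-8")

print(ansi_roundtrip == name)  # False: the CJK characters became "?"
print(utf8_roundtrip == name)  # True
```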

Discussion is still ongoing on python-ideas, so let's argue about yes/no and 
utf-8/mbcs there and just discuss the patch here.

--
assignee: steve.dower
components: Windows
files: fsencoding.diff
keywords: patch
messages: 272899
nosy: paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
stage: patch review
status: open
title: Change sys.getfilesystemencoding() on Windows to UTF-8
type: behavior
versions: Python 3.6
Added file: http://bugs.python.org/file44130/fsencoding.diff
