[issue22299] resolve() on Windows makes some pathological paths unusable

2021-02-23 Thread Eryk Sun


Eryk Sun  added the comment:

ntpath.realpath() has since been implemented, and it addresses the problem by 
keeping the extended (verbatim) path prefix in the result if the input argument 
has it, and otherwise removing the prefix if the final path resolves correctly 
without it.
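
A minimal sketch of that rule, written as pure string handling so it can be read without a Windows machine. The names pick_result and strip_extended_prefix, and the injected resolve callable, are hypothetical and chosen for illustration; this is not the ntpath.realpath() source.

```python
EXT_PREFIX = "\\\\?\\"            # the extended (verbatim) path prefix
UNC_EXT_PREFIX = "\\\\?\\UNC\\"   # extended form of a UNC path


def strip_extended_prefix(path):
    # String-only conversion from the extended form to the regular form.
    if path.startswith(UNC_EXT_PREFIX):
        return "\\\\" + path[len(UNC_EXT_PREFIX):]
    if path.startswith(EXT_PREFIX):
        return path[len(EXT_PREFIX):]
    return path


def pick_result(original, final, resolve):
    # original: the path the caller passed in.
    # final:    the resolved path, assumed to carry the extended prefix.
    # resolve:  hypothetical callable that returns the extended final path
    #           of its argument, or raises OSError if it cannot be resolved.
    if original.startswith(EXT_PREFIX):
        return final                  # input had the prefix: keep it
    plain = strip_extended_prefix(final)
    try:
        if resolve(plain) == final:
            return plain              # same target is reachable without it
    except OSError:
        pass
    return final                      # the prefix is needed to reach the target
```

With a resolver that only knows "C:\bar", a regular input comes back without the prefix, while a target such as "C:\foo." keeps it because the unprefixed spelling does not resolve to the same place.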

--
versions: +Python 3.10, Python 3.8, Python 3.9 -Python 3.4, Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Really, the test for whether to keep or remove the prefix should be to 
> remove the prefix and try and resolve the path again. If it succeeds, 
> remove the prefix; otherwise, keep it. This can only really be done as 
> part of the resolve() call, which would address the original issue,
> but it may be quite a perf. hit. 

It would also be prone to race conditions. All in all it sounds like a bad idea.
I still think it should be asked for explicitly. I don't know how the method 
should be called, .extended() perhaps?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22299
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-11 Thread Steve Dower

Steve Dower added the comment:

Another alternative is to always leave the prefix there after calling resolve() 
(as opposed to the current behaviour which is to always remove it). If the 
Win32 API says that the path should include the prefix, then it should. There's 
no reliable way for a developer to decide that an arbitrary path should include 
the prefix other than by resolving it.

I still like the idea of a format character to omit the prefix, as that 
correctly implies that the only reason you should remove it is for displaying 
to the user. Alternatively, a .without_prefix property seems like a safer 
route than requiring the user to add it. Long paths are the only time you may 
want to add it, but even that doesn't guarantee that the path will work.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-08 Thread Steve Dower

Steve Dower added the comment:

Right, what the prefix actually means is "treat this path as a blob and don't 
do any processing". Some of the things that 'processing' includes are:

* CWD
* invalid names ('foo.' -> 'foo')
* adjacent backslashes ('a\\b' -> 'a\b')
* forward slashes ('a/b' -> 'a\b')
* (probably) short/long file names ('progra~1' -> 'Program Files')

A nice side-effect is that you can also use path names longer than 260 
characters, provided your path name is correctly normalized already.
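
A rough, string-only emulation of a few of the processing steps listed above, to make the difference concrete. The function name is hypothetical and the rules are a simplification (UNC paths, the CWD and short names are not handled); the real processing happens inside Windows, not in Python.

```python
EXT_PREFIX = "\\\\?\\"


def emulate_win32_normalization(path):
    # A path with the extended prefix is treated as a blob: no processing.
    if path.startswith(EXT_PREFIX):
        return path
    # Forward slashes become backslashes.
    parts = path.replace("/", "\\").split("\\")
    last = len(parts) - 1
    cleaned = []
    for i, part in enumerate(parts):
        # Adjacent backslashes produce empty interior components: drop them.
        if part == "" and 0 < i < last:
            continue
        # Trailing dots and spaces are stripped from ordinary names, but
        # components made only of dots ("." and "..") are left alone here.
        if part.strip(".") != "":
            part = part.rstrip(". ")
        cleaned.append(part)
    return "\\".join(cleaned)


for example in ("a/b", "a\\\\b", "C:\\foo.", "\\\\?\\C:\\foo."):
    print(repr(example), "->", repr(emulate_win32_normalization(example)))
```

Only the last example keeps its trailing dot, which is why a directory created through the prefixed path cannot be reached once the prefix is dropped.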

Really, the test for whether to keep or remove the prefix should be to remove 
the prefix and try and resolve the path again. If it succeeds, remove the 
prefix; otherwise, keep it. This can only really be done as part of the 
resolve() call, which would address the original issue, but it may be quite a 
perf. hit. 

I'd still be inclined to add the prefix in str() if the final path length is 
greater than 260 characters, if only because we go from zero chance of it 
working to a non-zero chance. Unfortunately, there seems to be no way to 
process a long path to make it 'safe' to add the prefix (though we can do a few 
of the things and increase the chances) as GetFinalPathName will not work on a 
long path. FWIW, paths longer than 260 chars are a mess and everyone knows it, 
but it's really really hard to fix without breaking back-compat.
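
A sketch of what "add the prefix when the path is long" could look like as a string helper. The name with_prefix_if_long is hypothetical, the threshold of 260 is taken from the text above rather than from an API constant, and nothing here checks that the path is already normalized, which is the part that cannot be guaranteed.

```python
import ntpath

EXT_PREFIX = "\\\\?\\"
LENGTH_THRESHOLD = 260  # from the message above; an assumption, not an API value


def with_prefix_if_long(path):
    # Short paths and paths that already carry the prefix are left alone.
    if len(path) <= LENGTH_THRESHOLD or path.startswith(EXT_PREFIX):
        return path
    drive, rest = ntpath.splitdrive(path)
    # The prefix only works with fully qualified paths. Paths using forward
    # slashes are also left alone, since a verbatim path would not have them
    # translated.
    if not rest.startswith("\\"):
        return path
    if drive.startswith("\\\\"):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return EXT_PREFIX + "UNC\\" + path[2:]
    if len(drive) == 2 and drive[1] == ":":
        return EXT_PREFIX + path
    return path
```

Adding the prefix this way moves a long path from no chance of working to some chance, as described above; it is not a guarantee.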

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I'm a bit worried about the consequences still. If you take "C:" and join it 
with e.g. "foo", you get a drive-relative path. If you take "//?/C:/" and join 
it with e.g. "foo", you get an absolute path (or, if you remove the drive's 
trailing slash, you get something that's invalid AFAIK).

So the question is: how implicit/explicit will the conversion be, and at which 
stages will it happen?
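
The two joins can be compared with PureWindowsPath, which does pure string handling and so runs on any platform. This only illustrates the semantic difference described above; it does not touch the filesystem.

```python
from pathlib import PureWindowsPath

# Joining onto a bare drive gives a drive-relative path.
drive_relative = PureWindowsPath("C:") / "foo"

# Joining onto the extended drive root gives an absolute path.
extended = PureWindowsPath("//?/C:/") / "foo"

print(drive_relative, drive_relative.is_absolute())
print(extended, extended.is_absolute())
```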

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-07 Thread eryksun

eryksun added the comment:

> If you take "//?/C:/" and join it with e.g. "foo", you get an 
> absolute path (or, if you remove the drive's trailing slash, you 
> get something that's invalid AFAIK).

FYI, DOS device names such as C: are NT symlinks. Win32 looks for DOS device 
links in NT's \Global?? directory, and also per logon-session in 
\Sessions\0\DosDevices\LOGON_ID. 

A link named C:foo is actually possible:

    >>> import os
    >>> from ctypes import windll
    >>> windll.kernel32.DefineDosDeviceW(0, "C:foo", "C:\\Python34")
    1
    >>> gfpn = os.path._getfinalpathname
    >>> gfpn(r'\\?\C:foo')
    '\\\\?\\C:\\Python34'
    >>> os.listdir(r'\\?\C:foo')
    ['DLLs', 'Doc', 'include', 'Lib', 'libs', 'LICENSE.txt', 'NEWS.txt', 
    'python.exe', 'pythonw.exe', 'README.txt', 'Scripts', 'tcl', 'Tools']

GLOBALROOT links to the native NT root:

    >>> gfpn('\\\\?\\GLOBALROOT\\Global??\\C:\\')
    '\\\\?\\C:\\'
    >>> gfpn('\\\\?\\GLOBALROOT\\Device\\HarddiskVolume1\\')
    '\\\\?\\C:\\'
    >>> gfpn(r'\\?\GLOBALROOT\SystemRoot')
    '\\\\?\\C:\\Windows'
    >>> p = r'\\?\GLOBALROOT\Sessions\0\DosDevices\-0f341de9\C:foo'
    >>> gfpn(p)
    '\\\\?\\C:\\Python34'

Without the \\?\ prefix, C:foo is relative to the current directory on the C: 
drive:

    >>> os.chdir('C:\\')
    >>> os.mkdir('foo')
    >>> os.listdir('C:foo')
    []

where the current directory on C: is stored in the =C: environment variable:

    >>> from ctypes import c_wchar, windll
    >>> buf = (c_wchar * 100)()
    >>> windll.kernel32.GetEnvironmentVariableW("=C:", buf, len(buf))
    3
    >>> buf.value
    'C:\\'

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Since both paths are valid and both paths refer to the same file, some 
> developers may find this result counterintuitive.

On the other hand the proposed solution is regular. If you input an extended 
path, you get an extended path as output.

There are other factors that can come into play, such as hard links under Unix 
(and perhaps under Windows too). The recommended way to check if two paths 
point to the same file is still os.path.samefile().
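
A small illustration of why string comparison of resolved paths is not a reliable identity check. It creates a hard link in a temporary directory and compares both approaches; the function name is made up for the example, and it assumes the temporary directory lives on a filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    # Create a file and a hard link to it, then compare the two names.
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "data.txt")
        alias = os.path.join(tmp, "alias.txt")
        with open(original, "w") as f:
            f.write("payload")
        os.link(original, alias)  # second directory entry for the same file
        # Resolved strings still differ, because the names differ.
        strings_equal = os.path.realpath(original) == os.path.realpath(alias)
        # samefile() compares file identity instead of spelling.
        return strings_equal, os.path.samefile(original, alias)


print(hard_link_demo())
```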

Another approach would be for pathlib to *always* use extended paths internally 
on Windows absolute paths; I don't know which side effects that could have, 
though.

Note we could also add methods to switch from the extended to the regular form 
and vice-versa.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Steve Dower

Steve Dower added the comment:

Actually, I'd be inclined to never use the prefix within pathlib and then add 
it on if necessary when converting to a string. That would also solve a problem 
like:

    >>> p = Path("C:\\") / ('a'*150) / ('a'*150)
    >>> p.stat()
    FileNotFoundError: [WinError 3] The system cannot find the path specified: ...
    >>> p2 = Path("\\\\?\\" + str(p))
    >>> p2.stat()
    os.stat_result(...)

The hardest part about this is knowing with certainty whether it's needed. We 
can certainly detect most cases automatically.

Maybe we also need an extra method or a format character to force a str() with 
prefix? Or maybe having an obvious enough function that people can monkey-patch 
if necessary - the \\?\ prefix is an edge case for most people already, and 
checking the total length would bring that to 99% IME.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Steve Dower

Steve Dower added the comment:

If anyone wanted to test that really long path, here's the incantation to 
create it:

    >>> import os, pathlib
    >>> os.mkdir("C:\\a")
    >>> os.mkdir("C:\\a\\" + "a"*150)
    >>> os.rename("C:\\a", "C:\\" + "a"*150)
    >>> p = pathlib.Path("C:\\") / ("a"*150) / ("a"*150)

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> I'd be inclined to never use the prefix within pathlib and then add it on if 
> necessary when converting to a string

That may be very surprising when that prefix appears, though... At least with 
explicit methods the user would have to invoke them, instead of getting 
unexpected results implicitly.

I don't know what Windows users think about all this, though (I use Linux 
myself).

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Steve Dower

Steve Dower added the comment:

It's no less expected than having OS functions fail because the path is too 
long.

Using it to maintain dots at the end of directory/file names is a little less 
safe and may break some code. Maybe pathlib should strip these if there is no 
prefix? (For example, "C:\Test." == "C:\Test" == "\\?\C:\Test" != 
"\\?\C:\Test.".)

If most (or all) of the file handling functions in Python are using *W() APIs 
and can support the prefix, I'd rather add it in silently if only to avoid the 
long path issue. It's really the sort of implementation detail that pathlib 
should be able to hide from the app developer and the user (Node.js does this, 
for example, as its node_modules hierarchies regularly exceed the max path 
limitation).

Maybe the best approach is to preserve the prefix if it already exists, and add 
it if it becomes necessary. File operations are most likely to succeed in this 
case, even if it may be surprising to users.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> If most (or all) of the file handling functions in Python are using *W() APIs 
> and can support the prefix, I'd rather add it in silently if only to avoid 
> the long path issue.

This would only work for fully-qualified paths, right? Not relative ones.

I'm all for making things higher-level, I just want to make sure it won't break 
existing use cases :-)

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-06 Thread Steve Dower

Steve Dower added the comment:

> This would only work for fully-qualified paths, right? Not relative ones.

Correct, and I think we're most of the way there with how drives are handled. 
Since the prefix only works with absolute paths, why not treat it as part of 
the drive name?
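
For paths that are constructed with the prefix, pathlib's pure Windows flavour already appears to report the prefix as part of the drive. A quick way to inspect this on any platform is shown below; drive_and_root is a hypothetical helper added only for the example.

```python
from pathlib import PureWindowsPath


def drive_and_root(text):
    # Return the (drive, root) pair that pathlib parses out of a path string.
    p = PureWindowsPath(text)
    return p.drive, p.root


for text in ("C:/foo", "//?/C:/foo", "//?/UNC/server/share/foo"):
    print(text, "->", drive_and_root(text))
```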

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread Steve Dower

Steve Dower added the comment:

Patch attached. (Kinda feel like this was too simple...)

--
keywords: +patch
Added file: http://bugs.python.org/file36549/22299_1.patch




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread eryksun

eryksun added the comment:

It should only skip _ext_to_normal for an already extended path, i.e. a path 
that starts with ext_namespace_prefix. Otherwise it needs to call 
_ext_to_normal. For example:

Strip the prefix in this case:

    >>> os.path._getfinalpathname('C:\\Windows')
    '\\\\?\\C:\\Windows'

but not in this case:

    >>> os.path._getfinalpathname(r'\\?\GLOBALROOT\Device\HarddiskVolume1\Windows')
    '\\\\?\\C:\\Windows'

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread Steve Dower

Steve Dower added the comment:

Ah, thought it was too simple. I didn't realise that _getfinalpathname adds the 
prefix.

New patch soon.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread Steve Dower

Steve Dower added the comment:

Strips the prefix if it wasn't in the original path - otherwise, keeps it.

--
Added file: http://bugs.python.org/file36550/22299_2.patch




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread Antoine Pitrou

Antoine Pitrou added the comment:

As far as I can say, the patch looks fine to me. Thanks, Steve.

--
stage: needs patch -> patch review




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread Kevin Norris

Kevin Norris added the comment:

I'm a little concerned about this fix.  In particular, if I understand the 
design of the patch correctly, it is intended to produce this behavior:

Path('C:/foo').resolve() != Path('//?/C:/foo').resolve()

Since both paths are valid and both paths refer to the same file, some 
developers may find this result counterintuitive.  The Path.resolve() docs do 
not expressly forbid it, however.

How many developers assume Path.resolve() is always the same for the same file?

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-09-05 Thread eryksun

eryksun added the comment:

Maybe for an extended path it could try _getfinalpathname without the prefix. 
If it isn't a valid path or the result isn't the same as _getfinalpathname 
including the prefix, then skip calling _ext_to_normal. For example:

    def resolve(self, path):
        s = str(path)
        if not s:
            return os.getcwd()
        if _getfinalpathname is not None:
            prefix, t = self._split_extended_path(s)
            s = _getfinalpathname(s)
            if prefix:
                try:
                    if _getfinalpathname(t) != s:
                        return s
                except FileNotFoundError:
                    return s
            return self._ext_to_normal(s)
        # Means fallback on absolute
        return None

The 'foo.' path in this issue would keep the prefix:

    >>> Path('//?/C:/foo.').resolve()
    WindowsPath('//?/C:/foo.')
    >>> Path('//?/UNC/server/C$/foo.').resolve()
    WindowsPath('//?/UNC/server/C$/foo.')

But regular paths would remove the prefix:

    >>> Path('//?/C:/bar').resolve()
    WindowsPath('C:/bar')
    >>> Path('//?/UNC/server/C$/bar').resolve()
    WindowsPath('//server/C$/bar')

On a related note, _split_extended_path only looks for uppercase UNC, which 
makes the above resolve method fail:

    >>> Path('//?/unc/server/C$/bar').resolve()
    WindowsPath('//?/UNC/server/C$/bar')
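
One possible way to make the split case-insensitive is sketched below. It is modeled on the helper named above, but the body is an assumption written for illustration as a standalone function, not a copy of pathlib's source.

```python
EXT_PREFIX = "\\\\?\\"


def split_extended_path(s):
    # Split an extended path into (prefix, regular path).
    # "UNC" is matched case-insensitively, so \\?\unc\... is recognized too.
    prefix = ""
    if s.startswith(EXT_PREFIX):
        prefix = s[:4]
        s = s[4:]
        if s[:4].upper() == "UNC\\":
            prefix += s[:3]        # keep the spelling the caller used
            s = "\\" + s[3:]       # \\server\share\... in regular form
    return prefix, s


print(split_extended_path("\\\\?\\unc\\server\\C$\\bar"))
print(split_extended_path("\\\\?\\C:\\foo."))
```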

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Kevin Norris

New submission from Kevin Norris:

Run Python as an administrator:

    >>> import pathlib
    >>> pth = pathlib.Path('//?/C:/foo.')
    >>> pth.mkdir()
    >>> pth.resolve().rmdir()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python34\lib\pathlib.py", line 1141, in rmdir
        self._accessor.rmdir(self)
      File "C:\Python34\lib\pathlib.py", line 323, in wrapped
        return strfunc(str(pathobj), *args)
    FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\\foo.'
    >>> pth.rmdir()

You do not need to be an administrator so long as you can create a directory in 
the requested location, but the \\?\ prefix only works with absolute paths so 
it's easier to demonstrate in the root of the drive.

--
components: Library (Lib), Windows
messages: 226060
nosy: Kevin.Norris
priority: normal
severity: normal
status: open
title: resolve() on Windows makes some pathological paths unusable
type: behavior
versions: Python 3.4




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Kevin Norris

Kevin Norris added the comment:

When the directory name is '...', the error is different:

    >>> pth = pathlib.Path('//?/C:/...')
    >>> pth.mkdir()
    >>> pth.resolve().rmdir()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python34\lib\pathlib.py", line 1141, in rmdir
        self._accessor.rmdir(self)
      File "C:\Python34\lib\pathlib.py", line 323, in wrapped
        return strfunc(str(pathobj), *args)
    PermissionError: [WinError 5] Access is denied: 'C:\\...'
    >>> pth.rmdir()


--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
nosy: +pitrou
stage:  -> test needed




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Why is it a pathological path? Can you explain?

--
nosy: +steve.dower, tim.golden, zach.ware




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Steve Dower

Steve Dower added the comment:

Is resolve() using an *A() API rather than *W()? The \\?\ prefix does not work 
with *A() APIs.

Also, names that are all dots are not supported by Windows at all. I'd expect 
mkdir() to fail on that, but the \\?\ prefix disables some validation, so it's 
possible that it is getting through that way.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

resolve() should use the *W APIs since it is using only functions from the os 
module with str objects. Perhaps you want to double-check that, since I don't 
have a Windows VM anymore.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread eryksun

eryksun added the comment:

The \\?\ extended-path prefix bypasses normal path processing. The path is 
passed directly to the filesystem driver. For example, to accommodate the POSIX 
namespace, NTFS allows any character except NUL and slash, so it happily 
creates a directory named "foo.". This name is invalid in the Win32 namespace. 

resolve() should skip calling _ext_to_normal on the result of _getfinalpathname 
if the input path is extended.

http://hg.python.org/cpython/file/c0e311e010fc/Lib/pathlib.py#l178

--
nosy: +eryksun




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Steve Dower

Steve Dower added the comment:

Removing the _ext_to_normal() call in resolve() looks like the right fix to me.

--




[issue22299] resolve() on Windows makes some pathological paths unusable

2014-08-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Does one of you want to provide a patch? (with tests)

--
stage: test needed -> needs patch
versions: +Python 3.5
