[issue33660] pathlib.Path.resolve() returns path with double slash when resolving a relative path in root directory

2018-05-27 Thread QbLearningPython

New submission from QbLearningPython <quib...@hotmail.com>:

I have recently found a weird behaviour while trying to resolve a relative path 
from the root directory on macOS.

I tried to resolve a Path('spam') and the interpreter answered 
PosixPath('//spam') (double slash for root) instead of the 
PosixPath('/spam') that I expected.

I think that this is a bug.

I ran the interpreter from the root directory (cd /; python). Once in the 
interpreter, this is what I did:

>>> import pathlib
>>> pathlib.Path.cwd()
PosixPath('/')
# since the interpreter has been launched from root
>>> p = pathlib.Path('spam')
>>> p
PosixPath('spam')
# just for checking
>>> p.resolve()
PosixPath('//spam')
# beware of double slash instead of single slash


I also checked the behaviour of Path.resolve() in a non-root directory (in my 
case launching the interpreter from /Applications).

>>> import pathlib
>>> pathlib.Path.cwd()
PosixPath('/Applications')
>>> p = pathlib.Path('eggs')
>>> p
PosixPath('eggs')
>>> p.resolve()
PosixPath('/Applications/eggs')
# just one slash as root in this case (as should be)

So it seems that double slashes only appear when resolving relative paths in 
the root directory.

More examples are:

>>> pathlib.Path('spam/egg').resolve()
PosixPath('//spam/egg')
>>> pathlib.Path('./spam').resolve()
PosixPath('//spam')
>>> pathlib.Path('./spam/egg').resolve()
PosixPath('//spam/egg')

but

>>> pathlib.Path('').resolve()
PosixPath('/')
>>> pathlib.Path('.').resolve()
PosixPath('/')

Intriguingly,

>>> pathlib.Path('spam').resolve().resolve()
PosixPath('/spam')
# 'spam'.resolve = '//spam'
# '//spam'.resolve = '/spam'!!!
>>> pathlib.Path('//spam').resolve()
PosixPath('/spam')

I have found the same behaviour in several Python versions:

Python 3.6.5 (default, May 15 2018, 08:20:57)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)] on darwin

Python 3.4.8 (default, Mar 29 2018, 16:18:25)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin

Python 3.5.5 (default, Mar 29 2018, 16:22:58)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin

Python 3.7.0b4 (default, May 4 2018, 22:01:49)
[Clang 9.1.0 (clang-902.0.39.1)] on darwin


All running on: macOS High Sierra 10.13.4 (17E202)


There is also confirmation of the same issue on Ubuntu 16.04 (Python 3.5.2) and 
openSUSE Tumbleweed (Python 3.6.5).


I have searched for information on this issue but I did not find anything 
useful.

The Python docs (https://docs.python.org/3/library/pathlib.html) talk about "UNC 
shares" but that is not the case here (I am using a macOS HFS+ filesystem).

PEP 428 (https://www.python.org/dev/peps/pep-0428/) says:


Multiple leading slashes are treated differently depending on the path 
flavour. They are always retained on Windows paths (because of the UNC 
notation):

>>> PureWindowsPath('//some/path')
PureWindowsPath('//some/path/')

On POSIX, they are collapsed except if there are exactly two leading 
slashes, which is a special case in the POSIX specification on pathname 
resolution [8] (this is also necessary for Cygwin compatibility):

>>> PurePosixPath('///some/path')
PurePosixPath('/some/path')
>>> PurePosixPath('//some/path')
PurePosixPath('//some/path')


I do not think that this is related to the aforementioned issue.
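
For reference, the PEP 428 rule quoted above can be checked with pure paths, which never touch the filesystem. This is only a small sketch of the constructor's behaviour, not of resolve() itself; PurePosixPath is used so that it behaves the same on any platform.

```python
from pathlib import PurePosixPath

# Three or more leading slashes collapse to a single one.
three = PurePosixPath("///some/path")
print(three)  # /some/path

# Exactly two leading slashes are kept as they are.
two = PurePosixPath("//some/path")
print(two)  # //some/path

# As a consequence, '//spam' and '/spam' are distinct path objects.
same = PurePosixPath("//spam") == PurePosixPath("/spam")
print(same)  # False
```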

However, I also checked the POSIX specification link 
(http://pubs.opengroup.org/onlinepubs/009...#tag_04_11) and found:

A pathname that begins with two successive slashes may be interpreted in an 
implementation-defined manner, although more than two leading slashes shall be 
treated as a single slash.


I do not really think that this can cause a double slash when resolving a 
relative path on macOS.


So, I think that this issue could be a real bug in the pathlib.Path.resolve() 
method, specifically in the POSIX flavour.

A user of Python Forum (killerrex) and I have traced the bug to 
Lib/pathlib.py:319 in the Python 3.6 repository 
https://github.com/python/cpython/blob/3...pathlib.py.

Specifically, in line 319:

newpath = path + sep + name

For pathlib.Path('spam').resolve() in the root directory, newpath is '//spam' 
since:

path is '/'
sep is '/'
name is 'spam'
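
A minimal sketch of that concatenation, outside pathlib, shows where the extra slash comes from. The names SEP and naive_join are hypothetical and only mimic the quoted line; this is not the real resolve() code.

```python
SEP = "/"


def naive_join(path, name):
    """Mimic the quoted line: newpath = path + sep + name."""
    return path + SEP + name


# The root directory already ends with the separator, so it gets doubled.
print(naive_join("/", "spam"))  # //spam

# Any other directory does not end with the separator, so the result is fine.
print(naive_join("/Applications", "eggs"))  # /Applications/eggs
```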

killerrex has suggested two solutions:

1) from line 345 

base = '' if path.is_absolute() else os.getcwd()
if base == sep:
    base = ''
return _resolve(base, str(path)) or sep

2) from line 319:

if path.endswith(sep):
    newpath = path + name
else:
    newpath = path + sep + name
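
Both suggestions can be tried on a simplified, string-only model of the join loop. This is a sketch under strong assumptions: it ignores symlinks and '..' components, and the helper names (_names, resolve_with_base_fix, resolve_with_join_fix) are hypothetical, not part of pathlib.

```python
SEP = "/"


def _names(relative):
    """Split a relative path into components, skipping '' and '.'."""
    return [name for name in relative.split(SEP) if name not in ("", ".")]


def resolve_with_base_fix(cwd, relative):
    """Suggestion 1: start from an empty base when the cwd is the root."""
    path = "" if cwd == SEP else cwd
    for name in _names(relative):
        path = path + SEP + name
    return path or SEP


def resolve_with_join_fix(cwd, relative):
    """Suggestion 2: add a separator only when the path lacks one."""
    path = cwd
    for name in _names(relative):
        if path.endswith(SEP):
            path = path + name
        else:
            path = path + SEP + name
    return path or SEP


for resolve in (resolve_with_base_fix, resolve_with_join_fix):
    # Each line prints: /spam /spam/egg / /Applications/eggs
    print(
        resolve("/", "spam"),
        resolve("/", "./spam/egg"),
        resolve("/", "."),
        resolve("/Applications", "eggs"),
    )
```

In this model both variants give the same results; they differ only in whether the special case is handled once, at the base, or at every join.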


Thank you.

--
components: Library (Lib)
messages: 317790
nosy: QbLearningPython
priority: normal
severity: normal
status: open
title: pathlib.Path.resolve() returns path with double slash when resolving a 
relative path

[issue32040] Sorting pathlib.Paths does not give the same order as sorting the (string) filenames of that pathlib.Paths

2017-11-16 Thread QbLearningPython

QbLearningPython <quib...@hotmail.com> added the comment:

Thanks, serhiy.storchaka, for your answer.

I am not fully convinced.

You have described the current behaviour of the pathlib package.

But let me ask: should this be the desired behaviour?

Since string filenames and pathlib.Paths are different ways to refer to the 
same object (a path in a filesystem), should they not behave in the same 
way when sorted?

You pointed out that the current behaviour is a "more natural order" for 
pathlib.Paths. I am not truly sure about that. Could you please provide a 
citation or additional information about that?

Thank you.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32040] Sorting pathlib.Paths does not give the same order as sorting the (string) filenames of that pathlib.Paths

2017-11-15 Thread QbLearningPython

New submission from QbLearningPython <quib...@hotmail.com>:

While testing a module, I have found a weird behaviour in the pathlib package. I 
have a list of pathlib.Paths and I sorted it with sorted(). I assumed that the 
order obtained by sorting a list of Paths would be the same as the order obtained 
by sorting the list of their corresponding (string) filenames. But that is not 
the case.

I ran the following example:


==


from pathlib import Path

# order string filenames

filenames_for_testing = (
    '/spam/spams.txt',
    '/spam/spam.txt',
    '/spam/another.txt',
    '/spam/binary.bin',
    '/spam/spams/spam.ttt',
    '/spam/spams/spam01.txt',
    '/spam/spams/spam02.txt',
    '/spam/spams/spam03.ppp',
    '/spam/spams/spam04.doc',
)

sorted_filenames = sorted(filenames_for_testing)

# output ordered list of string filenames

print()
print("Ordered list of string filenames:")
print()
for element in sorted_filenames:
    print(f'\t{element}')
print()

# order paths (built from the same string filenames)

paths_for_testing = [
    Path(filename)
    for filename in filenames_for_testing
]
sorted_paths = sorted(paths_for_testing)

# output ordered list of pathlib.Paths

print()
print("Ordered list of pathlib.Paths:")
print()
for element in sorted_paths:
    print(f'\t{element}')
print()

# compare

print()

if sorted_filenames == [str(path) for path in sorted_paths]:
    print('Ordered lists of string filenames and pathlib.Paths are EQUAL.')

else:
    print('Ordered lists of string filenames and pathlib.Paths are DIFFERENT.')

    # report only the first position where the two orders disagree
    for element in range(0, len(sorted_filenames)):

        if sorted_filenames[element] != str(sorted_paths[element]):

            print()
            print('First different element:')
            print(f'\tElement #{element}')
            print(f'\t{sorted_filenames[element]} != {sorted_paths[element]}')
            break

print()



==


The output of this script was:


==

Ordered list of string filenames:

/spam/another.txt
/spam/binary.bin
/spam/spam.txt
/spam/spams.txt
/spam/spams/spam.ttt
/spam/spams/spam01.txt
/spam/spams/spam02.txt
/spam/spams/spam03.ppp
/spam/spams/spam04.doc


Ordered list of pathlib.Paths:

/spam/another.txt
/spam/binary.bin
/spam/spam.txt
/spam/spams/spam.ttt
/spam/spams/spam01.txt
/spam/spams/spam02.txt
/spam/spams/spam03.ppp
/spam/spams/spam04.doc
/spam/spams.txt


Ordered lists of string filenames and pathlib.Paths are DIFFERENT.

First different element:
Element #3
/spam/spams.txt != /spam/spams/spam.ttt


==


As you can see, '/spam/spams.txt' ends up in a different position when sorting 
pathlib.Paths than when sorting string filenames.
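
One explanation consistent with the output above is that paths are compared component by component, whereas strings are compared character by character. The sketch below checks this on the two entries that swap places. It uses PurePosixPath so that it runs the same on any platform, and it compares the public .parts tuples as a stand-in for whatever pathlib does internally, which is an assumption.

```python
from pathlib import PurePosixPath

a = PurePosixPath("/spam/spams.txt")
b = PurePosixPath("/spam/spams/spam.ttt")

# As strings, the first difference is '.' against '/', and '.' sorts first.
print(str(a) < str(b))  # True

# As components, 'spams.txt' is compared with 'spams', and 'spams' sorts first.
print(a.parts < b.parts)  # False

# The path objects themselves agree with the component-wise comparison.
print(a < b)  # False
```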

I think that it is weird that sorting pathlib.Paths yields a different result 
than sorting their string filenames. I think that pathlib.Paths should be 
ordered by alphabetical order of their corresponding filenames.
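
Where the string order is what a program needs, it can be requested explicitly with a sort key, whatever the default ordering of paths is. A minimal sketch, using PurePosixPath so that it does not depend on the platform:

```python
from pathlib import PurePosixPath

names = [
    "/spam/spams.txt",
    "/spam/spam.txt",
    "/spam/spams/spam.ttt",
    "/spam/spams/spam01.txt",
]
paths = [PurePosixPath(name) for name in names]

# key=str makes sorted() compare the string form of each path.
by_string = sorted(paths, key=str)

for path in by_string:
    print(path)
```

The result still contains path objects; only the comparison goes through their string form.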

Thank you.

--
components: Extension Modules
messages: 306304
nosy: QbLearningPython
priority: normal
severity: normal
status: open
title: Sorting pathlib.Paths does not give the same order as sorting the (string) 
filenames of that pathlib.Paths
type: behavior
versions: Python 3.6
