[issue33275] glob.glob should explicitly note that results aren't sorted

2018-11-04 Thread Julien Palard


Change by Julien Palard :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33275] glob.glob should explicitly note that results aren't sorted

2018-05-02 Thread Eryk Sun

Eryk Sun  added the comment:

FAT inserts a new file entry in a directory at the first available position. 
(If it's a long filename, this could be up to 21 contiguous dirents for a 
combined long/short dirent set.) This means a directory listing is usually in 
the same order that files were added. One caveat is that dirents for deleted 
files may be reused once there are no more unused entries available in a 
cluster. (I'd expect this depends on the implementation. Also, this is less 
likely with a long filename, since it needs a large-enough contiguous block of 
dirents.) Given a volume with a 4 KiB cluster size, sans overhead there are 127 
32-byte dirents in a cluster.
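
The 21-dirent figure above can be reproduced with a little arithmetic. This is only a sketch: the two constants (13 UTF-16 code units per long-name entry, 255-character maximum name length) come from the FAT long-filename layout rather than from this thread, and the helper name is made up for illustration.

```python
import math

# Assumptions taken from the FAT long-filename layout, not from this thread:
LFN_CHARS_PER_DIRENT = 13  # UTF-16 code units stored in one long-name entry
MAX_LFN_LENGTH = 255       # longest name a long-filename set can hold


def dirents_for_long_name(name_length):
    """Long-name entries needed for the name, plus the one short (8.3) entry."""
    return math.ceil(name_length / LFN_CHARS_PER_DIRENT) + 1


worst_case = dirents_for_long_name(MAX_LFN_LENGTH)
print(worst_case)
```

A new long name needs that many contiguous free entries, which is why a freed slot is less likely to be reused for it.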

I used to have an MP3 player that used FAT32 and only played files in directory 
order, so I had to resort directories on disk after adding files. In Ubuntu 
Linux, I see there's a "fatsort" package that implements this. There's probably 
a build available for macOS.

--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-05-02 Thread Ben FrantzDale

Ben FrantzDale  added the comment:

I looked into it a bit more, with Python 2.7 on macOS High Sierra, APFS 
(Encrypted), and a FAT32 thumb drive. I have a directory that 
glob.glob('/Volumes/thumb/tmp/*') shows as sorted. I cp -r that to /tmp with 
bash, and glob.glob('/tmp/tmp/*') is now not sorted. I then cp -r /tmp/tmp 
/Volumes/thumb/tmp1. Then glob.glob('/Volumes/thumb/tmp/*') shows a different 
order, but if I cp -r /Volumes/thumb/tmp/ /Volumes/thumb/tmp2 then 
glob.glob('/Volumes/thumb/tmp2/*') is sorted by file name just like 
glob.glob('/Volumes/thumb/tmp/*'). I'm not sure what that's saying other than 
that glob.glob can return things out of order on FAT32. It appears that 
glob.glob's ordering agrees with that of ls -f ("unsorted").

--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-25 Thread Eryk Sun

Eryk Sun  added the comment:

As I said, some file systems such as NTFS and ISO 9660 (or Joliet) store 
directories in lexicographically sorted order. NTFS does this using a b-tree 
and case-insensitive comparison, which helps the driver efficiently implement 
filtering a directory listing using a pattern such as "spam*eggs?.txt". 
(Filtering of a directory listing at the syscall level is peculiar to Windows 
and not supported by Python.)
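
Since Python does not pass the pattern down to the file system, the filtering happens in user space. A minimal sketch of that idea using the standard fnmatch module on a list of names; the file names here are invented for illustration, and in real code the list would come from something like os.listdir().

```python
import fnmatch

# Hypothetical directory listing, in whatever order the file system returned it.
names = ["spam_and_eggs1.txt", "spam-eggs.txt", "spameggs22.txt", "ham_eggs1.txt"]

# Filter in Python, then sort explicitly so the result does not depend on
# the order in which the names arrived.
matches = sorted(fnmatch.filter(names, "spam*eggs?.txt"))
print(matches)
```

Only names with exactly one character between "eggs" and ".txt" match, because "?" stands for a single character.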

I like the phrase "arbitrary order". I don't think it's wise for an application 
to ever depend on the order. Also, we usually want natural-language collation 
for display purposes (e.g. spam2.txt should come before spam10.txt), so we have 
to sort the result regardless of the file system.
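
A rough sketch of such a "natural" ordering, assuming it is enough to compare runs of digits as integers. This is a simple approximation, not real locale-aware collation, and natural_key is a made-up helper name.

```python
import re


def natural_key(name):
    """Split a name into text and digit runs so digit runs compare numerically."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, re.split alternates text (even index) and
    # digit runs (odd index), so same positions always hold the same type.
    return [int(part) if i % 2 else part.casefold()
            for i, part in enumerate(parts)]


names = ["spam10.txt", "spam2.txt", "spam1.txt"]
plain = sorted(names)                      # code-point order
natural = sorted(names, key=natural_key)   # digit runs compared as numbers
print(plain)
print(natural)
```

With a plain sort, spam10.txt lands before spam2.txt; with the key function it comes after.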

--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-24 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

I agree that any function whose result ordering is likewise determined by the 
file system should get the same note, for the same reason.  Ben, can you test?  
Eryk, can you enlighten us further?

PS: Ben, when responding by email, please delete the quote, as it is duplicate 
noise on the web page.

--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-24 Thread Ben FrantzDale

Ben FrantzDale  added the comment:

Great point. Looks like the phrase is "in arbitrary order" in the docs for
those (both 2.7 and 3), which is better than saying nothing. I'd still
prefer a bit more specificity about the potential gotcha since "arbitrary"
seems a lot less deterministic than "some file systems will give you sorted
order, some won't".


--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-24 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

Are there such notes in the descriptions of os.listdir(), os.scandir(), 
os.walk(), os.fwalk() and the corresponding Path methods? If we explicitly 
document the sorting, this should be done for all file-enumerating functions.
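
For reference, a sketch of how a caller can impose an order on those functions today, whatever the docs end up saying. It uses only the standard library; sorted_walk is a hypothetical helper, and the temporary tree exists just to make the example self-contained.

```python
import os
import tempfile


def sorted_walk(top):
    """Yield (dirpath, dirnames, filenames) like os.walk, in sorted order."""
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()   # in place, so the top-down walk descends in this order
        filenames.sort()
        yield dirpath, dirnames, filenames


with tempfile.TemporaryDirectory() as tmp:
    for sub in ("b", "a"):
        os.mkdir(os.path.join(tmp, sub))
        for name in ("2.txt", "1.txt"):
            open(os.path.join(tmp, sub, name), "w").close()

    listed = sorted(os.listdir(tmp))
    with os.scandir(tmp) as entries:
        scanned = sorted(entry.name for entry in entries)
    walked = [(os.path.relpath(dirpath, tmp), files)
              for dirpath, _, files in sorted_walk(tmp)]

print(listed, scanned, walked)
```

Sorting dirnames in place matters for os.walk: with the default topdown=True, the walk visits subdirectories in the order left in that list.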

--
nosy: +serhiy.storchaka




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-24 Thread Elena Oat

Change by Elena Oat :


--
keywords: +patch
pull_requests: +6287
stage: needs patch -> patch review




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-20 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

How about adding a sentence to the end of the first paragraph?

 glob.glob(pathname, *, recursive=False)

Return a possibly-empty list of path names that match pathname, which must 
be a string containing a path specification. pathname can be either absolute 
(like /usr/src/Python-1.5/Makefile) or relative (like ../../Tools/*/*.gif), and 
can contain shell-style wildcards. Broken symlinks are included in the results 
(as in the shell).  Whether or not the results are sorted depends on the file 
system.

--
nosy: +terry.reedy




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-13 Thread Ben FrantzDale

Ben FrantzDale  added the comment:

Fascinating. That seems like an even wilder gotcha: It sounds like a script 
assuming sorted results would work in one directory (on one filesystem) but not 
on another. Or even weirder, if I had a mounted scratch partition, the script 
could work until I (or a sys admin) mounts a larger drive with a different 
filesystem on the same mountpoint. Yikes! Either way, this gotcha seems worth 
mentioning explicitly.

--




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-13 Thread Eryk Sun

Eryk Sun  added the comment:

> The sortedness of glob.glob's output is platform-dependent.

It's typically file-system dependent (e.g. NTFS, FAT, ISO9660, UDF) -- at least 
on Windows. NTFS and ISO9660 store directories in sorted order based on the 
filename (Unicode or ASCII ordinal sort).

--
nosy: +eryksun




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-13 Thread Raymond Hettinger

Raymond Hettinger  added the comment:

This seems reasonable.  I would like it to be part of the regular text 
rather than appearing as a big .. note:: entry, which can be visually 
distracting from the core functionality.

--
assignee:  -> docs@python
components: +Documentation -Library (Lib)
keywords: +easy
nosy: +docs@python, rhettinger
stage:  -> needs patch
versions:  -Python 3.4, Python 3.5




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-13 Thread Raymond Hettinger

Change by Raymond Hettinger :


--
nosy: +csabella




[issue33275] glob.glob should explicitly note that results aren't sorted

2018-04-13 Thread Ben FrantzDale

New submission from Ben FrantzDale :

The sortedness of glob.glob's output is platform-dependent. While the docs do 
not mention sorting, and so are strictly correct, if you are on a platform 
where its output is sorted, it's easy to believe that the output is always 
sorted.

I propose we add a Note, maybe next to "Note: Using the “**” pattern in large 
directory trees may consume an inordinate amount of time.", that says "Note: 
While the output of glob.glob may be sorted on some architectures, ordering is 
not guaranteed. Use `sorted(glob.glob(...))` if ordering is important."

This wrong assumption burned us when scripts inexplicably stopped working on 
macOS High Sierra.
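
A minimal sketch of the workaround, assuming plain code-point ordering is what the script needs. The built-in to use is sorted() (or list.sort() in place); sorted_glob and the file names below are made up for illustration.

```python
import glob
import os
import tempfile


def sorted_glob(pattern):
    """Return glob matches in a deterministic (code-point) order."""
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Create the files out of order on purpose.
    for name in ("b.txt", "a.txt", "c.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(path)
             for path in sorted_glob(os.path.join(tmp, "*.txt"))]

print(found)
```

Wrapping the call this way gives the same result on every file system, at the cost of one sort over the matches.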

--
components: Library (Lib)
messages: 315254
nosy: Ben FrantzDale
priority: normal
severity: normal
status: open
title: glob.glob should explicitly note that results aren't sorted
type: enhancement
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
