bug#35531: problem with ls in coreutils
tag 35531 notabug
close 35531
stop

On 03/05/19 17:01, Peter Edwards wrote:
> Hi
>
> Although this bug report seems to be a problem with the windows port
> of ls, it reminded me of an interesting investigation into slow ls
> speeds due to colorizing via the LS_COLORS environment variable.
>
> See
> https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup
>
> I thought it an interesting case study.

Thanks for the info.
In summary, to speed up ls color induced processing significantly, disable stat() and getxattr() calls with:

  LS_COLORS='ex=00:su=00:sg=00:ca=00:'

A general point though is that colors are for human processing, and how fast can one process the output from ls :)
I.E. if ls is being written to pipe/file or somewhere where speed may be important, the coloring is disabled by default anyway.

cheers,
Pádraig
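As a minimal sketch of the setting above, assuming a GNU ls that accepts --color (the function name ls_fast_color is made up for illustration and is not part of coreutils):

```shell
# Hypothetical helper: list with colour forced on, but with the colour
# classes named in the message above (ex, su, sg, ca) reset to "00",
# so ls has less reason to make extra per-file calls.
# Assumes GNU ls; other ls implementations may not accept --color.
ls_fast_color() {
    LS_COLORS='ex=00:su=00:sg=00:ca=00:' ls --color=always "$@"
}
```

It would be called as `ls_fast_color dirname`. Whether any speedup is visible depends on the filesystem and on how the ls in question was built.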
bug#35531: problem with ls in coreutils
On 5/1/2019 3:03 PM, Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 5
> files, ls takes forever. Something seems to be highly inefficient in there.
>
---

It sounds like you are running ls with no options (nothing in the environment and no switches on the command line). Is this the case? If it is, I'm stumped, unless whoever compiled that had it set to do some things by default.

Basically, on Windows, anything that you might get away with on Linux with a stat call takes an 'open' call. That gets costly. This applies to anything that appends a classifier to the end of the file name (like ls -F, --classify or --file-type) or that would display any of the data or size information (ls -l would be right out!). The only thing 'ls' could display without such a penalty is the file name.

However, that only applies to stock ls, and since we don't know what options might have been enabled for that 'ls' (including any default usage of switches such as those mentioned above), it's hard to say exactly what the problem is.

A suggestion: try installing a minimal snapshot of 'Cygwin' ('cygwin.org') and try

  env -i /bin/ls

on cygwin's command line in that directory and see how fast that is. If it is slow, then something excessively weird is going on that is the wonder of a closed source Windows. However, my hunch would have it be 'fast', but since I don't know the cause, I can't say if that would help or not.

One further possibility that I'd think unlikely: the directory could be very fragmented and take a long time to read (5 minutes?! really unlikely, it almost has to be the missing stat call)... though the figures you are stating sound out of bounds for a fragmented directory. Still, if you grab the 'contig' tool from the sysinternals site (a windows subsite), it can show you the number of fragments a file is split into, and it can be used on directories:

  /prog/Sysinternals/cmd> contig -a -v .

  Contig v1.6 - Makes files contiguous
  Copyright (C) 1998-2010 Mark Russinovich
  Sysinternals - www.sysinternals.com

  Processing C:\prog\Sysinternals\cmd:
  Scanning file...
  Cluster: Length
  0: 3
  File size: 12288 bytes
  C:\prog\Sysinternals\cmd is in 1 fragment

  Summary:
       Number of files processed   : 1
       Average fragmentation       : 1 frags/file

Other than those options, I'm not sure what else to suggest to narrow it down, but thought I'd at least mention a few possibilities.

Good luck!
bug#35531: problem with ls in coreutils
Hi Although this bug report seems to be a problem with the windows port of ls, it reminded me of an interesting investigation into slow ls speeds due to colorizing via the LS_COLORS environment variable. See https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup I thought it an interesting case study. Regards - PSDE
bug#35531: problem with ls in coreutils
On Friday, May 3, 2019 5:56:35 PM CEST Viktors Berstis wrote:
> I don't think the problem has anything to do with sorting or -U1.

It was unclear what you meant by "the problem" so I pointed out the only inefficiency that was immediately obvious to me.

> When ls is taking over 5 minutes for something that should run in a
> couple of seconds, the task manager shows that it is using nearly no
> CPU it is doing a lot of "other I/O".

You can try to use some profiling/tracing tools to debug the root cause.

> It doesn't look like the build you referenced is designed to be
> compileable for Windows. Is there one that is? Thanks.

I would suggest to build the latest upstream release (coreutils-8.31 now) from:

https://www.gnu.org/software/coreutils/

Kamil
bug#35531: problem with ls in coreutils
On 5/2/19 8:43 PM, Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
>
> The help said "Report bugs to " which is what I
> did.

Whoever built it just copied that line from upstream. If the build has MS-Windows-specific problems, you'll need to find an MS-Windows person somewhere who can fix it, or find a better build somewhere. This bug-reporting system is not the best place to do that; see:

https://www.gnu.org/prep/standards/html_node/System-Portability.html

and look for "Windows".
bug#35531: problem with ls in coreutils
I don't think the problem has anything to do with sorting or -U1.

When ls is taking over 5 minutes for something that should run in a couple of seconds, the task manager shows that it is using nearly no CPU; it is doing a lot of "other I/O".

It doesn't look like the build you referenced is designed to be compileable for Windows. Is there one that is?

Thanks.

[1]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49

- Viktors Berstis

Kamil Dudka wrote:
> On Friday, May 3, 2019 5:43:20 AM CEST Viktors Berstis wrote:
>> I downloaded it from
>> [2]https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
>>
>> The help said "Report bugs to [3]" which is what I did. The build is
>> so old that I suspect none of the original players are around. Do you
>> know of a windows binary or windows source that is newer anywhere?
>> Thanks.
>>
>> - Viktors Berstis
>
> `ls -U1` will not run significantly faster than `ls` on powerful
> hardware. The key difference is that `ls -U1` prints the results
> continuously as the list of files is read from file system whereas
> `ls` will be silent until the complete list is read.
>
> You need to use a new enough version of coreutils for this to work
> properly. This optimisation was introduced in coreutils-7.5:
>
> [4]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
> [5]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49
>
> Kamil
>
>> Paul Eggert wrote:
>>> On 5/2/19 5:41 PM, Viktors Berstis wrote:
>>>> The newer version of "ls" built for Windows has the problem.
>>>
>>> Ah, then you'll have to talk to whoever built that version, which is
>>> not me (and generally speaking they don't hang out on this mailing
>>> list).

References

1. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5%7E49
2. https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5
3. mailto:bug-coreutils@gnu.org
4. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
5. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49
bug#35531: problem with ls in coreutils
On Friday, May 3, 2019 5:43:20 AM CEST Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
> The help said "Report bugs to "
> which is what I did. The build is so old that I suspect none of the
> original players are around. Do you know of a windows binary or windows
> source that is newer anywhere? Thanks.
>
> - Viktors Berstis

`ls -U1` will not run significantly faster than `ls` on powerful hardware. The key difference is that `ls -U1` prints the results continuously as the list of files is read from file system whereas `ls` will be silent until the complete list is read.

You need to use a new enough version of coreutils for this to work properly. This optimisation was introduced in coreutils-7.5:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49

Kamil

> Paul Eggert wrote:
> > On 5/2/19 5:41 PM, Viktors Berstis wrote:
> >> The newer version of "ls" built for Windows has the problem.
> >
> > Ah, then you'll have to talk to whoever built that version, which is not
> > me (and generally speaking they don't hang out on this mailing list).
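One rough way to see the streaming behaviour described in this message is to ask only for the first entry. This is a sketch that assumes a GNU ls; the function name first_entry is hypothetical:

```shell
# Hypothetical helper: print only the first entry that ls reports.
# -U asks for directory order (no sorting) and -1 for one name per line,
# so a streaming ls can emit output before the whole directory is read.
# Assumes GNU ls; -U has a different meaning in some other implementations.
first_entry() {
    ls -U -1 "$1" | head -n 1
}
```

If `first_entry dirname` returns much sooner than `ls -U -1 dirname | wc -l`, the listing is probably being streamed. If both take about as long, the build in use likely reads the whole directory before printing anything.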
bug#35531: problem with ls in coreutils
I downloaded it from https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download

The help said "Report bugs to " which is what I did. The build is so old that I suspect none of the original players are around. Do you know of a windows binary or windows source that is newer anywhere?

Thanks.

- Viktors Berstis

Paul Eggert wrote:
> On 5/2/19 5:41 PM, Viktors Berstis wrote:
>> The newer version of "ls" built for Windows has the problem.
>
> Ah, then you'll have to talk to whoever built that version, which is
> not me (and generally speaking they don't hang out on this mailing
> list).
bug#35531: problem with ls in coreutils
On 5/2/19 5:41 PM, Viktors Berstis wrote:
> The newer version of "ls" built for Windows has the problem.

Ah, then you'll have to talk to whoever built that version, which is not me (and generally speaking they don't hang out on this mailing list).
bug#35531: problem with ls in coreutils
I am running coreutils on Windows, not Linux... so strace does not work there.

The November 10, 1999 version 3.16 of coreutils "ls" command is lightning fast on Windows (and on the large directory) but unfortunately stops at 32K files. The newer version of "ls" built for Windows has the problem. By "new" version, I am using the 64 bit build for windows dated 4/20/2005 at 11:41AM with exe size of 180736 bytes, md5sum: 47ba770d80382cbd66ddba13924c1417, Version 5.3.0. I didn't see a place to download a newer binary version to try.

BTW, booting the machine with Ubuntu, ls on that same large directory is very fast.

- Viktors Berstis

Paul Eggert wrote:
> It's probably something inside the kernel (e.g., filesystem code).
> What does the shell command 'strace -o /tmp/tr -s 128 -T ls -U -1
> dirname | wc' say? You can see which system calls are taking the most
> time by then running 'sort -t"<" -k2n /tmp/tr'.
>
> On my platform (Fedora 29 x86-64 ext4, an older desktop with only disk
> drives), the hoggiest syscalls are getdents64, which are as much as
> 24 ms per call when the data are not cached, and more like 0.7 ms per
> call when the data are cached (each such call retrieves about 1000
> directory entries). What do you see?
bug#35531: problem with ls in coreutils
It's probably something inside the kernel (e.g., filesystem code). What does the shell command 'strace -o /tmp/tr -s 128 -T ls -U -1 dirname | wc' say? You can see which system calls are taking the most time by then running 'sort -t"<" -k2n /tmp/tr'. On my platform (Fedora 29 x86-64 ext4, an older desktop with only disk drives), the hoggiest syscalls are getdents64, which are as much as 24 ms per call when the data are not cached, and more like 0.7 ms per call when the data are cached (each such call retrieves about 1000 directory entries). What do you see?
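The sort step above can be wrapped in a small function. This is only a sketch: the name slowest_syscalls is made up, and it assumes the trace was written by `strace -T`, so that each line ends in an elapsed time such as `<0.024113>`:

```shell
# Hypothetical helper: show the slowest system calls in an strace -T log.
# $1 is the trace file; $2 is how many lines to show (default 5).
# The elapsed time follows the "<" on each line, so "<" is used as the
# field separator and field 2 is sorted numerically, largest first.
# Lines that contain another "<" earlier on will not sort correctly.
slowest_syscalls() {
    LC_ALL=C sort -t'<' -k2,2nr "$1" | head -n "${2:-5}"
}
```

After running the strace command quoted above, `slowest_syscalls /tmp/tr 10` would list the ten most expensive calls. LC_ALL=C is set so that the decimal point is parsed the same way regardless of locale.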
bug#35531: problem with ls in coreutils
My machine has 64 GB of RAM, a 6-core 3.5 GHz processor and fast disks. The directory in question has 57,600 files in it with a total size of about 47 GB.

On a freshly booted machine (nothing cached), "dir /on dirname | wc" takes about 6 seconds. The second time it takes about 2 seconds.

On a freshly booted machine, "ls -U -1 dirname | wc" takes 5 minutes 48 seconds! A second time it is about a minute less.

ls might be doing something akin to opening every file. If I run a program to actually open and read every file in that directory, the system seems to cache it all in RAM. Then the ls takes only about 11 seconds.

- Viktors Berstis

Kamil Dudka wrote:
> On Thursday, May 2, 2019 12:03:31 AM CEST Viktors Berstis wrote:
>> When running "ls" or "ls -U" on a windows directory containing 5
>> files, ls takes forever. Something seems to be highly inefficient in
>> there.
>
> Could you please try it with ls -U -1?
>
> Kamil
>
>> This is for the 64 bit version build 4/20/2005 11:41AM. The exe size
>> is 180736 bytes.
>>
>> Thanks.
>>
>> - Viktors Berstis
bug#35531: problem with ls in coreutils
On Thursday, May 2, 2019 12:03:31 AM CEST Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 5
> files, ls takes forever. Something seems to be highly inefficient in there.

Could you please try it with ls -U -1?

Kamil

> This is for the 64 bit version build 4/20/2005 11:41AM. The exe size is
> 180736 bytes.
>
> Thanks.
>
> - Viktors Berstis
bug#35531: problem with ls in coreutils
When running "ls" or "ls -U" on a windows directory containing 5 files, ls takes forever. Something seems to be highly inefficient in there. This is for the 64 bit version build 4/20/2005 11:41AM. The exe size is 180736 bytes. Thanks. - Viktors Berstis