bug#35531: problem with ls in coreutils

2019-05-10 Thread Pádraig Brady
tag 35531 notabug
close 35531
stop

On 03/05/19 17:01, Peter Edwards wrote:
> Hi
> 
> Although this bug report seems to be a problem with the windows port
> of ls, it reminded me of an interesting investigation into slow ls
> speeds due to colorizing via the LS_COLORS environment variable.
> 
> See 
> https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup
> 
> I thought it an interesting case study.

Thanks for the info.

In summary, to significantly speed up the processing that ls does for
coloring, disable the stat() and getxattr() calls with:
  LS_COLORS='ex=00:su=00:sg=00:ca=00:'
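As a hedged sketch, this is one way the setting might be applied in a shell startup file without discarding existing color settings. The variable name fast_overrides is made up for illustration, and appending the overrides last assumes that a later LS_COLORS entry for the same key replaces an earlier one.

```shell
# Overrides from the message above: 00 for ex, su, sg, and ca.
fast_overrides='ex=00:su=00:sg=00:ca=00:'

# Keep any existing entries (minus a trailing colon) and append the overrides.
# Assumption: a later entry for the same key replaces an earlier one.
LS_COLORS="${LS_COLORS:+${LS_COLORS%:}:}${fast_overrides}"
export LS_COLORS

# Show the resulting value.
printf '%s\n' "$LS_COLORS"
```

Any ls started from the same shell afterwards sees the modified value.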

A general point, though: colors are for human consumption,
and how fast can a human process the output from ls anyway? :)
I.e., if the output of ls is being written to a pipe, a file, or anywhere
else where speed may matter, coloring is disabled by default anyway.

cheers,
Pádraig





bug#35531: problem with ls in coreutils

2019-05-03 Thread L A Walsh
On 5/1/2019 3:03 PM, Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 5 
> files, ls takes forever.  Something seems to be highly inefficient in there.
>   
---
It sounds like you are running ls with no options
(nothing in the environment and no switches on the command line).

Is this the case?  If it is, I'm stumped, unless whoever
compiled that build set it to do some things by default.

Basically, on Windows, anything that you might get away with on
Linux with a stat call takes an 'open' call instead.  That gets
costly.  This affects anything that appends a classifier to the end of the
file name (like ls -F, --classify or --file-type) or that would display any
of the data or size information (ls -l would be right out!).  The
only thing 'ls' could display without such a penalty is the file
name.  However, that only applies to stock ls, and since we don't know
what options might have been enabled for that 'ls' (including any
default usage of switches such as those mentioned above), it's
hard to say exactly what the problem is.

A suggestion -- try installing a minimal snapshot of 'Cygwin'
('cygwin.org'), then run env -i /bin/ls on Cygwin's command line
in that directory and see how fast that is.  If it is slow, then
something excessively weird is going on, which is the wonder of a
closed-source Windows.  My hunch is that it will be fast, but since
I don't know the cause, I can't say whether that would help or not.
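The test suggested above could be wrapped up roughly as follows. This is only a sketch: the function name names_only is invented here, and the -U -1 switches are an addition (borrowed from elsewhere in this thread) so that nothing beyond the names is requested.

```shell
# names_only DIR: list DIR with an empty environment (env -i), so that no
# LS_COLORS or other settings are inherited, and count the non-hidden entries.
# -U skips sorting and -1 prints one name per line.
names_only() {
    env -i /bin/ls -U -1 -- "$1" | wc -l
}

# Example: count the entries of the current directory.
names_only .
```

In bash, prefixing the call with the time keyword (time names_only somedir) reports how long the listing took.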

One further possibility that I think unlikely: the directory could
be very fragmented and take a long time to read (5 minutes?! Really
unlikely; it almost has to be the missing stat call), though the figures
you are stating sound out of bounds for a fragmented directory.
Still, if you grab the 'contig' tool from the Sysinternals site (a
Windows subsite), it can show you the number of fragments a file
is split into -- and it can be used on directories:
/prog/Sysinternals/cmd> contig -a -v .

Contig v1.6 - Makes files contiguous
Copyright (C) 1998-2010 Mark Russinovich
Sysinternals - www.sysinternals.com

Processing C:\prog\Sysinternals\cmd:
Scanning file...
Cluster: Length
0: 3
File size: 12288 bytes
C:\prog\Sysinternals\cmd is in 1 fragment

Summary:
 Number of files processed   : 1
 Average fragmentation   : 1 frags/file



Other than those options, I'm not sure what else to suggest to narrow
it down, but I thought I'd at least mention a few possibilities.

Good luck!






bug#35531: problem with ls in coreutils

2019-05-03 Thread Peter Edwards

Hi

Although this bug report seems to be a problem with the windows port
of ls, it reminded me of an interesting investigation into slow ls
speeds due to colorizing via the LS_COLORS environment variable.

See 
https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup


I thought it an interesting case study.

Regards - PSDE





bug#35531: problem with ls in coreutils

2019-05-03 Thread Kamil Dudka
On Friday, May 3, 2019 5:56:35 PM CEST Viktors Berstis wrote:
> I don't think the problem has anything to do with sorting or -U1.

It was unclear what you meant by "the problem" so I pointed out the only 
inefficiency that was immediately obvious to me.

> When ls is taking over 5 minutes for something that should run in a
> couple of seconds, the task manager shows that it is using nearly no
> CPU it is doing a lot of  "other I/O".

You can try to use some profiling/tracing tools to debug the root cause.

> It doesn't look like the build you referenced is designed to be
> compileable for Windows.  Is there one that is?  Thanks.

I would suggest building the latest upstream release (coreutils-8.31 now) 
from:

https://www.gnu.org/software/coreutils/

Kamil







bug#35531: problem with ls in coreutils

2019-05-03 Thread Paul Eggert
On 5/2/19 8:43 PM, Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
>
> The help said "Report bugs to bug-coreutils@gnu.org" which is what I
> did. 

Whoever built it just copied that line from upstream. If the build has
MS-Windows-specific problems, you'll need to find an MS-Windows person
somewhere who can fix it, or find a better build somewhere. This
bug-reporting system is not the best place to do that; see:

https://www.gnu.org/prep/standards/html_node/System-Portability.html

and look for "Windows".






bug#35531: problem with ls in coreutils

2019-05-03 Thread Viktors Berstis
   I don't think the problem has anything to do with sorting or -U1.  When
   ls is taking over 5 minutes for something that should run in a couple
   of seconds, the task manager shows that it is using nearly no CPU,
   but it is doing a lot of "other I/O".
   It doesn't look like the build you referenced is designed to be
   compilable for Windows.  Is there one that is?  Thanks.
[1]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49


- Viktors Berstis

Kamil Dudka wrote:

> On Friday, May 3, 2019 5:43:20 AM CEST Viktors Berstis wrote:
> > I downloaded it from
> > [2]https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
> > The help said "Report bugs to [3]"
> > which is what I did. The build is so old that I suspect none of the
> > original players are around. Do you know of a windows binary or windows
> > source that is newer
> > anywhere?  Thanks.
> >
> > - Viktors Berstis
>
> `ls -U1` will not run significantly faster than `ls` on powerful hardware.
> The key difference is that `ls -U1` prints the results continuously as the
> list of files is read from file system whereas `ls` will be silent until
> the complete list is read.  You need to use a new enough version of coreutils
> for this to work properly.  This optimisation was introduced in coreutils-7.5:
>
> [4]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
> [5]https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49
>
> Kamil
>
> > Paul Eggert wrote:
> > > On 5/2/19 5:41 PM, Viktors Berstis wrote:
> > >> The newer version of "ls" built for Windows has the problem.
> > >
> > > Ah, then you'll have to talk to whoever built that version, which is not
> > > me (and generally speaking they don't hang out on this mailing list).

References

   1. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5%7E49
   2. https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5
   3. mailto:bug-coreutils@gnu.org
   4. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
   5. https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49


bug#35531: problem with ls in coreutils

2019-05-03 Thread Kamil Dudka
On Friday, May 3, 2019 5:43:20 AM CEST Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
> The help said "Report bugs to bug-coreutils@gnu.org"
> which is what I did. The build is so old that I suspect none of the
> original players are around. Do you know of a windows binary or windows
> source that is newer
> anywhere?  Thanks.
> 
> - Viktors Berstis

`ls -U1` will not run significantly faster than `ls` on powerful hardware.  
The key difference is that `ls -U1` prints the results continuously as the 
list of files is read from file system whereas `ls` will be silent until
the complete list is read.  You need to use a new enough version of coreutils 
for this to work properly.  This optimisation was introduced in coreutils-7.5:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49
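One rough way to observe the streaming behaviour described above is to stop reading after a few names. This is an illustrative sketch only: the function name first_entries is invented here, and on a small directory ls finishes before any difference is visible.

```shell
# first_entries DIR N: print the first N names that ls reports for DIR.
# With -U -1, ls writes names unsorted, one per line; head exits after N
# lines, so a streaming ls does not have to finish reading the directory.
first_entries() {
    ls -U -1 -- "$1" | head -n "$2"
}

# Example: show up to five names from the current directory.
first_entries . 5
```

With a version of ls that buffers the whole listing, the same pipeline stays silent until the directory has been read completely.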

Kamil

> Paul Eggert wrote:
> > On 5/2/19 5:41 PM, Viktors Berstis wrote:
> >> The newer version of "ls" built for Windows has the problem.
> > 
> > Ah, then you'll have to talk to whoever built that version, which is not
> > me (and generally speaking they don't hang out on this mailing list).







bug#35531: problem with ls in coreutils

2019-05-02 Thread Viktors Berstis
I downloaded it from 
https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download

The help said "Report bugs to bug-coreutils@gnu.org" which is what I did.
The build is so old that I suspect none of the original players are around.
Do you know of a windows binary or windows source that is newer 
anywhere?  Thanks.


- Viktors Berstis

Paul Eggert wrote:
> On 5/2/19 5:41 PM, Viktors Berstis wrote:
> > The newer version of "ls" built for Windows has the problem.
>
> Ah, then you'll have to talk to whoever built that version, which is not
> me (and generally speaking they don't hang out on this mailing list).









bug#35531: problem with ls in coreutils

2019-05-02 Thread Paul Eggert
On 5/2/19 5:41 PM, Viktors Berstis wrote:
> The newer version of "ls" built for Windows has the problem.

Ah, then you'll have to talk to whoever built that version, which is not
me (and generally speaking they don't hang out on this mailing list).






bug#35531: problem with ls in coreutils

2019-05-02 Thread Viktors Berstis
I am running coreutils on Windows, not Linux, so strace does not work 
there.


The November 10, 1999 version 3.16 of coreutils "ls" command is 
lightning fast on Windows (and on the large directory) but unfortunately 
stops at 32K files.  The newer version of "ls" built for Windows has the 
problem.
By "newer" version, I mean the 64-bit build for Windows dated 
4/20/2005 at 11:41AM, with an exe size of 180736 bytes, md5sum 
47ba770d80382cbd66ddba13924c1417, version 5.3.0.  I didn't see a place 
to download a newer binary version to try.


BTW, booting the machine with Ubuntu, ls on that same large directory is 
very fast.


- Viktors Berstis

Paul Eggert wrote:
> It's probably something inside the kernel (e.g., filesystem code).
>
> What does the shell command 'strace -o /tmp/tr -s 128 -T ls -U -1
> dirname | wc' say? You can see which system calls are taking the most
> time by then running 'sort -t"<" -k2n /tmp/tr'. On my platform (Fedora
> 29 x86-64 ext4, an older desktop with only disk drives), the hoggiest
> syscalls are getdents64, which are as much as 24 ms per call when the
> data are not cached, and more like 0.7 ms per call when the data are
> cached (each such call retrieves about 1000 directory entries). What do
> you see?









bug#35531: problem with ls in coreutils

2019-05-02 Thread Paul Eggert
It's probably something inside the kernel (e.g., filesystem code).

What does the shell command 'strace -o /tmp/tr -s 128 -T ls -U -1
dirname | wc' say? You can see which system calls are taking the most
time by then running 'sort -t"<" -k2n /tmp/tr'. On my platform (Fedora
29 x86-64 ext4, an older desktop with only disk drives), the hoggiest
syscalls are getdents64, which are as much as 24 ms per call when the
data are not cached, and more like 0.7 ms per call when the data are
cached (each such call retrieves about 1000 directory entries). What do
you see?
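To show what that sort does, here is a small sketch run on a made-up sample in the general shape of strace -T output. The four sample lines are illustrative, not a real trace (their durations merely echo the figures above), the exact argument formatting varies between strace versions, and LC_ALL=C is an addition so the decimal point is parsed the same way everywhere.

```shell
# Write an illustrative sample (NOT real measurements) to a temporary file.
trace=$(mktemp)
cat > "$trace" <<'EOF'
openat(AT_FDCWD, "dirname", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3 <0.000021>
getdents64(3, /* 1000 entries */, 32768) = 32752 <0.024100>
getdents64(3, /* 1000 entries */, 32768) = 32752 <0.000700>
write(1, "a\nb\n", 4) = 4 <0.000015>
EOF

# Same sort as suggested above: split each line on "<" and sort numerically
# on the second field, which starts with the duration that -T appends.
# The slowest call ends up on the last line.
slowest=$(LC_ALL=C sort -t'<' -k2n "$trace" | tail -n 1)
printf 'slowest call: %s\n' "$slowest"

rm -f "$trace"
```

On a real trace, lines whose arguments themselves contain a "<" character would be split in the wrong place, so the result is best treated as a quick first look.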






bug#35531: problem with ls in coreutils

2019-05-02 Thread Viktors Berstis

My machine has 64 GB of RAM, a 6-core 3.5 GHz processor, and fast disks.
The directory in question has 57,600 files in it with a total size of 
about 47 GB.
On a freshly booted machine (nothing cached),  "dir /on dirname | wc" 
takes about 6 seconds.  The second time it takes about 2 seconds.
On a freshly booted machine, "ls -U -1 dirname | wc" takes 5 minutes 48 
seconds!  A second time it is about a minute less.
ls might be doing something akin to opening every file.  If I run a 
program to actually open and read every file in that directory, the 
system seems to cache it all in ram.  Then the ls takes only about 11 
seconds.
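For a rough sense of scale, the cold-cache figures above can be turned into per-entry costs. This is only back-of-the-envelope arithmetic on the numbers already given (57,600 files, 5 minutes 48 seconds for ls, about 6 seconds for dir); the function name per_entry_report is invented for illustration.

```shell
# per_entry_report: print the per-entry cost implied by the timings above.
per_entry_report() {
    files=57600    # entries in the directory
    ls_secs=348    # 5 min 48 s for "ls -U -1 dirname | wc", cold
    dir_secs=6     # about 6 s for "dir /on dirname | wc", cold

    LC_ALL=C awk -v n="$files" -v a="$ls_secs" -v b="$dir_secs" 'BEGIN {
        printf "ls: %.2f ms per entry\n", a * 1000 / n
        printf "dir: %.2f ms per entry\n", b * 1000 / n
        printf "ratio: %.0fx\n", a / b
    }'
}

per_entry_report
```

That works out to about 6 ms per entry for ls against about 0.1 ms for dir, which fits the guess that ls performs some per-file operation, though it does not prove it.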


- Viktors Berstis

Kamil Dudka wrote:
> On Thursday, May 2, 2019 12:03:31 AM CEST Viktors Berstis wrote:
> > When running "ls" or "ls -U" on a windows directory containing 5
> > files, ls takes forever.  Something seems to be highly inefficient in there.
>
> Could you please try it with ls -U -1?
>
> Kamil
>
> > This is for the 64 bit version build 4/20/2005 11:41AM.  The exe size is
> > 180736 bytes.
> >
> > Thanks.
> >
> > - Viktors Berstis








bug#35531: problem with ls in coreutils

2019-05-01 Thread Kamil Dudka
On Thursday, May 2, 2019 12:03:31 AM CEST Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 5
> files, ls takes forever.  Something seems to be highly inefficient in there.

Could you please try it with ls -U -1?

Kamil

> This is for the 64 bit version build 4/20/2005 11:41AM.  The exe size is
> 180736 bytes.
> 
> Thanks.
> 
> - Viktors Berstis







bug#35531: problem with ls in coreutils

2019-05-01 Thread Viktors Berstis
When running "ls" or "ls -U" on a windows directory containing 5 
files, ls takes forever.  Something seems to be highly inefficient in there.


This is for the 64 bit version build 4/20/2005 11:41AM.  The exe size is 
180736 bytes.


Thanks.

- Viktors Berstis