On 2021-04-09 15:51, Pádraig Brady wrote:
On 09/04/2021 13:02, Carl Edquist wrote:
Dear Coreutils Maintainers,
I'd like to introduce my favorite 'ls' option, '-W', which I have been
enjoying using regularly over the last few years.
The concept is just to sort filenames by their printed widths.
(If this sounds odd, I invite you to hear it out, try and see for
yourself!)
I am including a patch with my implementation and accompanying tests,
as well as some sample output. And I'll happily field any requests for
improvements.
I quite like this. It seems useful.
Also, doing this outside of ls is quite awkward,
especially considering multi column output.
Ah, but not so!
What is awkward is doing the sorting outside of ls, using only
the shell and utilities.
The multi column output can be done by feeding the sorted list of
files to ls, with the -df options (don't list directories, don't sort).
Demo:
ls -f | gawk -f sizesort.awk
. buf.h time.c arith.h alloca.h y.output genman.txr
.. txr.h hash.h txr.vim struct.h lex.yy.c gencadr.txr
ID lib.c tree.c combi.c socket.h parser.h METALICENSE
tst buf.c glob.c arith.c parser.y filter.c reconfigure
win txr.c cadr.h chksums parser.l termios.h genvmop.txr
txr ffi.h eval.h sysif.c inst.nsi termios.c LICENSE-CYG
opt ftw.h glob.h regex.h chksum.c linenoise config.make
mpi jmp.S args.c debug.h signal.c configure sizesort.awk
tags ffi.c hash.c y.tab.h syslog.h protsym.c .gdb_history
pack lib.h utf8.c tags.tl stream.c lisplib.h checkman.txr
gc.c rand.c time.h regex.c unwind.c strudel.c genprotsym.txr
gc.h args.h combi.h INSTALL itypes.c strudel.h y.tab.c.shipped
vm.c rand.h debug.c match.c syslog.c gs_YEC3Hr y.tab.h.shipped
vm.h utf8.h HACKING unwind.h chksum.h optand.tl txr-manpage.pdf
.git cadr.c parser. filter.h signal.h gs_P4Z02S HACKING-toc.txr
txr.1 tl.vim sysif.h itypes.h RELNOTES gs_8aK1VJ lex.yy.c.shipped
share eval.c match.h struct.c parser.c gs_G7H2OA ChangeLog-2009-2015
tests tree.h LICENSE config.h socket.c lisplib.c
ftw.c vmop.h y.tab.c Makefile stream.h genvim.txr
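The ordering in that listing uses name length as the only key. Here is a minimal Python sketch of the same ordering, assuming that character count is an acceptable stand-in for printed width (it is not for wide or combining characters). The function name sort_by_length is made up for illustration.

```python
def sort_by_length(names):
    """Return names ordered by character count, shortest first.

    sorted() is stable, so names of equal length keep their input order.
    """
    return sorted(names, key=len)


if __name__ == "__main__":
    sample = ["txr.h", "..", "ID", ".", "METALICENSE", "tst"]
    print(sort_by_length(sample))
    # prints ['.', '..', 'ID', 'tst', 'txr.h', 'METALICENSE']
```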
Source code of sizesort.awk (which uses GNU Awk extensions):
#!/usr/bin/awk -f
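# Comparator for asort(): order elements by string length.
function compare(ia, a, ib, b)
{
    return length(a) - length(b)
}
{ dir[NR] = $0 }
END {
    n = asort(dir, sdir, "compare")
    # Walk indices 1..n; "for (x in sdir)" has no guaranteed order.
    for (i = 1; i <= n; i++) {
        print sdir[i] | "xargs ls -fd"
    }
}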
But this doesn't handle arbitrary file names. However, Awk can
process null terminated/separated records, as put out by find -print0:
Hold my beer:
$ find . -print0 | awk -v RS='\0' '{print$1}' | head
.
./rand.c
./args.h
./termios.h
./combi.h
./rand.h
./gencadr.txr
./unwind.h
./txr.1
./termios.c
Proof of concept with sorting:
$ find . -maxdepth 1 -print0 | awk -rf sizesort0.awk
. ./txr.c ./args.c ./regex.c ./chksum.h ./gs_G7H2OA
./ID ./ffi.h ./hash.c ./INSTALL ./signal.h ./lisplib.c
./tst ./ftw.h ./utf8.c ./match.c ./RELNOTES ./genvim.txr
./win ./jmp.S ./time.h ./unwind.h ./parser.c ./genman.txr
./txr ./ffi.c ./combi.h ./filter.h ./socket.c ./gencadr.txr
./opt ./lib.h ./debug.c ./itypes.h ./stream.h ./METALICENSE
./mpi ./rand.c ./HACKING ./struct.c ./y.output ./reconfigure
./tags ./args.h ./parser. ./config.h ./lex.yy.c ./genvmop.txr
./pack ./rand.h ./sysif.h ./Makefile ./parser.h ./LICENSE-CYG
./gc.c ./utf8.h ./match.h ./alloca.h ./filter.c ./config.make
./gc.h ./cadr.c ./LICENSE ./struct.h ./termios.h ./sizesort.awk
./vm.c ./tl.vim ./y.tab.c ./socket.h ./termios.c ./.gdb_history
./vm.h ./eval.c ./arith.h ./parser.y ./linenoise ./checkman.txr
./.git ./tree.h ./txr.vim ./parser.l ./configure ./sizesort0.awk
./txr.1 ./vmop.h ./combi.c ./inst.nsi ./protsym.c ./genprotsym.txr
./share ./time.c ./arith.c ./chksum.c ./lisplib.h ./y.tab.c.shipped
./tests ./hash.h ./chksums ./signal.c ./strudel.c ./y.tab.h.shipped
./ftw.c ./tree.c ./sysif.c ./syslog.h ./strudel.h ./txr-manpage.pdf
./buf.h ./glob.c ./regex.h ./stream.c ./gs_YEC3Hr ./HACKING-toc.txr
./txr.h ./cadr.h ./debug.h ./unwind.c ./optand.tl ./lex.yy.c.shipped
./lib.c ./eval.h ./y.tab.h ./itypes.c ./gs_P4Z02S ./ChangeLog-2009-2015
./buf.c ./glob.h ./tags.tl ./syslog.c ./gs_8aK1VJ
sizesort0.awk:
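# Comparator for asort(): order elements by string length.
function compare(ia, a, ib, b)
{
    return length(a) - length(b)
}
BEGIN {
    RS = "\0"    # NUL-separated records, as written by find -print0
}
{ dir[NR] = $0 }
END {
    n = asort(dir, sdir, "compare")
    # Walk indices 1..n; "for (x in sdir)" has no guaranteed order.
    for (i = 1; i <= n; i++) {
        printf "%s\0", sdir[i] | "xargs -0 ls -fd"
    }
}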
What we could use here is a "ls -0" option that is like "ls -1" but with
null termination. And likewise some option to have ls read file names
from standard input. Line-wise by default, or null-terminated if -0
is specified.
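To make the two framings concrete, here is a rough Python sketch of the writing side (what a "ls -0" would emit) and the reading side (what an ls that takes names on standard input would parse). Neither option exists in ls today; format_names and parse_names are illustrative names, not anything from coreutils.

```python
def format_names(names, nul=False):
    """Serialize file names (bytes), one per record.

    nul=False mimics `ls -1` (newline-terminated records); nul=True is the
    hypothetical `ls -0` (NUL-terminated records).
    """
    term = b"\0" if nul else b"\n"
    return b"".join(name + term for name in names)


def parse_names(data, nul=False):
    """Inverse of format_names: split a byte stream back into names."""
    term = b"\0" if nul else b"\n"
    records = data.split(term)
    # A terminated stream ends with the terminator, leaving one empty tail.
    if records and records[-1] == b"":
        records.pop()
    return records


if __name__ == "__main__":
    names = [b"a b", b"line\nbreak", b"c"]
    # NUL framing round-trips any name, since a file name cannot contain NUL.
    print(parse_names(format_names(names, nul=True), nul=True) == names)
    # Newline framing breaks the name that contains a newline in two.
    print(parse_names(format_names(names)))
```

The line-wise default stays convenient for interactive use, while the NUL mode is the one that survives arbitrary file names.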
So easy in a language with more well-rounded functionality:
1> (run "ls" (cons "-fd" [sort (get-lines (open-directory ".")) < len]))
ID lib.c tree.c combi.c socket.h parser.h METALICENSE
tst buf.c glob.c arith.c parser.y filter.c reconfigure
win txr.c cadr.h chksums parser.l termios.h genvmop.txr
...
Ah!
This is totally clean, except for the threat of running into the
argument passing limit. Reading the directory is based on readdir
under the hood, and "ls" is getting clean argument strings.
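As for the argument passing limit, the usual workaround is the one xargs applies: split the list into batches that each fit. A rough Python sketch follows, assuming a cost model of encoded length plus a terminating NUL plus an 8-byte pointer slot per argument; the real kernel accounting differs in detail, and chunk_args is a made-up name.

```python
import os


def chunk_args(names, budget):
    """Split names into batches whose estimated size stays within budget.

    Cost per argument (an assumption, not the exact kernel rule):
    encoded length + 1 byte for the NUL + 8 bytes for the argv pointer.
    A single name larger than the budget still gets a batch of its own.
    """
    batches, current, used = [], [], 0
    for name in names:
        cost = len(os.fsencode(name)) + 1 + 8
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += cost
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    try:
        limit = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        limit = -1
    if limit <= 0:
        limit = 128 * 1024  # assumed fallback when the limit is unknown
    # Leave generous headroom for the environment and the fixed arguments.
    print(len(chunk_args(os.listdir("."), limit // 4)))
```

Note that batching is only a partial remedy here: each batch becomes a separate ls run, and each run lays out its own columns.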
What this tells us is that we could use a ls-like utility which:
1. reads a directory into a list
2. applies some processing which the user can influence
3. executes an arbitrary utility with the sorted list as arguments,
with additional arguments inserted at the front (like options
for the utility or the -- option).
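Those three steps can be sketched in a few lines of Python. This is only an illustration of the shape of such a utility, not a proposal for its interface; build_command and list_sorted are hypothetical names, and sorting by length is just one example of user-chosen processing.

```python
import os
import subprocess


def build_command(utility, leading_args, names, key=len):
    """Return argv: the utility, its leading arguments, then the sorted names."""
    return [utility, *leading_args, *sorted(names, key=key)]


def list_sorted(directory="."):
    # Step 1: read the directory into a list (os.listdir omits . and ..).
    names = os.listdir(directory)
    # Step 2: sort by length. Step 3: run ls with -f (keep our order) and
    # -d (do not descend into directories); "--" ends the options so that
    # names starting with "-" are not taken as flags.
    argv = build_command("ls", ["-fd", "--"], names)
    return subprocess.run(argv, cwd=directory).returncode
```

Calling list_sorted(".") hands ls one clean argument per directory entry, so no name is ever re-parsed from text.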
GNU Awk could just have library functions added for this.
getdir(array, path) # populate associative array with directory
spawn(command, [arg, ]* array) # spawns program
E.g. if array holds { "a b", "c" } then spawn("ls", "-l", "--", array)
performs execlp("ls", "ls", "-l", "--", "a b", "c", (char *) NULL).
The command is parsed into arguments in some way, perhaps on spaces.
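The intended argument handling can be modelled in Python as below. This is a sketch of the semantics only: spawn is not an existing GNU Awk function, spawn_argv is a made-up name, and splitting the command on whitespace is just the "perhaps on spaces" guess made concrete.

```python
def spawn_argv(command, *args):
    """Build the argv that the hypothetical spawn(command, [arg, ]* array) would run.

    The command string is split on whitespace (an assumption). The last
    positional argument is the array of names; each element becomes exactly
    one argument and is never split, so "a b" stays intact.
    """
    *fixed, array = args
    return [*command.split(), *fixed, *array]


if __name__ == "__main__":
    argv = spawn_argv("ls", "-l", "--", ["a b", "c"])
    print(argv)
    # prints ['ls', '-l', '--', 'a b', 'c']
    # os.execvp(argv[0], argv) would then run it, much like the execlp() call.
```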
Cheers ...