On 2021-04-09 15:51, Pádraig Brady wrote:
On 09/04/2021 13:02, Carl Edquist wrote:
Dear Coreutils Maintainers,

I'd like to introduce my favorite 'ls' option, '-W', which I have been
enjoying using regularly over the last few years.

The concept is just to sort filenames by their printed widths.


(If this sounds odd, I invite you to hear it out, try and see for yourself!)


I am including a patch with my implementation and accompanying tests - as
well as some sample output.  And I'll happily field any requests for
improvements.

I quite like this. It seems useful.
Also, doing this outside of ls is quite awkward,
especially considering multi-column output.

Ah, but not so!

What is awkward is doing the sorting outside of ls, using only
the shell and utilities.

The multi-column output can be done by feeding the sorted list of
files to ls, with the -df options (list directories themselves rather
than their contents, don't sort).

Demo:

ls -f | gawk -f sizesort.awk
.      buf.h   time.c   arith.h   alloca.h  y.output    genman.txr
..     txr.h   hash.h   txr.vim   struct.h  lex.yy.c    gencadr.txr
ID     lib.c   tree.c   combi.c   socket.h  parser.h    METALICENSE
tst    buf.c   glob.c   arith.c   parser.y  filter.c    reconfigure
win    txr.c   cadr.h   chksums   parser.l  termios.h   genvmop.txr
txr    ffi.h   eval.h   sysif.c   inst.nsi  termios.c   LICENSE-CYG
opt    ftw.h   glob.h   regex.h   chksum.c  linenoise   config.make
mpi    jmp.S   args.c   debug.h   signal.c  configure   sizesort.awk
tags   ffi.c   hash.c   y.tab.h   syslog.h  protsym.c   .gdb_history
pack   lib.h   utf8.c   tags.tl   stream.c  lisplib.h   checkman.txr
gc.c   rand.c  time.h   regex.c   unwind.c  strudel.c   genprotsym.txr
gc.h   args.h  combi.h  INSTALL   itypes.c  strudel.h   y.tab.c.shipped
vm.c   rand.h  debug.c  match.c   syslog.c  gs_YEC3Hr   y.tab.h.shipped
vm.h   utf8.h  HACKING  unwind.h  chksum.h  optand.tl   txr-manpage.pdf
.git   cadr.c  parser.  filter.h  signal.h  gs_P4Z02S   HACKING-toc.txr
txr.1  tl.vim  sysif.h  itypes.h  RELNOTES  gs_8aK1VJ   lex.yy.c.shipped
share  eval.c  match.h  struct.c  parser.c  gs_G7H2OA   ChangeLog-2009-2015
tests  tree.h  LICENSE  config.h  socket.c  lisplib.c
ftw.c  vmop.h  y.tab.c  Makefile  stream.h  genvim.txr

Source code of sizesort.awk (which uses GNU Awk extensions):

#!/usr/bin/awk -f

function compare(ia, a, ib, b)
{
   return length(a) - length(b)
}

{ dir[NR] = $0 }

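END {
  # asort returns the element count; sdir is indexed 1..n in sorted order
  n = asort(dir, sdir, "compare")
  # a counted loop is used because "for (x in sdir)" has no guaranteed order
  for (i = 1; i <= n; i++) {
     print sdir[i] | "xargs ls -fd"
  }
}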

But this doesn't handle arbitrary file names. However, Awk can
process null terminated/separated records, as put out by find -print0:

Hold my beer:

$ find . -print0 | awk -v RS='\0' '{print$1}' | head
.
./rand.c
./args.h
./termios.h
./combi.h
./rand.h
./gencadr.txr
./unwind.h
./txr.1
./termios.c

Proof of concept with sorting:

$ find . -maxdepth 1 -print0 | awk -rf sizesort0.awk
.        ./txr.c   ./args.c   ./regex.c   ./chksum.h   ./gs_G7H2OA
./ID     ./ffi.h   ./hash.c   ./INSTALL   ./signal.h   ./lisplib.c
./tst    ./ftw.h   ./utf8.c   ./match.c   ./RELNOTES   ./genvim.txr
./win    ./jmp.S   ./time.h   ./unwind.h  ./parser.c   ./genman.txr
./txr    ./ffi.c   ./combi.h  ./filter.h  ./socket.c   ./gencadr.txr
./opt    ./lib.h   ./debug.c  ./itypes.h  ./stream.h   ./METALICENSE
./mpi    ./rand.c  ./HACKING  ./struct.c  ./y.output   ./reconfigure
./tags   ./args.h  ./parser.  ./config.h  ./lex.yy.c   ./genvmop.txr
./pack   ./rand.h  ./sysif.h  ./Makefile  ./parser.h   ./LICENSE-CYG
./gc.c   ./utf8.h  ./match.h  ./alloca.h  ./filter.c   ./config.make
./gc.h   ./cadr.c  ./LICENSE  ./struct.h  ./termios.h  ./sizesort.awk
./vm.c   ./tl.vim  ./y.tab.c  ./socket.h  ./termios.c  ./.gdb_history
./vm.h   ./eval.c  ./arith.h  ./parser.y  ./linenoise  ./checkman.txr
./.git   ./tree.h  ./txr.vim  ./parser.l  ./configure  ./sizesort0.awk
./txr.1  ./vmop.h  ./combi.c  ./inst.nsi  ./protsym.c  ./genprotsym.txr
./share  ./time.c  ./arith.c  ./chksum.c  ./lisplib.h  ./y.tab.c.shipped
./tests  ./hash.h  ./chksums  ./signal.c  ./strudel.c  ./y.tab.h.shipped
./ftw.c  ./tree.c  ./sysif.c  ./syslog.h  ./strudel.h  ./txr-manpage.pdf
./buf.h  ./glob.c  ./regex.h  ./stream.c  ./gs_YEC3Hr  ./HACKING-toc.txr
./txr.h  ./cadr.h  ./debug.h  ./unwind.c  ./optand.tl  ./lex.yy.c.shipped
./lib.c  ./eval.h  ./y.tab.h  ./itypes.c  ./gs_P4Z02S  ./ChangeLog-2009-2015
./buf.c  ./glob.h  ./tags.tl  ./syslog.c  ./gs_8aK1VJ

sizesort0.awk:

function compare(ia, a, ib, b)
{
   return length(a) - length(b)
}

{ dir[NR] = $0 }

BEGIN {
  RS = "\0"
}

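END {
  # asort returns the element count; sdir is indexed 1..n in sorted order
  n = asort(dir, sdir, "compare")
  # a counted loop is used because "for (x in sdir)" has no guaranteed order
  for (i = 1; i <= n; i++) {
     printf "%s\0", sdir[i] | "xargs -0 ls -fd"
  }
}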

What we could use here is an "ls -0" option that is like "ls -1" but with
null termination. And likewise some option to have ls read file names
from standard input: line-wise by default, or null-terminated if -0
is specified.
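
Until such options exist, the sorting half of the pipeline can be a small
filter over null-terminated records. The following is only a sketch in
Python, not something ls or Awk provides: the name sort_nul_records is made
up for illustration, and it counts characters rather than printed columns.

```python
import os

def sort_nul_records(data: bytes) -> bytes:
    """Sort NUL-terminated file names by character count, shortest first."""
    # Split on NUL and drop the empty piece after the final terminator.
    names = [n for n in data.split(b"\0") if n]
    # list.sort() is stable, so names of equal length keep their input order.
    names.sort(key=lambda n: len(os.fsdecode(n)))
    # Re-emit every name with its NUL terminator, as "xargs -0" expects.
    return b"".join(n + b"\0" for n in names)
```

Fed from sys.stdin.buffer and written to sys.stdout.buffer, this would sit
between "find -print0" and "xargs -0 ls -fd". Names containing spaces or
newlines pass through untouched.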

So easy in a language with more well-rounded functionality:

1> (run "ls" (cons "-fd" [sort (get-lines (open-directory ".")) < len]))
ID     lib.c   tree.c   combi.c   socket.h  parser.h     METALICENSE
tst    buf.c   glob.c   arith.c   parser.y  filter.c     reconfigure
win    txr.c   cadr.h   chksums   parser.l  termios.h    genvmop.txr
...

Ah!

This is totally clean, except for the threat of running into the
argument passing limit. Reading the directory is based on readdir
under the hood, and "ls" is getting clean argument strings.
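
To get a feel for how close a given directory comes to that limit, one can
add up the argument and environment strings. This is a rough Python sketch
with made-up function names; the kernel's real accounting also includes
pointer arrays and per-string limits, so treat the result as an estimate.

```python
import os

def estimated_exec_size(argv, environ):
    """Rough byte count of argument and environment strings passed to exec."""
    # Each argument is stored with a terminating NUL byte.
    args = sum(len(os.fsencode(a)) + 1 for a in argv)
    # Each environment entry is stored as "KEY=VALUE" plus a NUL byte.
    env = sum(len(os.fsencode(k)) + len(os.fsencode(v)) + 2
              for k, v in environ.items())
    return args + env

def fits_in_one_exec(argv, environ, limit):
    """True if the estimate stays within the given limit in bytes."""
    return estimated_exec_size(argv, environ) <= limit
```

On POSIX systems the limit can be obtained with os.sysconf("SC_ARG_MAX")
and the environment with os.environ. When the list does not fit, xargs
splits it over several ls invocations, and each one lays out its own
columns.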

What this tells us is that we could use an ls-like utility which:

1. reads a directory into a list
2. applies some processing which the user can influence
3. executes an arbitrary utility with the sorted list as arguments,
   with additional arguments inserted at the front (like options
   for the utility or the -- option).
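
The three steps above can be sketched in a few lines of Python. This is an
illustration under assumptions, not a proposed implementation: the names
build_command and run_sorted are hypothetical, the default processing is a
sort by character count, and the argument passing limit is not handled.

```python
import os
import subprocess

def build_command(directory, prefix_args, key=len):
    """Return argv: prefix_args followed by the directory's entries, sorted."""
    names = os.listdir(directory)      # step 1: read the directory into a list
    names.sort(key=key)                # step 2: processing the caller controls
    return list(prefix_args) + names   # step 3: fixed arguments go in front

def run_sorted(directory, prefix_args, key=len):
    """Run the utility inside directory, one argument per file name."""
    # No shell is involved, so spaces and newlines in names need no quoting.
    return subprocess.run(build_command(directory, prefix_args, key),
                          cwd=directory)
```

For example, run_sorted(".", ["ls", "-fd", "--"]) would list the current
directory shortest name first. Unlike "ls -f", os.listdir leaves out "."
and "..".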

GNU Awk could just have library functions added for this.

   getdir(array, path)     # populate associative array with directory

   spawn(command, [arg, ]* array) # spawns program

E.g. if array holds { "a b", "c" } then spawn("ls", "-l", "--", array)
performs execlp("ls", "ls", "-l", "--", "a b", "c", (char *) NULL).

The command is parsed into arguments in some way, perhaps on spaces.


Cheers ...
