Greetings Coreutils,

I'm submitting for your consideration a patch that adds the standard 
'--files0-from=FILE' option to ls.

With this option, ls reads NUL-terminated file names from FILE instead of 
taking them from the command line.


The main motivation is to let ls sort arbitrarily many items.  It is also 
needed to compute the correct aggregate column widths, so that long format 
(-l) output lines up across all file names.


As a real example, if you want to use ls to list (say, in long format with 
human sizes) all the sources in the linux kernel tree according to size, 
you might naively try one of

    [linux]$ find -name '*.[ch]' -exec ls -lrSh {} +
    [linux]$ find -name '*.[ch]' -print0 | xargs -0 ls -lrSh

but you'll see the sizes cycle from small to large over and over, with the 
last batch ending somewhere in the middle:

...
-rw-r--r-- 1 kx users  81K Apr 15 04:30 ./arch/arm/boot/dts/imx6sll-pinfunc.h
-rw-r--r-- 1 kx users  83K Apr 15 04:30 ./arch/arm/boot/dts/imx6dl-pinfunc.h
-rw-r--r-- 1 kx users  87K Apr 15 04:30 ./arch/arm/mach-imx/iomux-mx35.h
-rw-r--r-- 1 kx users 107K Apr 15 04:30 ./arch/arm/boot/dts/imx7d-pinfunc.h
-rw-r--r-- 1 kx users 143K Apr 15 04:30 ./arch/arm/boot/dts/imx6sx-pinfunc.h


In this case, xargs splits the names across *13* separate invocations of 
ls.  Each invocation sorts only its own batch, so the overall ordering is 
completely lost.
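
The batching effect can be reproduced on a small scale.  The following is 
only a sketch: the file names f1..f6 are made up, and 'xargs -n 2' is used 
to force tiny batches in place of the real command line length limit.

```shell
# Work in a scratch directory so nothing else is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Create six hypothetical files whose sizes are 1..6 bytes.
for n in 1 2 3 4 5 6; do
    head -c "$n" /dev/zero > "f$n"
done

# One ls invocation sees every name, so the sort is global.
single=$(ls -rS f1 f2 f3 f4 f5 f6 | paste -s -d ' ' -)

# Three ls invocations of two names each: every batch is sorted on its
# own, and the batches are simply printed one after another.
batched=$(printf '%s\0' f6 f1 f5 f2 f4 f3 |
          xargs -0 -n 2 ls -rS | paste -s -d ' ' -)

echo "single:  $single"     # single:  f1 f2 f3 f4 f5 f6
echo "batched: $batched"    # batched: f1 f6 f2 f5 f3 f4

cd / && rm -rf "$tmp"
```

The "batched" line shows the same small-to-large restart that appears 13 
times in the kernel tree listing above.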

But with the new option:

    [linux]$ find -name '*.[ch]' -print0 | ls -lrSh --files0-from=-

The sizes all scroll in order, finally ending in

...
-rw-r--r-- 1 kx users 5.0M Apr 15 04:31 ./drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h
-rw-r--r-- 1 kx users 5.5M Apr 15 04:31 ./drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_1_0_sh_mask.h
-rw-r--r-- 1 kx users 6.6M Apr 15 04:31 ./drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_sh_mask.h
-rw-r--r-- 1 kx users  13M Apr 15 04:31 ./drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_0_sh_mask.h
-rw-r--r-- 1 kx users  14M Apr 15 04:31 ./drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h


(The biggest files are where they belong, at the end of the listing.)


...

Similarly, say you would like to view / scroll through your extensive mp3 
collection in chronological order (based on when you added the files to 
your collection).  You can do it now with the new option:

    [music]$ find -name \*.mp3 -print0 | ls -lrth --files0-from=-


(Sidenote: Rob, the record store owner in High Fidelity (2000) [who was 
also known for his habit of making "Top 5" ordered lists], called this 
sorting of his records "autobiographical" [1].)

...


Additionally, note that ls can already list and sort a single directory 
with arbitrarily many entries, but you run into trouble if you want to 
limit the output to a subset of those entries (e.g., those matching a 
particular glob pattern).

For instance, suppose a directory holds many status files representing 
tasks, and you want to list the 'completed' tasks in chronological order 
with something like:

    [tasks]$ ls -lrt *.completed

This will eventually fail, once enough files match that the expanded glob 
exceeds the argument list limit.
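
To get a feel for when that happens, the limit can be queried with 
'getconf ARG_MAX'.  The sketch below is a rough estimate only: the file 
count and name length are made-up figures, and it ignores the environment 
and per-argument pointer overhead that also count against the limit.

```shell
# System limit on the combined size of arguments and environment, in bytes.
arg_max=$(getconf ARG_MAX)

# Hypothetical directory: 500000 matches named like
# "task-000000001.completed" (24 characters each).
n_files=500000
name_len=24

# Each argument costs its length plus one terminating NUL byte.
needed=$(( n_files * (name_len + 1) ))

if [ "$needed" -gt "$arg_max" ]; then
    echo "expanded glob (~$needed bytes) exceeds ARG_MAX ($arg_max)"
else
    echo "expanded glob (~$needed bytes) fits within ARG_MAX ($arg_max)"
fi
```

When the limit is exceeded, the shell cannot start ls at all and reports 
"Argument list too long".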

Again, a robust solution is possible with the new option:

    [tasks]$ find -mindepth 1 -maxdepth 1 -name \*.completed -printf '%f\0' |
             ls -lrt --files0-from=-


(The more complicated find expression is used here just to demonstrate how 
to match the behavior of the single-directory ls invocation with a glob 
pattern.)
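
As a quick check that the find expression selects the same names as the 
glob, here is a small sketch.  It assumes GNU find (for -printf), and the 
file names are made up for the illustration.

```shell
# Scratch directory with two matches, one non-match, and a match that
# lives one level down and must not be picked up.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir sub
touch a.completed b.completed c.pending sub/d.completed

# Names the shell glob would hand to ls.
from_glob=$(printf '%s\n' *.completed | sort)

# Names the find expression emits: -mindepth/-maxdepth keep find in the
# current directory only, and '%f' prints the base name without "./".
from_find=$(find . -mindepth 1 -maxdepth 1 -name '*.completed' -printf '%f\0' |
            tr '\0' '\n' | sort)

if [ "$from_glob" = "$from_find" ]; then
    echo "same names"
fi

cd / && rm -rf "$tmp"
```

One difference to keep in mind: as far as I know, find's -name '*' also 
matches names that begin with a dot, which the shell glob skips by 
default.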


That's about it.  The feature should already be well understood from other 
programs, but hopefully the examples demonstrate its utility within ls.


Any feedback / requests for improvement are of course welcome.


Carl


-=-=-+-=-=-


[1] https://youtu.be/AQvOnDlql5g
From f47996c749d7f155a10ca0fa8c1976821059ad50 Mon Sep 17 00:00:00 2001
From: Carl Edquist <[email protected]>
Date: Fri, 10 May 2019 17:05:47 -0500
Subject: [PATCH] ls: add --files0-from=FILE option

Useful for things like

    find [...] -type f -print0 | ls -lrSh --files0-from=-

where an attempt to do the same with 'find -print0 | xargs -0' or
'find -exec' would fail to fit all the filenames onto a single command
line, and thus ls would not sort all the input items together, nor
align the columns from -l across all items.

* src/ls.c (files_from): New var for input filename.
(long_options): Add new long option.
(read_files0): Add helper function to consume files0 input.
(main): Add logic for files_from handling.
(decode_switches): Handle FILES0_FROM_OPTION.
* tests/ls/files0-from-option.sh: Exercise new option.
* tests/local.mk: Include new test.
* doc/coreutils.texi: Document --files0-from=FILE option.
* NEWS: Mention the new feature.
---
 NEWS                           |  3 +++
 doc/coreutils.texi             |  2 ++
 src/ls.c                       | 61 +++++++++++++++++++++++++++++++++++++++---
 tests/local.mk                 |  1 +
 tests/ls/files0-from-option.sh | 40 +++++++++++++++++++++++++++
 5 files changed, 104 insertions(+), 3 deletions(-)
 create mode 100755 tests/ls/files0-from-option.sh

diff --git a/NEWS b/NEWS
index 090fbc7..7f7f86b 100644
--- a/NEWS
+++ b/NEWS
@@ -73,6 +73,9 @@ GNU coreutils NEWS                                    -*- outline -*-
   ls now accepts the --sort=width option, to sort by file name width.
   This is useful to more compactly organize the default vertical column output.
 
+  ls now accepts the --files0-from=FILE option, where FILE contains a
+  list of NUL-terminated file names.
+
   nl --line-increment can now take a negative number to decrement the count.
 
 ** Improvements
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 3e3aedb..e008746 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7467,6 +7467,8 @@ option has been specified (@option{--classify} (@option{-F}),
 @option{--dereference} (@option{-L}), or
 @option{--dereference-command-line} (@option{-H})).
 
+@filesZeroFromOption{ls,,sorted output (and aligned columns in -l mode)}
+
 @item --group-directories-first
 @opindex --group-directories-first
 Group all the directories before the files and then sort the
diff --git a/src/ls.c b/src/ls.c
index 4586b5e..3c95590 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -102,6 +102,7 @@
 #include "mpsort.h"
 #include "obstack.h"
 #include "quote.h"
+#include "readtokens0.h"
 #include "smack.h"
 #include "stat-size.h"
 #include "stat-time.h"
@@ -710,6 +711,8 @@ static struct ignore_pattern *ignore_patterns;
    variable itself to be ignored.  */
 static struct ignore_pattern *hide_patterns;
 
+static char *files_from;
+
 /* True means output nongraphic chars in file names as '?'.
    (-q, --hide-control-chars)
    qmark_funny_chars and the quoting style (-Q, --quoting-style=WORD) are
@@ -833,6 +836,7 @@ enum
   COLOR_OPTION,
   DEREFERENCE_COMMAND_LINE_SYMLINK_TO_DIR_OPTION,
   FILE_TYPE_INDICATOR_OPTION,
+  FILES0_FROM_OPTION,
   FORMAT_OPTION,
   FULL_TIME_OPTION,
   GROUP_DIRECTORIES_FIRST_OPTION,
@@ -892,6 +896,7 @@ static struct option const long_options[] =
   {"block-size", required_argument, NULL, BLOCK_SIZE_OPTION},
   {"context", no_argument, 0, 'Z'},
   {"author", no_argument, NULL, AUTHOR_OPTION},
+  {"files0-from", required_argument, NULL, FILES0_FROM_OPTION},
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
   {NULL, 0, NULL, 0}
@@ -1626,11 +1631,38 @@ signal_restore (void)
   signal_setup (false);
 }
 
+static void
+read_files0 (struct Tokens *tokp)
+{
+  FILE *stream;
+  if (STREQ (files_from, "-"))
+    stream = stdin;
+  else
+    {
+      stream = fopen (files_from, "r");
+      if (stream == NULL)
+        die (LS_FAILURE, errno, _("cannot open %s for reading"),
+             quoteaf (files_from));
+    }
+
+  readtokens0_init (tokp);
+
+  if (! readtokens0 (stream, tokp) || fclose (stream) != 0)
+    die (LS_FAILURE, 0, _("cannot read file names from %s"),
+         quoteaf (files_from));
+
+  if (! tokp->n_tok)
+    die (LS_FAILURE, 0, _("no input from %s"),
+         quoteaf (files_from));
+}
+
 int
 main (int argc, char **argv)
 {
   int i;
   struct pending *thispend;
+  struct Tokens tok;
+  char **files;
   int n_files;
 
   initialize_main (&argc, &argv);
@@ -1727,6 +1759,23 @@ main (int argc, char **argv)
   clear_files ();
 
   n_files = argc - i;
+  files = argv + i;
+
+  if (files_from)
+    {
+      if (n_files > 0)
+        {
+          error (0, 0, _("extra operand %s"), quoteaf (argv[i]));
+          fprintf (stderr, "%s\n",
+                   _("file operands cannot be combined with --files0-from"));
+          usage (LS_FAILURE);
+        }
+
+      read_files0 (&tok);
+
+      files = tok.tok;
+      n_files = tok.n_tok;
+    }
 
   if (n_files <= 0)
     {
@@ -1736,9 +1785,8 @@ main (int argc, char **argv)
         queue_directory (".", NULL, true);
     }
   else
-    do
-      gobble_file (argv[i++], unknown, NOT_AN_INODE_NUMBER, true, "");
-    while (i < argc);
+    for (i = 0; i < n_files; i++)
+      gobble_file (files[i], unknown, NOT_AN_INODE_NUMBER, true, "");
 
   if (cwd_n_used)
     {
@@ -1833,6 +1881,9 @@ main (int argc, char **argv)
       hash_free (active_dir_set);
     }
 
+  if (files_from)
+    readtokens0_free (&tok);
+
   return exit_status;
 }
 
@@ -2201,6 +2252,10 @@ decode_switches (int argc, char **argv)
           time_type = XARGMATCH ("--time", optarg, time_args, time_types);
           break;
 
+        case FILES0_FROM_OPTION:
+          files_from = optarg;
+          break;
+
         case FORMAT_OPTION:
           format = XARGMATCH ("--format", optarg, format_args, format_types);
           break;
diff --git a/tests/local.mk b/tests/local.mk
index a44feca..385cef0 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -605,6 +605,7 @@ all_tests =					\
   tests/ls/dangle.sh				\
   tests/ls/dired.sh				\
   tests/ls/file-type.sh				\
+  tests/ls/files0-from-option.sh                \
   tests/ls/follow-slink.sh			\
   tests/ls/getxattr-speedup.sh			\
   tests/ls/group-dirs.sh			\
diff --git a/tests/ls/files0-from-option.sh b/tests/ls/files0-from-option.sh
new file mode 100755
index 0000000..03c987c
--- /dev/null
+++ b/tests/ls/files0-from-option.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+# Exercise the --files0-from option.
+
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ ls
+
+touch abc || framework_failure_
+touch def || framework_failure_
+touch xyz || framework_failure_
+
+printf '%s\0' abc def xyz > names || framework_failure_
+
+ls --files0-from=names         > out1 || fail=1
+cat names | ls --files0-from=- > out2 || fail=1
+
+cat <<\EOF > exp || framework_failure_
+abc
+def
+xyz
+EOF
+
+compare exp out1 || fail=1
+compare exp out2 || fail=1
+
+Exit $fail
-- 
2.9.0
