Re: /bin/ls print format bugs

2023-10-05 Thread Crystal Kolipe
Hi Ingo,

On Thu, Oct 05, 2023 at 05:32:07PM +0200, Ingo Schwarze wrote:
> In general, ls(1) strives to dynamically determine the required
> column width.  It already does that for the file size column.
> Given that device numbers use the same column, i think the solution
> that is cleanest, most robust, most aesthetically pleasing, least
> wasteful with respect to space, and easiest to maintain is simply
> doing the same for device numbers, see the patch below.

This looks good and works fine for me.



Re: /bin/ls print format bugs

2023-10-05 Thread Ingo Schwarze
Hi Crystal,

Crystal Kolipe wrote on Tue, Oct 03, 2023 at 06:47:42PM -0300:

> Two display format bugs seems to have crept in to ls due to the
> addition of scaled human readable values and large minor numbers.

I think you are right that the current situation isn't good.
Thank you for bringing this up.

In general, ls(1) strives to dynamically determine the required
column width.  It already does that for the file size column.
Given that device numbers use the same column, i think the solution
that is cleanest, most robust, most aesthetically pleasing, least
wasteful with respect to space, and easiest to maintain is simply
doing the same for device numbers, see the patch below.

Two remarks:
 * The reason for keeping the local variable "bcfile" is that
   (0, 0) represents a valid device (console).  There is no
   good reason why it should be in the DISPLAY struct, though.
 * The line "d.s_major = d.s_minor = 3;" is likely redundant because
   with bcfile == 0, nothing is supposed to use these fields.
   While local variables should only be initialized when needed, i
   think defensive programming recommends keeping data that is used
   by (even internal) APIs initialized to reasonable values rather
   than having random memory content lying around.  That reduces
   the risk of future code changes in *other* modules accidentally
   accessing random data, if developers get confused regarding
   which invariants the internal API guarantees or requires.

OK?
  Ingo


   $ cd /dev
   $ ls -l MAKEDEV *vid*  
  -r-xr-xr-x  1 root  wheel  12254 Oct  3 07:07 MAKEDEV
  lrwxr-xr-x  1 root  wheel  6 Aug 29  2016 video -> video0
  crw---  1 root  wheel  44, 0 Oct  3 13:34 video0
  crw---  1 root  wheel  44, 1 Oct  3 13:34 video1
   $ ls -lh MAKEDEV *vid*
  -r-xr-xr-x  1 root  wheel  12.0K Oct  3 07:07 MAKEDEV
  lrwxr-xr-x  1 root  wheel 6B Aug 29  2016 video -> video0
  crw---  1 root  wheel  44, 0 Oct  3 13:34 video0
  crw---  1 root  wheel  44, 1 Oct  3 13:34 video1
   $ ls -l null
  crw-rw-rw-  1 root  wheel  2, 2 Oct  5 00:32 null
   $ ls -l cua00  
  crw-rw  1 root  dialer  8, 128 Oct  3 13:34 cua00


Index: ls.c
===
RCS file: /cvs/src/bin/ls/ls.c,v
retrieving revision 1.54
diff -u -p -r1.54 ls.c
--- ls.c7 Oct 2020 21:03:09 -   1.54
+++ ls.c5 Oct 2023 14:47:37 -
@@ -436,6 +436,7 @@ display(FTSENT *p, FTSENT *list)
unsigned long long btotal;
blkcnt_t maxblock;
ino_t maxinode;
+   unsigned int maxmajor, maxminor;
int bcfile, flen, glen, ulen, maxflags, maxgroup, maxuser, maxlen;
int entries, needstats;
int width;
@@ -449,6 +450,7 @@ display(FTSENT *p, FTSENT *list)
btotal = maxblock = maxinode = maxlen = maxnlink = 0;
bcfile = 0;
maxuser = maxgroup = maxflags = 0;
+   maxmajor = maxminor = 0;
maxsize = 0;
for (cur = list, entries = 0; cur != NULL; cur = cur->fts_link) {
if (cur->fts_info == FTS_ERR || cur->fts_info == FTS_NS) {
@@ -523,9 +525,13 @@ display(FTSENT *p, FTSENT *list)
(void)strlcpy(np->group, group, glen + 1);
 
if (S_ISCHR(sp->st_mode) ||
-   S_ISBLK(sp->st_mode))
+   S_ISBLK(sp->st_mode)) {
bcfile = 1;
-
+   if (maxmajor < major(sp->st_rdev))
+   maxmajor = major(sp->st_rdev);
+   if (maxminor < minor(sp->st_rdev))
+   maxminor = minor(sp->st_rdev);
+   }
if (f_flags) {
np->flags = >data[ulen + 1 + glen + 
1];
(void)strlcpy(np->flags, flags, flen + 
1);
@@ -551,7 +557,6 @@ display(FTSENT *p, FTSENT *list)
d.entries = entries;
d.maxlen = maxlen;
if (needstats) {
-   d.bcfile = bcfile;
d.btotal = btotal;
(void)snprintf(buf, sizeof(buf), "%llu",
(unsigned long long)maxblock);
@@ -570,6 +575,17 @@ display(FTSENT *p, FTSENT *list)
d.s_size = strlen(buf);
} else
d.s_size = FMT_SCALED_STRSIZE-2; /* no - or '\0' */
+   d.s_major = d.s_minor = 3;
+   if (bcfile) {
+   (void)snprintf(buf, sizeof(buf), "%u", maxmajor);
+   d.s_major = strlen(buf);
+   (void)snprintf(buf, sizeof(buf), "%u", maxminor);
+   d.s_minor = strlen(buf);
+   if (d.s_size <= d.s_major + 2 + d.s_minor)
+   

/bin/ls print format bugs

2023-10-03 Thread Crystal Kolipe
Two display format bugs seems to have crept in to ls due to the addition of
scaled human readable values and large minor numbers.

A provisional patch is attached, but an aesthetic decision about column padding
needs to be made, (8 vs 13 columns to handle large minor numbers cleanly, see
further below).

Bug 1:
When using ls -lh, (long format, human-readable sizes), if the list of files
includes a mixture of regular files and device nodes then the sizes of the
regular files are displayed in bytes instead of using multipliers.

# ls -lh
total 672
-rw-r--r--  1 root  wheel 100 Oct  3 16:41 a
-rw-r--r--  1 root  wheel 1000 Oct  3 16:41 b
-rw-r--r--  1 root  wheel 1 Oct  3 16:41 c
-rw-r--r--  1 root  wheel 10 Oct  3 16:41 d
crw-r--r--  1 root  wheel0,   0 Oct  3 16:42 dev1
crw-r--r--  1 root  wheel  255, 255 Oct  3 16:42 dev2
crw-r--r--  1 root  wheel  255, 65535 Oct  3 16:42 dev3
crw-r--r--  1 root  wheel  255, 16777214 Oct  3 16:43 dev4
-rw-r--r--  1 root  wheel 100 Oct  3 16:41 e
-rw-r--r--  1 root  wheel 1000 Oct  3 16:41 f
-rw-r--r--  1 root  wheel 1 Oct  3 16:41 g


Bug 2:
When using ls -l, (long format only), columns are padded with too many
additional spaces when values exceed 8 characters, due to a printf format
string argument turning negative.

Compare the following two outputs:

-rw-r--r--  1 root  wheel   100 Oct  3 16:41 a
-rw-r--r--  1 root  wheel  1000 Oct  3 16:41 b
crw-r--r--  1 root  wheel0,   0 Oct  3 16:42 dev1
crw-r--r--  1 root  wheel  255, 255 Oct  3 16:42 dev2
-rw-r--r--  1 root  wheel 0 Oct  3 17:39 h
-rw-r--r--  1 root  wheel10 Oct  3 17:39 i
-rw-r--r--  1 root  wheel   100 Oct  3 17:39 j
-rw-r--r--  1 root  wheel  1000 Oct  3 17:39 k
-rw-r--r--  1 root  wheel 1 Oct  3 17:39 l
-rw-r--r--  1 root  wheel10 Oct  3 17:39 m

-rw-r--r--  1 root  wheel 100 Oct  3 16:41 a
-rw-r--r--  1 root  wheel1000 Oct  3 16:41 b
-rw-r--r--  1 root  wheel   1 Oct  3 16:41 c
-rw-r--r--  1 root  wheel  10 Oct  3 16:41 d
crw-r--r--  1 root  wheel0,   0 Oct  3 16:42 dev1
crw-r--r--  1 root  wheel  255, 255 Oct  3 16:42 dev2
-rw-r--r--  1 root  wheel 100 Oct  3 16:41 e
-rw-r--r--  1 root  wheel   0 Oct  3 17:39 h
-rw-r--r--  1 root  wheel  10 Oct  3 17:39 i
-rw-r--r--  1 root  wheel 100 Oct  3 17:39 j
-rw-r--r--  1 root  wheel1000 Oct  3 17:39 k
-rw-r--r--  1 root  wheel   1 Oct  3 17:39 l
-rw-r--r--  1 root  wheel  10 Oct  3 17:39 m

Observe that in the first output, file b has the largest size and the
leftmost digit aligns with the leftmost digit of 255, 255.

however, in the second output, file e is the largest and has three space
characters too many inserted in front.

This second bug is caused by the following code in printlong() in print.c:

(void)printf("%*s%*lld ",
8 - dp->s_size, "", dp->s_size,
(long long)sp->st_size);

The intention is clearly to align the last digits of the output when a device
file is included in the output, (dp->bcfile tests true).

Device files are printed with "%3u, %3u ", so would traditionally have been
limited to 8 characters plus the trailing space.  Large minor numbers have
obviously changed this assumption and so this field can now expand to be
larger.

However, even without large minors we have a problem because dp->s_size can
be > 8, and in that case we are passing a _negative_ value to printf for the
field width.  Since printf interprets negative values here as their positive
equivalent, we get double spaces once dp->s_size > 8.

This presents an aesthetic decision, because to ensure a large enough column
width for the largest possible minor numbers we need to use:

printf ("%3u, %8u ")

... despite most people not having minor numbers > 255 in /dev, so the display
could be considered a bit ugly.

Possibly the simplest fix for this second issue would be just to expand the
minor number field:

--- print.c.distMon Oct  2 21:03:01 2023
+++ print.c Tue Oct  3 18:22:54 2023
@@ -110,11 +110,11 @@
if (f_flags)
(void)printf("%-*s ", dp->s_flags, np->flags);
if (S_ISCHR(sp->st_mode) || S_ISBLK(sp->st_mode))
-   (void)printf("%3u, %3u ",
+   (void)printf("%3u, %8u ",
major(sp->st_rdev), minor(sp->st_rdev));
else if (dp->bcfile)
(void)printf("%*s%*lld ",
-   8 - dp->s_size, "", dp->s_size,
+   13 - dp->s_size, "", dp->s_size,
(long long)sp->st_size);
else
printsize(dp->s_size, sp->st_size);

This doesn't fix the issue of double spacing, since 13 -