bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-21 Thread Pádraig Brady

On 20/06/2021 22:38, Paul Eggert wrote:

Thanks for writing that patch. One minor note:

On 6/20/21 7:21 AM, Pádraig Brady wrote:


+(to_uchar (mod_char) << CHAR_BIT)
+ + to_uchar (fmt_char),


Neither mod_char nor fmt_char can be negative (this is guaranteed by the
C standard since all the relevant constants are in the basic character
set) so the to_uchar calls are unnecessary.

Also, this code assumes that 2 * CHAR_BIT <= MIN (INT_WIDTH,
UINT_WIDTH), something that POSIX requires but the C standard does not;
it'd be a bit safer (if pedantic) to add 'verify (2 * CHAR_BIT <= MIN
(INT_WIDTH, UINT_WIDTH));'.

I'd also change print_stat's arg from unsigned int to int; if you did
that, you could change the above 'MIN (INT_WIDTH, UINT_WIDTH)' to plain
'INT_WIDTH'. (These days we're negative on unsigned types anyway, for
all the usual reasons)

Better yet, pass two char args to print_stat instead of a single int
portmanteau.


Yes two separate char arguments is cleaner.
I thought there was some reason for the existing unsigned int usage,
but looking I see it was added for commit db42ae78 (to support :X formats),
but then became unneeded with commit c7375c23 (to instead support %.9X formats).

I've pushed both commits now, so marking this as done,

thanks for the review!
Pádraig





bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-20 Thread Paul Eggert

Thanks for writing that patch. One minor note:

On 6/20/21 7:21 AM, Pádraig Brady wrote:


+(to_uchar (mod_char) << CHAR_BIT)
+ + to_uchar (fmt_char),


Neither mod_char nor fmt_char can be negative (this is guaranteed by the 
C standard since all the relevant constants are in the basic character 
set) so the to_uchar calls are unnecessary.


Also, this code assumes that 2 * CHAR_BIT <= MIN (INT_WIDTH, 
UINT_WIDTH), something that POSIX requires but the C standard does not; 
it'd be a bit safer (if pedantic) to add 'verify (2 * CHAR_BIT <= MIN 
(INT_WIDTH, UINT_WIDTH));'.


I'd also change print_stat's arg from unsigned int to int; if you did 
that, you could change the above 'MIN (INT_WIDTH, UINT_WIDTH)' to plain 
'INT_WIDTH'. (These days we're negative on unsigned types anyway, for 
all the usual reasons)


Better yet, pass two char args to print_stat instead of a single int 
portmanteau.






bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-20 Thread Pádraig Brady

On 12/06/2021 16:19, Pádraig Brady wrote:

We could take the FreeBSD approach which is:

%Hd  major device number in decimal (st_dev)
%Ld  minor device number in decimal (st_dev)
%Hr  major device type in decimal (st_rdev)
%Lr  minor device type in decimal (st_rdev)

Note I'd be inclined to not have a space between major,minor in default stat 
output,
as there should be no ambiguity with locale formatted numbers, which could be
the case with ls. For example, my ls alias uses thousands grouping with:
alias ls="BLOCK_SIZE=\'1 ls --color=auto".
I.e. the default format would from stat would be
Device: %Hd,%Ld  . Device type: %Hr,%Lr

For consistency if we provided the above we should also probably provide:

   %r (composed) device type in decimal (st_rdev)
   %R (composed) device type in hex (st_rdev)


The attached adds the new formats.
I'll follow up with another to adjust the default output accordingly.

cheers,
Pádraig
>From 013ba07f19a209685331cac3f8202591b5b90dad Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 20 Jun 2021 15:16:49 +0100
Subject: [PATCH] stat: support more device number representations

In preparation for changing the default device number
representation (to decomposed decimal), provide more
formatting options for device numbers.

These new (FreeBSD compat) formatting options are added:

   %Hd  major device number in decimal (st_dev)
   %Ld  minor device number in decimal (st_dev)
   %Hr  major device type in decimal (st_rdev)
   %Lr  minor device type in decimal (st_rdev)
   %r   (composed) device type in decimal (st_rdev)
   %R   (composed) device type in hex (st_rdev)

* doc/coreutils.texi (stat invocation): Document new formats.
* src/stat.c (print_it): Handle the new %H and %L modifiers.
(print_stat): Handle any modifiers and the new 'r' format.
(usage): Document the new formats.
Addresses https://bugs.gnu.org/48960
---
 doc/coreutils.texi | 14 
 src/stat.c | 57 --
 2 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 3e3aedb0f..2b69e1a2a 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -12506,8 +12506,10 @@ The valid @var{format} directives for files with @option{--format} and
 @item %b - Number of blocks allocated (see @samp{%B})
 @item %B - The size in bytes of each block reported by @samp{%b}
 @item %C - The SELinux security context of a file, if available
-@item %d - Device number in decimal
-@item %D - Device number in hex
+@item %d - Device number in decimal (st_dev)
+@item %D - Device number in hex (st_dev)
+@item %Hd - Major device number in decimal
+@item %Ld - Minor device number in decimal
 @item %f - Raw mode in hex
 @item %F - File type
 @item %g - Group ID of owner
@@ -12519,6 +12521,10 @@ The valid @var{format} directives for files with @option{--format} and
 @item %N - Quoted file name with dereference if symbolic link (see below)
 @item %o - Optimal I/O transfer size hint
 @item %s - Total size, in bytes
+@item %r - Device type in decimal (st_rdev)
+@item %R - Device type in hex (st_rdev)
+@item %Hr - Major device type in decimal (see below)
+@item %Lr - Minor device type in decimal (see below)
 @item %t - Major device type in hex (see below)
 @item %T - Minor device type in hex (see below)
 @item %u - User ID of owner
@@ -12543,8 +12549,8 @@ The @samp{%N} format can be set with the environment variable
 the default value is @samp{shell-escape-always}.  Valid quoting styles are:
 @quotingStyles
 
-The @samp{%t} and @samp{%T} formats operate on the st_rdev member of
-the stat(2) structure, and are only defined for character and block
+The @samp{r}, @samp{R}, @samp{%t}, and @samp{%T} formats operate on the st_rdev
+member of the stat(2) structure, and are only defined for character and block
 special files.  On some systems or file types, st_rdev may be used to
 represent other quantities.
 
diff --git a/src/stat.c b/src/stat.c
index 56e4867b9..8c9c69e7c 100644
--- a/src/stat.c
+++ b/src/stat.c
@@ -1144,11 +1144,12 @@ print_it (char const *format, int fd, char const *filename,
 case '%':
   {
 size_t len = format_code_offset (b);
-char const *fmt_char = b + len;
+char fmt_char = *(b + len);
+char mod_char = 0;
 memcpy (dest, b, len);
 b += len;
 
-switch (*fmt_char)
+switch (fmt_char)
   {
   case '\0':
 --b;
@@ -1156,15 +1157,32 @@ print_it (char const *format, int fd, char const *filename,
   case '%':
 if (1 < len)
   {
-dest[len] = *fmt_char;
+dest[len] = fmt_char;
 dest[len + 1] = '\0';
 die (EXIT_FAILURE, 0, _("%s: invalid directive"),
  quote (dest));
   }
   

bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-12 Thread Pádraig Brady

On 11/06/2021 18:11, Paul Eggert wrote:

The stat output is confusing in other ways. For example:

507-day $ ls -ld .; stat . | grep Device
drwxr-xr-x 4 eggert eggert 12288 May  5 14:48 .
Device: 10300h/66304d   Inode: 70388429Links: 4
508-day $ ls -ld /dev/ptp0; stat /dev/ptp0 | grep Device
crw--- 1 root root 246, 0 Jun  3 13:09 /dev/ptp0
Device: 5h/5d   Inode: 321 Links: 1 Device type: f6,0

As you write, that "66304d" is useless on my platform, and the "5h/5d"
uses a completely different notation from the "f6,0".

I suggest that we change the behavior of both "Device:" and "Device
type:" to be consistent with that of 'ls', so that the output becomes:

507-day $ ls -ld .; stat . | grep Device
drwxr-xr-x 4 eggert eggert 12288 May  5 14:48 .
Device: 259, 0  Inode: 70388429Links: 4
508-day $ ls -ld /dev/ptp0; stat /dev/ptp0 | grep Device
crw--- 1 root root 246, 0 Jun  3 13:09 /dev/ptp0
Device: 0, 5Inode: 321 Links: 1 Device type: 246, 0


Yes I agree with this.
I.e. it's most useful to present decomposed decimal IDs.

What we have now in format specifiers is also inconsistent/incomplete:

 %d (composed) device number in decimal (st_dev)
 %D (composed) device number in hex (st_dev)
 %t major device type in hex, for device special files (st_rdev)
 %T minor device type in hex, for device special files (st_rdev)

I.e. no decomposed st_dev, or no decimal st_rdev.
I think it's more important to provide decomposed numbers,
rather than hex representations (especially decomposed hex).
It's probably best not to change any of the above format specifiers though
for compat reasons. %d while not that useful in isolation, may be used
to distinguish files or devices, which might be persisted in manifests etc.

We could take the FreeBSD approach which is:

  %Hd  major device number in decimal (st_dev)
  %Ld  minor device number in decimal (st_dev)
  %Hr  major device type in decimal (st_rdev)
  %Lr  minor device type in decimal (st_rdev)

Note I'd be inclined to not have a space between major,minor in default stat 
output,
as there should be no ambiguity with locale formatted numbers, which could be
the case with ls. For example, my ls alias uses thousands grouping with:
alias ls="BLOCK_SIZE=\'1 ls --color=auto".
I.e. the default format would from stat would be
Device: %Hd,%Ld  . Device type: %Hr,%Lr

For consistency if we provided the above we should also probably provide:

 %r (composed) device type in decimal (st_rdev)
 %R (composed) device type in hex (st_rdev)

cheers,
Pádraig





bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-11 Thread L A Walsh

On 2021/06/11 00:37, Wolfgang Rohm wrote:

Hello.

Stat prints the device number, major and minor, in hex and decimal. 
They are both 8bit numbers clamped together. While the hex number is 
perfectly fine, the decimal doesn't respect how this number has come 
to existence. There is no meaning in the decimal value, if the the hex 
value is taken as one 16bit number and then converted.


For example "fd00h" is converted to "64768d". There is no major device 
with 647 and no minor device with 768. Its just completely wrong.

---
   Aside from complete bogus output on my system for either
hex or decimal and the "seemingly" undocumented appearance of 'd'
to convert a 16-bit decimal value back to 2 8-bit values
divide the number by 256 or shift the value right by 8 bits
for the high number and either mod the number with 256 or
'and' (&) the value with 0xff:

decimal_devno=64768   #(dropping the 'd' on the end)
high=$(($decimal_devno/256))
low=$(($decimal_devno%256))
printf "%s:%s\n" "$high" "$low"

 (output:)
253:0

 or via shifts+masks

decimal_devno=64768
high=$((decimal_devno>>8))
low=$((decimal_devno & 0xff))
printf "%s:%s\n" "$high" "$low"

 (output:)
253:0










bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-11 Thread Paul Eggert

The stat output is confusing in other ways. For example:

507-day $ ls -ld .; stat . | grep Device
drwxr-xr-x 4 eggert eggert 12288 May  5 14:48 .
Device: 10300h/66304d   Inode: 70388429Links: 4
508-day $ ls -ld /dev/ptp0; stat /dev/ptp0 | grep Device
crw--- 1 root root 246, 0 Jun  3 13:09 /dev/ptp0
Device: 5h/5d   Inode: 321 Links: 1 Device type: f6,0

As you write, that "66304d" is useless on my platform, and the "5h/5d" 
uses a completely different notation from the "f6,0".


I suggest that we change the behavior of both "Device:" and "Device 
type:" to be consistent with that of 'ls', so that the output becomes:


507-day $ ls -ld .; stat . | grep Device
drwxr-xr-x 4 eggert eggert 12288 May  5 14:48 .
Device: 259, 0  Inode: 70388429Links: 4
508-day $ ls -ld /dev/ptp0; stat /dev/ptp0 | grep Device
crw--- 1 root root 246, 0 Jun  3 13:09 /dev/ptp0
Device: 0, 5Inode: 321 Links: 1 Device type: 246, 0





bug#48960: stat v8.30 - device number in decimal shown as 16bit number instead of to converted 8bit

2021-06-11 Thread Wolfgang Rohm

Hello.

Stat prints the device number, major and minor, in hex and decimal. They 
are both 8bit numbers clamped together. While the hex number is 
perfectly fine, the decimal doesn't respect how this number has come to 
existence. There is no meaning in the decimal value, if the the hex 
value is taken as one 16bit number and then converted.


For example "fd00h" is converted to "64768d". There is no major device 
with 647 and no minor device with 768. Its just completely wrong.


So instead of converting fd00h to 64768d, both 8bit hex numbers must be 
converted each, meaning alone. The right number would be therefore 
253000, major 253 and minor 0. While the hex number needs two digits 
each (the first 0 can be ommited), the decimal needs three digits each 
(the first and second 0 can be omitted).


|stat (GNU coreutils) 8.30
Copyright © 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
.

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Geschrieben von Michael Meskes.
|

--
Mit freundlichen Grüßen
Wolfgang Rohm