bug#29044: sort --debug results improvement

2017-10-28 Thread Assaf Gordon

tag 29044 notabug
close 29044
thanks

Hello,

There are a few issues at hand. Answering out of order:

> $ sort -k 2n -k 3n --debug file.txt
[...]
> Also the user is confused if
> 
> is a "key 3", or just a separator.
>
> Therefore please say
> ": key 1" or "1" etc. at the end of each of them.
> This is also important if there are many keys.
>
> And add a separator bar, made of -, =, etc. but not _.

This is indeed a 3rd key: it is the default behavior
of the 'last resort' sorting by the entire line.
It is not a separator.

It is used to sort lines for which the specified keys are equal.
It can be disabled with the "-s/--stable" option.

Consider the following:

Case 1: The first key is equal ("A" in both lines).
Sort then uses the last resort sorting and compares the entire
lines, making "A B" appear first:

  $ printf "%s\n" "A C" "A B" | sort --debug -k1,1
  A B
  _
  ___
  A C
  _
  ___


Case 2: Using "-s" disables the last-resort comparison, and lines with
equal keys are printed in the order they appeared in the input (hence "stable"):

  $ printf "%s\n" "A C" "A B" | sort --debug -k1,1 -s
  A C
  _
  A B
  _




On 2017-10-28 11:26 AM, Dan Jacobson wrote:

> $ sort -k 2n -k 3n --debug file.txt
> sort: using simple byte comparison
> sort: key 1 is numeric and spans multiple fields
> sort: key 2 is numeric and spans multiple fields
> 41 011 92.3 亞太
>    ___
>
>
> 41 011 97.1 大漢
>    ___
>
>
> OK but they look like they only span one field.


'sort --debug' indicates the *actual* characters
that were used for the comparison.
In the case of "-n" (numeric sort), the conversion to a numeric value
stops at the space character, and the underline shows exactly that.

This has nothing to do with the fact that the key specification
spans multiple fields for a single numeric key.


Consider the following cases (I'm using "-s" for all cases to
reduce clutter, it doesn't change the meaning):

Case 1: Because we used alphanumeric sorting order (the default),
all the characters until the first space are marked by "--debug":

  $ printf "%s\n" "11A A" "33 C" "4e4D D" | sort -k1,1 --debug -s
  11A A
  ___
  33 C
  __
  4e4D D
  ____


Case 2: with numeric sorting, only the digits are marked:

  $ printf "%s\n" "11A A" "33 C" "4e4D D" | sort -k1n,1 --debug -s
  4e4D D
  _
  11A A
  __
  33 C
  __


Case 3: with "-g" (general numeric sort, which can parse scientific
notation) the "4e4" is parsed, and parsing stops at the "D" character:


  $ printf "%s\n" "11A A" "33 C" "4e4D D" | sort -s -k1g,1 --debug
  11A A
  __
  33 C
  __
  4e4D D
  ___
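The "spans multiple fields" notice itself comes from the key specification: "-k 2n" has no end position, so the key runs from field 2 to the end of the line. A minimal sketch of limiting each key to a single field follows, assuming GNU sort; the sample data is hypothetical and only modeled on the report:

```shell
# Hypothetical sample data modeled on the report; fields are space-separated.
data=$(mktemp)
printf '%s\n' '41 011 97.1 b' '41 011 92.3 a' '40 012 10.0 c' > "$data"

# "-k2,2n" starts and ends the key at field 2, "-k3,3n" does the same for
# field 3, so neither numeric key spans more than one field.
# "-s" turns off the last-resort whole-line comparison.
sorted=$(LC_ALL=C sort -s -k2,2n -k3,3n "$data")
printf '%s\n' "$sorted"

# The same keys with --debug, to see which characters each key used.
LC_ALL=C sort -s -k2,2n -k3,3n --debug "$data" || true

rm -f "$data"
```

Lines with equal second fields (011) are ordered by the third field, so the 92.3 line should come before the 97.1 line.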




> Also the Info documentation doesn't mention how to influence
> "sort: using simple byte comparison"
> which seems to always be printed when using --debug no matter what.


This message indicates you are sorting in the C/POSIX locale.
Perhaps it is the default locale on your system?

"sort --debug" will always print the sorting rules, e.g.:

  $ LC_ALL=en_CA.UTF-8 sort --debug < /dev/null
  sort: using ‘en_CA.UTF-8’ sorting rules

  $ LC_ALL=C sort --debug < /dev/null
  sort: using simple byte comparison
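As a rough illustration of what "simple byte comparison" means in practice, the sketch below prints the locale variables that influence collation and then sorts a few letters in the C locale. It only assumes a POSIX shell and sort; the input letters are arbitrary:

```shell
# The collation locale is taken from LC_ALL if set, else LC_COLLATE,
# else LANG. Print whatever is currently set (possibly nothing).
printf 'LC_ALL=%s LC_COLLATE=%s LANG=%s\n' "${LC_ALL-}" "${LC_COLLATE-}" "${LANG-}"

# In the C locale comparison is byte-wise, so uppercase ASCII letters
# sort before lowercase ones.
c_order=$(printf '%s\n' b A a B | LC_ALL=C sort)
printf '%s\n' "$c_order"
```

Running the same input under a non-C locale will typically interleave upper and lower case, but the exact order depends on the locale installed.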





As such,
I'm marking this item as not-a-bug and closing it, but discussion can 
continue by replying to this thread.


regards,
 - assaf









bug#29044: sort --debug results improvement

2017-10-28 Thread Dan Jacobson
$ sort -k 2n -k 3n --debug file.txt
sort: using simple byte comparison
sort: key 1 is numeric and spans multiple fields
sort: key 2 is numeric and spans multiple fields
41 011 92.3 亞太
   ___
   

41 011 97.1 大漢
   ___
   

OK but they look like they only span one field.

Also the user is confused if

is a "key 3", or just a separator.

Therefore please say
": key 1" or "1" etc. at the end of each of them.
This is also important if there are many keys.

And add a separator bar, made of -, =, etc. but not _.

Also the Info documentation doesn't mention how to influence
"sort: using simple byte comparison"
which seems to always be printed when using --debug no matter what.





bug#29038: df hangs on fifos/named pipes

2017-10-28 Thread Stephane Chazelas
test case:

   mkfifo p
   df p

That hangs, unless you make "p" non-readable or some other process
has the fifo open in write mode.

The reason is that df tries to open the fifo in read-only mode,
according to comments in the source code so as to trigger a
potential automount.
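The blocking behavior can be reproduced without hanging the shell. This is a minimal sketch, assuming GNU timeout(1) (which exits with status 124 when it has to kill the command); the scratch directory and variable names are only for illustration:

```shell
# Work in a scratch directory so the FIFO is easy to clean up.
dir=$(mktemp -d)
mkfifo "$dir/p"

# Opening a FIFO read-only blocks until some process opens it for writing.
# There is no writer here, so timeout has to kill the shell doing the open.
open_status=0
timeout 1 sh -c ': < "$1"' sh "$dir/p" || open_status=$?
echo "read-only open: exit status $open_status"

# The same guard keeps an affected df from hanging. A df that does not
# block on the FIFO returns its normal status instead of 124.
df_status=0
timeout 2 df "$dir/p" || df_status=$?
echo "df: exit status $df_status"

rm -rf "$dir"
```

The first status shows the underlying open(2) behavior; the second depends on the df version installed.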

That goes back to this commit:

> commit dbd17157d7e693b8de9737f802db0e235ff5a3e6
> Author: Tomas Smetana 
> Date:   Tue Apr 28 11:21:49 2009 +0200
> 
> df: use open(2), not stat, to trigger automounting
> 
> * src/df.c (main): When iterating over command-line arguments,
> attempting to ensure each backing file system is mounted, use
> open, not stat.  stat is no longer sufficient to trigger
> automounting, in some cases.  Based on a suggestion from Ian Kent.
> More details in http://bugzilla.redhat.com/497830

More info at the bugzilla link.

It's arguable whether df, a reporting tool, should have such a
side effect as automounting a file system.

The fifo issue though is a bug IMO, especially considering that
POSIX explicitly says that df should work on fifos.

Here, it may be enough to add the O_DIRECTORY flag to open()
where available if we only care about automounting files of type
directory (or portably use opendir()).

Or use O_PATH on Linux 3.6+ followed by openat() on non-fifos if
open(O_PATH) is not enough to trigger the automount in the
unlikely event we care about automounting non-directory files
(and report their disk usage).

Or not open() at all, and not automount file systems.

Note that busybox, heirloom or ast-open df implementations on
Linux don't have the problem (and presumably don't automount
file systems). Nor does FreeBSD.

Reproduced with:

$ df --version
df (GNU coreutils) 8.25

and:

$ df --version
df (GNU coreutils) 8.27.46-e13fe

That was discovered by Martijn Dekker, CCed, when looking for a
portable way to identify the file system of an arbitrary file.

-- 
Stephane





bug#29012: od: busy skip on block devices

2017-10-28 Thread Christian Kögler


On 27 October 2017 07:25:25 CEST, "Pádraig Brady" wrote:
>On 26/10/17 08:13, Christian Kögler wrote:
>> If od is used on block devices together with skip, od reads the
>> skipped bytes instead of seeking past them.
>
>Yes it has done that from the initial version.
>Note od concatenates multiple files, and skips across
>all of them, so consequently restricts itself to
>seeking where it knows the file length (regular files).
>
>I suppose we could try to seek if a single argument was specified.
>
>As an alternative you could skip with dd and pipe to od?
I lose the absolute address offset, but in our case that is a workable way to go.
Thanks for helping so quickly!
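For reference, the difference in offsets between the two approaches can be sketched as follows. A small regular file stands in for the block device here (an assumption for the sake of a harmless example), and the byte counts are arbitrary:

```shell
# Hypothetical stand-in for the device: a small regular file.
f=$(mktemp)
printf 'ABCDEFGHIJ' > "$f"

# od -j skips 4 bytes itself, so the first offset it prints is 4.
od -A d -c -j 4 "$f"
with_j=$(od -A d -c -j 4 "$f" | awk 'NR == 1 { print $1 }')

# dd does the skipping (skip= counts blocks of bs bytes), so od sees a
# fresh stream and starts counting from 0 again.
dd if="$f" bs=1 skip=4 2>/dev/null | od -A d -c
with_dd=$(dd if="$f" bs=1 skip=4 2>/dev/null | od -A d -c | awk 'NR == 1 { print $1 }')

rm -f "$f"
```

Both commands dump the same bytes; only the address column differs. On a real device a larger bs makes dd much faster, with skip= then counting in those larger blocks.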

Cheers
Christian