Hello!

Given a text file "sort.bug.txt" with find output like this:

07. Feb 2015 15:57 ./mess.jpg
05. Mär 2015 13:30 ./mess.jpg

Basically two columns: a date and a filename.
I want sort to discard the duplicate lines for the same file, using -u to keep
only the first one and -k to skip over the date column:
> sort sort.bug.txt -u -s -k 1.20 --debug
sort: es werden die Sortierregeln für »de_DE.UTF-8“ verwendet
sort: führende Leerzeichen sind signifikant in Schlüssel 1: Sie sollten daher
wahrscheinlich auch „b“ angeben
05. Mär 2015 13:30 ./mess.jpg
___________
07. Feb 2015 15:57 ./mess.jpg
__________
As the underlines in debug mode show, the key's start position depends on
whether the month name is pure ASCII or contains the German umlaut ä.
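
The differing underline lengths above (11 versus 10) would be consistent with the offset 20 being counted in bytes rather than characters, since "ä" takes two bytes in UTF-8. That is only my assumption about the cause, not something confirmed. A small sketch that uses cut -b (which always counts bytes) to show what a byte offset of 20 selects on each line; the variable names are made up for illustration:

```shell
# Sample lines from this report; \303\244 is the UTF-8 encoding of "ä" (two bytes).
feb_line=$(printf '07. Feb 2015 15:57 ./mess.jpg')
mar_line=$(printf '05. M\303\244r 2015 13:30 ./mess.jpg')

# cut -b counts bytes regardless of locale, so this shows what a
# byte-based start position of 20 would select on each line.
feb_key=$(printf '%s\n' "$feb_line" | cut -b 20-)
mar_key=$(printf '%s\n' "$mar_line" | cut -b 20-)

# Brackets make a leading blank visible.
printf '[%s]\n' "$feb_key" "$mar_key"
# prints:
# [./mess.jpg]
# [ ./mess.jpg]
```

On the line with the umlaut the selection starts one byte early and picks up the separating blank, which would also explain the hint about significant leading blanks.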
sort also prints a hint to add option -b. That seemed promising, since the
one-character offset could be absorbed by the whitespace separating the columns:
> sort sort.bug.txt -u -s -k 1.20 -b --debug
sort: es werden die Sortierregeln für »de_DE.UTF-8“ verwendet
05. Mär 2015 13:30 ./mess.jpg
__________
07. Feb 2015 15:57 ./mess.jpg
__________
This does correct the underlines, but -u still outputs both lines, although
I want it to discard the second one. You can add more lines for the same file,
but sort insists on keeping exactly two: one with an umlaut and one without.
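
A possible way to sidestep character offsets altogether is to address the filename as field 5 instead of position 1.20. This is only a sketch: I have not verified that it avoids the problem in the de_DE.UTF-8 setup above, the scratch file is hypothetical, and forcing LC_ALL=C is an assumption made to take locale-specific handling out of the picture.

```shell
# Hypothetical scratch file holding the two sample lines from this report.
# \303\244 is the UTF-8 encoding of "ä".
tmp=$(mktemp)
printf '07. Feb 2015 15:57 ./mess.jpg\n05. M\303\244r 2015 13:30 ./mess.jpg\n' > "$tmp"

# -k 5b: the key starts at field 5 (the filename) and ignores leading blanks.
# -u:    keep only one line per equal key.
# LC_ALL=C is an assumption here, not part of the original commands.
result=$(LC_ALL=C sort -u -s -k 5b "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

With the two sample lines this should leave a single line for ./mess.jpg. Filenames that contain blanks are still compared as a whole, because the key runs from field 5 to the end of the line.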
This is: sort (GNU coreutils) 8.23
Thanks for the great utilities.
Holger
--
|_|/ MfG
| |\ Holger Klene
PGP Key ID: 0x22FFE57E
