bug#21916: sort -u drops unique lines with some locales

2015-11-16 Thread Bob Proulx
Pádraig Brady wrote:
> Christoph Anton Mitterer wrote:
> > Attached is a file that, when sort -u'ed in my locale, loses lines
> > which are however unique.
> >
> > I've also attached the locale, since it's a custom made one, but the
> > same seems to happen with "standard" locales as well, see

bug#21916: sort -u drops unique lines with some locales

2015-11-15 Thread Christoph Anton Mitterer
Hey Pádraig

On Sat, 2015-11-14 at 11:06 +, Pádraig Brady wrote:
> Unfortunately the roman numeral code points compare equal:
>
>   $ printf '%s\n' Ⅱ Ⅰ | ltrace -e strcoll sort
>   sort->strcoll("\342\205\241", "\342\205\240") = 0
>   Ⅱ
>   Ⅰ
>
> If you compare at the byte level you'll get
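
For readers skimming the thread: the strcoll result quoted above is why -u discards one of the two lines, since sort treats lines that collate as equal as duplicates. A minimal sketch of the byte-level alternative, assuming it is acceptable to give up locale-aware ordering altogether (the helper name unique_bytewise is made up for this illustration and is not part of coreutils):

```shell
# Hypothetical helper: deduplicate lines by raw byte value.
# LC_ALL=C makes sort compare bytes instead of calling the locale's
# collation, so two distinct code points can no longer compare equal.
unique_bytewise() {
    LC_ALL=C sort -u "$@"
}

# Ⅱ is \342\205\241 and Ⅰ is \342\205\240, written as octal escapes so
# the example does not depend on the terminal encoding.
printf '\342\205\241\n\342\205\240\n\342\205\241\n' | unique_bytewise
# prints Ⅰ then Ⅱ: both distinct lines survive, the repeated Ⅱ is dropped
```

The trade-off is that the output is ordered by byte value rather than by the locale's rules, which may or may not matter for the use case discussed here.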

bug#21916: sort -u drops unique lines with some locales

2015-11-15 Thread Christoph Anton Mitterer
Oh, one further solution:
- document more properly in the manpage and --help what -u really is, and especially that it may not behave as expected with other locales/collations. Perhaps even give an example, so that people understand the seriousness of that.
- add a companion option, maybe -U,

bug#21916: sort -u drops unique lines with some locales

2015-11-14 Thread Pádraig Brady
tag 21916 notabug
close 21916
stop

On 14/11/15 05:38, Christoph Anton Mitterer wrote:
> Hey.
>
> (GNU coreutils 8.23)
>
> Attached is a file that, when sort -u'ed in my locale, loses lines
> which are however unique.
>
> I've also attached the locale, since it's a custom made one, but the