Hello,
(apologies in advance for a possibly repeated bug report -- it is hard
to check for duplicates on collating sequence bugs.)

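The following bash script illustrates the issue: sort places "b1..123"
before "b...123", although the same lines with no trailing digits, or
ending in 111, keep their original order.
Thanks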

#!/bin/bash

cat << EOF > a
a...
b...
b1..
ZZ..
EOF

cat << EOF > b          # like 'a', but with some digits at the end
a...111
b...111
b1..111
ZZ..111
EOF

cat << EOF > c          # like 'b', but with the digits 123 instead of 111
a...123
b...123
b1..123
ZZ..123
EOF

echo "Is 'a' sorted?"
sort a | diff - a       # no differences
echo "Is 'b' sorted?"
sort b | diff - b       # no differences
echo "Is 'c' sorted?"
sort c | diff - c       # differences! why?

echo
echo 'cat c:            (note: b... b1..)'
cat c
# produces:
# a...123
# b...123
# b1..123
# ZZ..123

echo
echo 'sort c:           (note: b1.. b...)'
sort c
# produces:
# a...123
# b1..123
# b...123
# ZZ..123

# The issue can be worked around by exporting LC_ALL=C
# (which, of course, changes the ordering entirely;
#  e.g. uppercase ZZ will come before the lowercase words).
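
For reference, here is a minimal sketch of the LC_ALL=C workaround applied to a single sort invocation instead of being exported for the whole session. The temporary file and the variable names (tmp, sorted) are only illustrative and are not part of the script above; the data is the content of file 'c'.

```shell
#!/bin/bash
# Illustrative only: recreate the contents of file 'c' in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'a...123' 'b...123' 'b1..123' 'ZZ..123' > "$tmp"

# Set LC_ALL=C for this one command only; the rest of the
# session keeps its normal locale settings.
sorted=$(LC_ALL=C sort "$tmp")
echo "$sorted"
# produces (plain byte order: 'Z' < 'a', and '.' < '1'):
# ZZ..123
# a...123
# b...123
# b1..123

rm -f "$tmp"
```

With byte ordering, "b...123" stays ahead of "b1..123", at the cost of the uppercase line moving to the front.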
