Re: filename pattern case-insensitive, but why?
On Tue, Sep 22, 2009 at 02:36:30AM -0700, thahn01 wrote:
> Hello,
>
> If I try something like:
>
>   $ touch a.c b.c A.c
>   $ ls [a-z]*.c
>   a.c A.c b.c
>
> then I get A.c in the output, even though the pattern contains no
> capital letters.

The [a-z] range expression matches characters between a and z in the
current locale's collation order. The collation order for en_US.UTF-8
and other locales interleaves uppercase and lowercase alphabetic
characters. So in those locales your range includes 'a' through 'z'
and 'A' through 'Y'. You can change the locale to C or POSIX to get
plain ASCII collation order.

You can see the collation order using the sort command:

  for c in {32..126}; do printf '%b - %d\n' "\\0$(printf '%o' "$c")" "$c"; done | sort -k 1.1,1.1
  for c in {32..126}; do printf '%b - %d\n' "\\0$(printf '%o' "$c")" "$c"; done | LANG=C sort -k 1.1,1.1

The collation order lists 'a' before 'A', but case only decides the
order when the strings are otherwise equal; a later character takes
precedence over the case difference. Sort will therefore order
'a1', 'A1', 'a2', 'A2' in exactly that sequence, with the '1' vs. '2'
characters deciding first and case acting only as the tiebreaker.

-- 
Mike Stroyan m...@stroyan.net
Re: filename pattern case-insensitive, but why?
Mike Stroyan wrote:
> On Tue, Sep 22, 2009 at 02:36:30AM -0700, thahn01 wrote:
>> $ touch a.c b.c A.c
>> $ ls [a-z]*.c
>> a.c A.c b.c
>>
>> then I get A.c in the output, even though the pattern contains no
>> capital letters.
>
> The [a-z] range expression matches characters between a and z in the
> current locale's collation order. The collation order for en_US.UTF-8
> and other locales interleaves uppercase and lowercase alphabetic
> characters. So in those locales your range includes 'a' through 'z'
> and 'A' through 'Y'. You can change the locale to C or POSIX to get
> plain ASCII collation order.
> [...]

...and that is why, instead of using

  $ ls [a-z]*.c

you should use

  $ ls [[:lower:]]*.c
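
For anyone who wants to check the two workarounds from this thread side by side, here is a minimal sketch. It assumes bash and a throwaway directory from mktemp; the variable names (dir, c_range, lower_class) are made up for illustration. It only exercises the two forms whose result should not depend on your locale: the range under the C locale, and the [[:lower:]] character class.

```shell
#!/usr/bin/env bash
# Sketch only: "dir", "c_range" and "lower_class" are illustrative names.
dir=$(mktemp -d)
cd "$dir"
touch a.c b.c A.c

# Workaround 1: force the C locale, so [a-z] follows plain ASCII order.
# A child bash is started so the locale is in effect when the glob expands.
c_range=$(LC_ALL=C bash -c 'echo [a-z]*.c')

# Workaround 2: use the character class, which matches lowercase letters
# rather than a slice of the collation sequence.
lower_class=$(echo [[:lower:]]*.c)

echo "LC_ALL=C [a-z]*.c   -> $c_range"
echo "[[:lower:]]*.c      -> $lower_class"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Both lines should list only a.c and b.c. What a bare [a-z]*.c gives in your own session is deliberately not shown, since that depends on the locale settings and on the bash version in use.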