On Fri, Apr 17, 2015 at 11:26 AM, Eric Blake <[email protected]> wrote:
> On 04/17/2015 10:10 AM, Peng Yu wrote:
>> Hi, I got the following results when I call sort with -t /. It seems
>> that 'a/1.txt' should be right after 'a'. Is it the case? Or I am not
>> using sort correctly?
>
> Your assumption is correct - you are using sort incorrectly, by failing
> to take locales into account, and by failing to limit the amount of data
> being compared to single field widths.

Thanks for the explanation. If I don't know the number of fields, but
I want to sort according to all fields (from 1 to whatever the max
number of fields), is there a way to do it?

>> $ printf '%s\n' a 'a!' ab aB a/1.txt | sort -t / -k 1 -k 2 -k 3 -k 4
>> a
>> a!
>> a/1.txt
>> aB
>> ab
>
> sort --debug is your friend:
>
> $ printf '%s\n' a 'a!' ab aB a/1.txt | sort --debug -t / -k 1 -k 2 -k 3 -k 4
> sort: using ‘en_US.UTF-8’ sorting rules
> a
> _
> ^ no match for key
> ^ no match for key
> ^ no match for key
> _
> a!
> __
> ^ no match for key
> ^ no match for key
> ^ no match for key
> __
> a/1.txt
> _______
> _____
> ^ no match for key
> ^ no match for key
> _______
> ab
> __
> ^ no match for key
> ^ no match for key
> ^ no match for key
> __
> aB
> __
> ^ no match for key
> ^ no match for key
> ^ no match for key
> __
>
>
> As shown in the debug trace, the line 'a!' sorts prior to the line
> 'a!1.txt' because your first sort key is the entire line, and in the
> locale you are using (where both '!' and '/', and also '.', are ignored
> in collation orders), the collation string "a" really does come before
> "a1txt".
>
> What you REALLY want is to limit your sorting to a single field at a
> time (-k1,1 rather than -k), as in:
>
> $ printf '%s\n' a 'a!' ab aB a/1.txt | sort --debug -t / -k 1,1 -k 2,2
> sort: using ‘en_US.UTF-8’ sorting rules
> a
> _
> ^ no match for key
> _
> a/1.txt
> _
> _____
> _______
> a!
> __
> ^ no match for key
> __
> ab
> __
> ^ no match for key
> __
> aB
> __
> ^ no match for key
> __
>
>
> Or additionally, to limit your sorting to a locale that does not discard
> punctuation as unimportant, as in:
>
> $ printf '%s\n' a 'a!' ab aB a/1.txt | LC_ALL=C sort --debug -t / -k 1,1
> -k 2
> sort: using simple byte comparison
> a
> _
> ^ no match for key
> _
> a/1.txt
> _
> _____
> _______
> a!
> __
> ^ no match for key
> __
> aB
> __
> ^ no match for key
> __
> ab
> __
> ^ no match for key
> __
>
>
> --
> Eric Blake eblake redhat com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

--
Regards,
Peng
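
One possible way to cover an unknown number of fields, sketched here as an assumption rather than as anything sort provides directly: count the largest number of fields in the input with awk, then generate one `-k N,N` option per field and sort in the C locale, as suggested in the quoted reply. The function name `sort_all_fields` is hypothetical, the separator is assumed to be a single ordinary character such as `/`, and `mktemp` is assumed to be available (it is not required by POSIX).

```shell
# Hypothetical helper (not part of coreutils): sort stdin on every
# separator-delimited field in turn, using one "-k N,N" key per field.
sort_all_fields() {
    sep=$1
    # Buffer stdin, since the input has to be read twice.
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    # Largest field count found on any input line (0 for empty input).
    max=$(awk -F"$sep" 'NF > max { max = NF } END { print max + 0 }' "$tmp")
    keys=
    i=1
    while [ "$i" -le "$max" ]; do
        keys="$keys -k $i,$i"
        i=$((i + 1))
    done
    # LC_ALL=C gives plain byte comparison, so punctuation stays significant.
    # $keys is deliberately unquoted so that it splits into separate options.
    LC_ALL=C sort -t "$sep" $keys "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

With the sample data from this thread, `printf '%s\n' a 'a!' ab aB a/1.txt | sort_all_fields /` should give the same order as the LC_ALL=C run quoted above: a, a/1.txt, a!, aB, ab.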
