On a somewhat off-topic note,

Francesco Bettella wrote, On 02/02/2011 07:42 AM:
>
> I'm issuing the following sort commands (see attached files):
> [prompt1] > sort -k 1.4,1n asd1 > asd1.sorted
> [prompt2] > sort -k 2.4,2n asd2 > asd2.sorted
>
> the first one works as I would expect, the second one doesn't.

When sorting chromosome names, the version sort option (-V, introduced in coreutils 7.0) sorts as you would expect, saving you the need to skip three characters in the sort key, and also accommodating mixing letters and numbers.

Example:

$ cat chrom.txt
chr1
chrUn_gl000232
chrY
chr2
chr13
chrM
chrUn_gl000218
chr6_hap
chr2R
chr16
chr10
chr6_dbb_hap3
chr4
chr3L
chr4_ctg9_hap1
chr3R
chr3
chrX

$ sort -k1,1V chrom.txt
chr1
chr2
chr2R
chr3
chr3L
chr3R
chr4
chr4_ctg9_hap1
chr6_dbb_hap3
chr6_hap
chr10
chr13
chr16
chrM
chrUn_gl000218
chrUn_gl000232
chrX
chrY

-gordon
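
For comparison, here is a minimal sketch that sets the skip-three-characters numeric key against the version sort on one small input. The file name sample.txt and its contents are made up for illustration (they are not the attached asd1/asd2 files), and LC_ALL=C is set only to keep the tie-breaking predictable.

```shell
# Hypothetical sample data, one chromosome name per line.
printf '%s\n' chr10 chr2 chrX chr1 chr2R > sample.txt

# Numeric key starting at the 4th character of field 1.
# "X" has no leading number, so it compares as 0 and chrX sorts first;
# "2" and "2R" both read as 2, and that tie falls back to comparing whole lines.
LC_ALL=C sort -k1.4,1n sample.txt

# Version sort on the whole first field: digit runs are compared as numbers
# and the rest as text, so no character offset is needed.
LC_ALL=C sort -k1,1V sample.txt
```

With GNU sort, the first command should print chrX, chr1, chr2, chr2R, chr10, and the second chr1, chr2, chr2R, chr10, chrX.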
