2010/6/8 Pádraig Brady <[email protected]>:
> On 07/06/10 06:19, Alex Shinn wrote:
>>
>> Ideally join should be able to handle files sorted in any order
>> that sort provides, but as a bare minimum it should at least
>> be able to join files sorted on numeric fields.
>
> Well if there were no aliases in the numbers, you could always
> sort the output numerically after the join if it was important.

By first sorting lexicographically, you mean? In the use case I
had, the data was already sorted numerically. So whenever I want
to join two files, currently I have to do:

  sort file1 > file1.tmp
  sort file2 > file2.tmp
  join file1.tmp file2.tmp | sort -n > out
  rm -f file1.tmp file2.tmp

instead of just

  join -n file1 file2 > out

In the small tools philosophy you want to avoid adding redundancy,
but in this case join isn't doing the same thing as sort, it's
just working with it better. Not to mention the fact that sort
is an expensive operation to have to perform multiple times, not
just an extra O(n) filter to throw in the middle of a pipeline.

> However if you wanted to join "01" and "1" then your patch is required.
> Are numeric aliases common enough to warrant this?

I think so. Leading zeros may not be so common, but don't forget
"1.0" and "1" or "1e2" and "100" and "100.0", etc.

> I'd use -g, --general-numeric to correspond with `sort`.

Yes, that's probably better.

-- 
Alex
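
To make the numeric-alias point concrete, here is a rough sketch of
what a numeric join amounts to: a single merge pass over two inputs
that are already sorted numerically, comparing keys as numbers rather
than as strings. This is not the patch under discussion. The names
numeric_join and keyed_groups are made up for illustration, Python's
float() only approximates how sort -g parses numbers, and printing
the left file's spelling of the key is an arbitrary choice.

```python
from itertools import groupby


def keyed_groups(lines):
    """Yield (numeric_key, rows) for runs of lines whose first field
    parses to the same number. Blank lines are skipped."""
    rows = (line.split() for line in lines)
    rows = (row for row in rows if row)
    for key, group in groupby(rows, key=lambda row: float(row[0])):
        yield key, list(group)


def numeric_join(left, right):
    """Merge-join two numerically sorted line sequences on field 1.

    Keys are compared as floats, so "01", "1" and "1.0" all match.
    The output keeps the left input's spelling of the key
    (an assumption made for this sketch).
    """
    lgroups = keyed_groups(left)
    rgroups = keyed_groups(right)
    l = next(lgroups, None)
    r = next(rgroups, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(lgroups, None)
        elif l[0] > r[0]:
            r = next(rgroups, None)
        else:
            # Equal keys: emit every pairing, as join does for duplicates.
            for lrow in l[1]:
                for rrow in r[1]:
                    yield " ".join(lrow + rrow[1:])
            l = next(lgroups, None)
            r = next(rgroups, None)


if __name__ == "__main__":
    file1 = ["01 apple", "2 banana", "100 cherry"]
    file2 = ["1.0 red", "3 green", "1e2 dark"]
    for line in numeric_join(file1, file2):
        print(line)
```

Each input is read once and nothing is re-sorted, which is the O(n)
behaviour argued for above. A plain string comparison would pair none
of these keys.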
