Hi,

recently I needed to merge some really big text files (several million lines altogether; the result file was >1GB), joining on the first column (a base64-encoded MD5 hash key) and printing both the unpaired lines and the concatenated joined lines. The command looked like:

    join -a 1 -a 2 -t \t file1 file2 > file3

Since I had about 20 or more files to merge, I reduced their number by successive pairwise joins until only one result file was left (it would be nice to have something like multi-way merging available here...).

However, joining only worked correctly with textutils 2.0a; later versions (e.g. 2.0.13, 2.0.16) left some duplicate keys in the output. Running 'diff' or 'wc -l' on the sorted and then de-duplicated keys showed a difference of about 100 lines.

I can reproduce this; however, I haven't identified what goes wrong, nor have I dived into the sources. Maybe I can have a closer look at this someday.

regards
Christian

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils
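
For reference, the pairwise reduction and the duplicate-key check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the script that was actually run: the helper name merge_all, the tiny sample files and the temporary directory are all hypothetical, and the inputs are assumed to be tab-separated and already sorted on the first column. Setting LC_ALL=C is likewise an assumption, made only because join expects both inputs to be sorted on the join field in the same collating order it uses itself; it is not offered as a diagnosis of the duplicate keys.

```shell
#!/bin/sh
# Sketch only. Assumptions: tab-separated input, each file already sorted
# on column 1, and a byte-wise collating order shared by sort and join.
LC_ALL=C
export LC_ALL

# Pass a literal tab to join (an unquoted \t reaches join as the letter t).
TAB=$(printf '\t')

# merge_all OUT FILE...  (hypothetical helper)
# Folds all FILEs into OUT by repeated pairwise full outer joins on column 1.
merge_all() {
    out=$1; shift
    acc=$(mktemp) || return 1
    tmp=$(mktemp) || return 1
    cp "$1" "$acc"; shift
    for f in "$@"; do
        # Keep unpaired lines from both sides, as in the original command.
        join -a 1 -a 2 -t "$TAB" "$acc" "$f" > "$tmp" || return 1
        mv "$tmp" "$acc"
    done
    rm -f "$tmp"
    mv "$acc" "$out"
}

# Tiny made-up sample data, just to exercise the helper.
dir=$(mktemp -d)
printf 'a\t1\nb\t2\n' > "$dir/f1"
printf 'b\t3\nc\t4\n' > "$dir/f2"
printf 'a\t5\nd\t6\n' > "$dir/f3"

merge_all "$dir/merged" "$dir/f1" "$dir/f2" "$dir/f3"

# Duplicate-key check: prints every key that occurs more than once.
# No output means each key appears on exactly one line.
cut -f1 "$dir/merged" | uniq -d
```

On a correct join the last command prints nothing; any key it does print is one of the duplicates described above.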