I executed that, out of curiosity. It essentially does the same as the command line I have given since the beginning of this thread:

$ sort -u IPv4.May2020.37.nMapoG.txt | awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $2 >> "out/" a[$1] "," $1 }' PTRList.txt -

"Essentially" because:

- Your solution outputs one file per PTR in PTR-files, which always contains one single line: the PTR (which is also in the file name); what is the point?
- The files output in CountsFiles contain duplicates, e.g., "low.lowe001.net 96.125.160.252" appears twice in CountsFiles/low.lowe001.net.2.txt; in your original post, you used 'sort -u IPv4.May2020.37.nMapoG.txt' to remove duplicates: that is why I did the same in my solution;
- Every line of those files has two fields, but the first one is always the same PTR (which is also in the file name); what is the point?
- Executing it takes 0.3s on my system, against 0.01s for mine;
- 'ls -v' would not sort the files in CountsFiles by "number of instances"; in your original post, your use of 'sort -nrk 2' (the argument should have been 2,2) suggests you want to sort by "number of instances"; that is why the names of the output files in my solution start with the "number of instances".


Now, if you want the duplicates, if you insist on the names you chose and if you really want the repeated PTR in a first field (which looks like nothing but a waste of disk space), it is trivial to adapt my solution:

$ awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $1, $2 >> "CountsFiles/" $1 "." a[$1] ".txt" }' PTRList.txt IPv4.May2020.37.nMapoG.txt

One single program. Against more than a dozen to create your Script.* files that then execute 252 other commands (more generally: 6 times the number of PTRs).
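
To make the adapted one-liner concrete, here is a minimal sketch that runs it on made-up data. The sample PTRs and addresses, the file names scan.txt and scan.uniq.txt, and the temporary directory are assumptions for illustration only; the layout of PTRList.txt (PTR, then number of instances) is inferred from the awk program. The sketch removes duplicates with 'sort -u' first, and it puts the output file name in parentheses, which is not in the command above but avoids an ambiguity some awk implementations have with unparenthesized concatenation after '>>'.

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
dir=$(mktemp -d) && cd "$dir" || exit 1
mkdir CountsFiles

# Hypothetical PTRList.txt: PTR name, then its number of instances.
printf '%s\n' 'low.lowe001.net 2' 'example.invalid 1' > PTRList.txt

# Hypothetical scan results: PTR name, then IP address.
# One line is duplicated and one PTR is absent from PTRList.txt.
printf '%s\n' \
  'low.lowe001.net 96.125.160.252' \
  'low.lowe001.net 96.125.160.252' \
  'low.lowe001.net 96.125.160.253' \
  'example.invalid 192.0.2.1' \
  'unlisted.invalid 192.0.2.2' > scan.txt

# Drop duplicate lines before splitting.
sort -u scan.txt > scan.uniq.txt

# First file: remember the count of each PTR.
# Second file: append each line whose PTR is known to the file named after it.
awk '
  FILENAME == ARGV[1] { a[$1] = $2 }
  FILENAME == ARGV[2] && $1 in a {
    print $1, $2 >> ("CountsFiles/" $1 "." a[$1] ".txt")
  }
' PTRList.txt scan.uniq.txt

ls CountsFiles
```

With this sample, CountsFiles ends up with example.invalid.1.txt (one line) and low.lowe001.net.2.txt (two lines); the PTR missing from PTRList.txt produces no file.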
