Magic Banana suggests that I cite a few for-instances:

I've changed a couple of the filenames to separate them in my tenuous logic:
First script:
grep ":" IPv6-SourceList.txt | sort | uniq -c | awk '{print $2}' '-' > IPv6-List.txt

Second script:
grep -v ":" IPv6-SourceList.txt | sort | uniq -c | awk '{print $2}' '-' > NoIPv6-List.txt
#note: IPv6-SourceList.txt originally had only one IPv4: 2.63.83.182

Fourth script:
grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' NoIPv6-List.txt > IPv4-List.txt
Ref: https://superuser.com/questions/202818/what-regular-expression-can-i-use-to-match-an-ip-address
#note: Now there are quite a few (53) additional IPv4's that have crept in because of the weakness of the script.
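
One possible way to tighten that pattern, offered as a sketch rather than a tested fix: if the extra addresses come from dotted-quad prefixes inside hostnames, matching the whole line with grep's -x option and limiting each octet to 0-255 should keep them out. The function name strict_ipv4 is my own invention, and the sketch assumes one address or hostname per line with no trailing whitespace.

```shell
# Hypothetical helper: keep only lines that consist entirely of a dotted-quad
# IPv4 address whose four octets are each in the range 0-255.
strict_ipv4() {
    # One octet: 250-255, 200-249, 100-199, or 0-99.
    octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'
    # -E: extended regex; -x: the pattern must match the entire line.
    grep -Ex "${octet}(\.${octet}){3}"
}

# Usage, with the file names from the scripts above:
#   strict_ipv4 < NoIPv6-List.txt > IPv4-List.txt
```

Unlike grep -o, which prints every matching fragment, -x rejects a line such as a PTR hostname that merely begins with four dotted numbers.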

I also tried a "two-minute drill" by splitting all the addresses on the dots "." and reassembling the first four octets with a $ as the separator between the fourth octet and the hostname-remnant in the fifth column:
awk '{print $1}' 'IPv4-SourceList.txt' | sed 's/\./\t/g' '-' | awk '{print $1"."$2"."$3"."$4"$"$5}' '-' > IPv4-List.txt

At this writing I'm stumped by the task of filtering the $-separated file to capture just the rows containing proper IPv4 addresses. This script at least shouldn't snatch any IPv4 prefixes from the dot-separated PTR's.
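
A minimal sketch of one way that filtering might go, assuming a bare address comes out of the reassembly step as a row like 2.63.83.182$ with nothing after the $, while a hostname leaves a non-empty remnant there. The function name proper_ipv4_rows is hypothetical.

```shell
# Hypothetical helper: read "a.b.c.d$remnant" rows on stdin and print the
# address part of each row whose remnant is empty and whose four octets
# are all numeric and no larger than 255.
proper_ipv4_rows() {
    awk -F'[$]' '
        $2 == "" && split($1, o, "[.]") == 4 {
            ok = 1
            for (i = 1; i <= 4; i++)
                if (o[i] !~ /^[0-9]+$/ || o[i] + 0 > 255) ok = 0
            if (ok) print $1
        }'
}

# Usage, reading the $-separated file produced above
# (the output name is only an example):
#   proper_ipv4_rows < IPv4-List.txt > IPv4-Only.txt
```

The separators are written as bracket expressions, [$] and [.], so that awk treats them as literal characters rather than regex metacharacters.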

Note: IPv6-SourceList.txt and IPv4-SourceList.txt were each extracted from the same original multi-megabyte
source file.

George Langford
