Magic Banana continued our previous discussion:
>> which seems rather realistic
> I thought you were saying groups of four hexadecimal digits in real-world IPv6 addresses more often start
> with 0 than not. I extrapolated that interpretation: "yet another distribution obeying Benford's law", I
> thought: https://en.wikipedia.org/wiki/Benford%27s_law
Quite a few very tall buildings have heights (in feet) that start with the numeral 1; far fewer start with 2,
and heights starting with 3 run into physical limitations.
IPv6 addresses, on the other hand, very often start with 2 and rarely with 3. The address space starts at
zero in its governing base, but that choice was arbitrary and had (to my knowledge) no physical limitation
such as the one that constrains the radio spectrum.
An address range such as 2a02:2788::/32 starts with zero: its first address is
2a02:2788:0000:0000:0000:0000:0000:0000, so 2a02:2788:0000:0000:0000:0000:0000:0001 is the second address.
2a02:2789::/32 starts the same way, with 2a02:2789:0000:0000:0000:0000:0000:0001 as its second address, and
may belong to another party.
2a02:2788::/31 contains two cycles that look exactly like 2a02:2788::/32 and 2a02:2789::/32, even though they
occupy consecutive positions along the IPv6 address line, which started at zero only once (and which is
finite, ending after 2^128 addresses). There is no overlap, and no address on that line is sold more than once.
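One way to see why the two /32 blocks sit side by side inside the /31: a /31 fixes the first 31 bits, that is, the whole first group plus the top 15 bits of the second group. The following is only a minimal sketch in POSIX shell arithmetic, using the two prefixes above; the variable names are mine.

```shell
#!/bin/sh
# Second 16-bit group of each /32 prefix discussed above.
a=$((0x2788))
b=$((0x2789))

# Dropping the lowest bit of the second group leaves the 15 bits that,
# together with the first group (2a02), make up the 31-bit prefix.
echo "top 15 bits of 2788: $((a >> 1))"
echo "top 15 bits of 2789: $((b >> 1))"

# The lowest bit selects which /32 half of the /31 an address falls in.
echo "lowest bit of 2788: $((a & 1))"
echo "lowest bit of 2789: $((b & 1))"
```

The two "top 15 bits" values agree, and only the lowest bit differs, which is why the blocks are neighbours without overlapping.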
The zeros in the hexadecimal version of the decimal 79,228,162,514,264,337,593,543,950,336 (2^96, the number
of addresses in a /32) are fewer in number, but they are all in the same places in the six fields of four
hexadecimal digits in 2a02:2788::/32 as they are in 2a02:2789::/32. Think of the successive rings on a dart
board, the one labelled :000h being the smallest and the one labelled :0hhh the largest. Our randomized looks
at the hhhh:hhhh::/32 dart board ought to have similar relative numbers of :0000, :000h, :00hh, and :0hhh
every time.
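To put rough numbers on the dart-board picture, here is a small sketch of the ring sizes for a single four-digit field. It assumes (my reading, not something stated above) that the rings are exclusive, so that 000h means three zeros followed by a non-zero digit, and that every field value is equally likely. The function name ring_sizes is made up for the illustration.

```shell
#!/bin/sh
# ring_sizes: prints each ring label and how many of the 65536 possible
# values of one four-digit hexadecimal field fall inside it.
ring_sizes() {
  echo "0000 1"                      # the bullseye: one single value
  width=1                            # number of values of the free trailing digits
  for label in 000h 00hh 0hhh hhhh; do
    # First free digit is 1..f (15 choices); the rest are unrestricted.
    echo "$label $((15 * width))"
    width=$((width * 16))
  done
}

ring_sizes
```

Each ring is 16 times the size of the one inside it (apart from the bullseye), so a uniformly random sample should show the same relative proportions whatever the prefix.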
Magic Banana also said:
> Writing the prefixes with simple commands (rather than sed), may save some CPU cycles:
$ prefix=2a02:aa1; sample_size=1048576; od -A n -N $(expr 12 \* $sample_size) -xw12 /dev/urandom | tr ' ' : | paste -d '' SS.IPv6-NLU-January2020-mobile.tre.se.txt
Applied to another prefix, it finishes in three seconds. The grep counts come out :0 ==> 336,650,
:00 ==> 24,393, :000 ==> 1,537 and :0000 ==> 98, so each step is roughly 15 to one. The advantage of this
script is that I can scale it readily.
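As a rough cross-check on those counts, here is a sketch of what uniform randomness would predict. It assumes that each grep counts matching lines (as grep -c would), that every address carries six uniformly random four-digit fields, and that the prefix itself contains none of the patterns. Under those assumptions the expected number of lines with at least one field starting with k zeros is N * (1 - (1 - 16^-k)^6). The function name expected_counts is hypothetical.

```shell
#!/bin/sh
# expected_counts N: for k = 1..4, prints k and the expected number of
# sampled addresses (out of N) in which at least one of the six random
# fields starts with k zeros.
expected_counts() {
  awk -v n="$1" 'BEGIN {
    for (k = 1; k <= 4; k++) {
      p_field = 1 / 16 ^ k              # a given field starts with k zeros
      p_line = 1 - (1 - p_field) ^ 6    # at least one of the six fields does
      printf "%d %.0f\n", k, n * p_line
    }
  }'
}

expected_counts 1048576
```

For a sample of 1,048,576 this predicts about 336,662, 24,337, 1,535 and 96, close to the observed figures. The first step is nearer 14 to one than 16 to one because a line with several matching fields is still counted only once.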
The nmap script remains the rate-limiting step in this exercise. I'm gathering prefixes for a marathon nmap
session, but with the randomized method of condensing, the prefix 2a02:aa1::/32, which is impossible to scan
verbatim, can be visualized in an hour. My laptop was formerly enduring CIDR /12 blocks that took on the order
of a thousand hours to return a result. Internet searches on PTR addresses gleaned from email databases remain
a tedious roadblock to the evaluation of the gratuitously resolved addresses in the other two-thirds of
published recent visitor data.
Magic Banana, grading my homework, said:
[quoting] awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }'
> Read again my previous post.
Don't need to; that expedient script helped me navigate my logic, but it can be written more efficiently as
awk '{print $6,$5}', as I would have seen had I proofread it before publishing.
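A quick way to convince oneself that the two forms agree is to run both on one line. This is a minimal sketch; the sample line is made up in the shape of nmap list-scan output with the parentheses already removed, and host.example is a placeholder name.

```shell
#!/bin/sh
# A made-up line shaped like nmap -sL output after tr -d '()'.
line='Nmap scan report for host.example 2a02:2788::1'

# The original two-step form: pick fields 5 and 6, then swap them.
two_step=$(printf '%s\n' "$line" |
  awk 'NR >= 1 { print $5, $6 }' |
  awk 'NR >= 1 { print $2, $1 }')

# The single-step form: print field 6, then field 5.
one_step=$(printf '%s\n' "$line" | awk '{ print $6, $5 }')

echo "two-step: $two_step"
echo "one-step: $one_step"
```

The condition NR >= 1 is true for every input line, so dropping it changes nothing.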
I corrected it and am randomizing the 2a02:2788::/32 block, which has come back to life after being closed
over the weekend:
prefix=2a02:2788; sample_size=1048576; od -A n -N $(expr 12 \* $sample_size) -xw12 /dev/urandom | tr ' ' : | paste -d '' SS.IPv6-NLU-January2020-host.dynamic.voo.beB.txt
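For readers who want to try the randomizing step on a small scale, here is a sketch of the same idea: 12 random bytes per address, printed by od as six four-digit fields, with the prefix put in front. It is not the command above. In particular, awk stands in here for the prefix-prepending step, and the function name random_addresses is hypothetical. It relies on GNU od for the -w option, as the original does.

```shell
#!/bin/sh
# random_addresses PREFIX COUNT
# Prints COUNT addresses made of PREFIX followed by six random 16-bit fields.
random_addresses() {
  # 12 random bytes per address, shown as six four-digit hex fields per line.
  od -A n -N "$(($2 * 12))" -xw12 /dev/urandom |
    tr ' ' : |
    awk -v prefix="$1" '{ print prefix $0 }'
}

# Small illustrative sample; the post uses 1048576.
random_addresses 2a02:2788 4
```

Because od puts a space before every field, tr turns each line into ":hhhh:hhhh:hhhh:hhhh:hhhh:hhhh", which attaches directly to the 32-bit prefix.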
nmap -6 -sn -T4 -sL -iL SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt |
  grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk '{ print $6, $5 }' |
  uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' >> Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt
awk '{print $2,$1}' 'Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt' |
  sort -k 2 | uniq -cdf 1 | awk '{print $3"\t"$1}' '-' > Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.Tally.txt
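To show what the two pipelines above do without waiting on nmap, here is a sketch that feeds a few made-up lines in the shape of nmap list-scan output through the same filters, chained in one function instead of going through the intermediate file. The addresses and the name other.example are invented for the illustration, and tally_hosts is a hypothetical name. Like the original, it needs GNU uniq (-D) and GNU sed (\s).

```shell
#!/bin/sh
# tally_hosts: reads nmap list-scan output on stdin and prints
# "hostname<TAB>count" for each hostname returned for more than one address.
tally_hosts() {
  grep "Nmap scan report for " |
    tr -d '()' |
    sort -k5 |
    awk '{ print $6, $5 }' |        # address first, hostname second
    uniq -Df 1 |                    # keep lines whose hostname repeats
    sed '/^\s\s*/d' |               # drop unresolved lines (empty first field)
    awk '{ print $2 "\t" $1 }' |    # hostname, tab, address
    awk '{ print $2, $1 }' |        # second pipeline: address, hostname
    sort -k 2 |
    uniq -cdf 1 |                   # count repeated hostnames
    awk '{ print $3 "\t" $1 }'      # hostname, tab, count
}

tally_hosts <<'EOF'
Nmap scan report for host.dynamic.voo.be (2a02:2788:1::1)
Nmap scan report for 2a02:2788:2::2
Nmap scan report for host.dynamic.voo.be (2a02:2788:3::3)
Nmap scan report for 2a02:2788:4::4
Nmap scan report for other.example (2a02:2788:5::5)
EOF
```

Addresses with no PTR record have no hostname field, so they come out of the first awk with a leading space; uniq -D treats them as duplicates of one another, and the sed step is what removes them again.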
The count of "host.dynamic.voo.be" came to 983,391 out of a possible 1,048,576 (93.8%).
George Langford