You probably want to use a trie <https://en.wikipedia.org/wiki/Trie> for
this – I found several available Python implementations, but I don’t
know what their advantages or disadvantages are, so I’ll just list them
in alphabetical order:

  * cidr-trie <https://github.com/Figglewatts/cidr-trie>
  * py-radix <https://github.com/mjschultz/py-radix>
  * pysubnettree <https://github.com/zeek/pysubnettree>
  * pytricia <https://github.com/jsommers/pytricia>
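
To illustrate the idea independently of those libraries, here is a
minimal sketch of a binary trie built only on the standard-library
ipaddress module. The CidrCache class and its add/lookup methods are
hypothetical names I made up for this example, it handles IPv4 only,
and it is not how any of the packages above are implemented. A lookup
walks at most 32 bits of the address, so it never has to loop over
every cached range:

```python
import ipaddress


class CidrCache:
    """Hypothetical IPv4-only cache: a binary trie keyed on prefix bits."""

    def __init__(self):
        # Each node is a dict with optional "0"/"1" children and an
        # optional "info" entry holding the cached details.
        self.root = {}

    def add(self, cidr, info):
        """Store info for a range such as '8.4.4.0/24'."""
        net = ipaddress.ip_network(cidr)
        # Keep only the leading prefixlen bits of the network address.
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["info"] = info

    def lookup(self, ip):
        """Return info for the most specific cached range, or None."""
        bits = format(int(ipaddress.ip_address(ip)), "032b")
        node = self.root
        best = node.get("info")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            if "info" in node:
                best = node["info"]
        return best


cache = CidrCache()
cache.add("8.4.4.0/24", "details cached from the first WHOIS call")
hit = cache.lookup("8.4.4.9")    # falls inside the cached /24
miss = cache.lookup("8.4.5.9")   # not cached, so WHOIS is still needed
```

If you cache overlapping ranges, lookup returns the details for the
longest matching prefix, which is usually what you want for WHOIS
data.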

Cheers,
Lucas

On 12.07.19 04:43, Huji Lee wrote:
> Hi all,
>
> I am working on a bot that fetches a list of anonymous editors on
> fawiki, uses WHOIS to retrieve more info about that IP, and uses a
> number of online APIs to check if the IP is a proxy or not.[1]
>
> I would like to improve the code by implementing a CIDR cache, so that
> if I run whois on 8.4.4.8 and determine that its ASN range is
> 8.4.4.0/24 and then I encounter 8.4.4.9 in the
> next iteration of my for loop, I would quickly determine this IP also
> belongs to the same range and skip the WHOIS part for it.
>
> The search space would consist of IP ranges like "8.4.4.0 - 8.4.4.255"
> (these are the beginning and end IP addresses of the 8.4.4.0/24
> range). Obviously, we can convert these IPs to Hex
> to make comparisons easier. Given an IP like 8.4.4.9, we need the
> object to efficiently determine if it already has an IP range that
> encompasses this given IP and if so, return the previously cached
> details for that IP pair. If not, we will store that in cache.
>
> The part that I am not fully clear about is the following: how can I
> avoid having to loop through every range in the cache? Is there a way
> to create a hash function that checks two inequality comparisons
> efficiently?
>
> Thanks!
>
> Huji
>
> [1]
> https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py
>
> _______________________________________________
> pywikibot mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikibot