Sorry for the slow replies; our landline and internet are both dead and the
telco [TSX: MBT] won't fix them for a week. Terrible for getting work done.
> Cool. I'd split dnsbl_zones into ipv4_dnsbl_zones and ipv6_dnsbl_zones
> and have the DnsblZones directive work like;
That's a good idea, I suspected IPv6 RBLs might exist :) I'll add the IPv6
> dnsbl_lookup_query() takes an IP address argument as a string, but it
> would probably be a lot better to take it as an apr_sockaddr_t, since
> that's an IP version agnostic format, and is generally the way an Apache
> module would have the address available to it.
The problem this introduces is with RHSBL lookups, which operate on host
names or domain names instead of IP addresses. Would you recommend separate
functions for DNSBL (pass an IP) and RHSBL (pass a hostname or domain name)?
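
To make the question concrete, here is a rough sketch of what separate
entry points might look like. All function names are hypothetical and not
taken from the current code, and it uses plain in_addr/in6_addr rather than
apr_sockaddr_t so that it compiles without APR. The IPv6 form assumes the
usual nibble-reversed convention used for ip6.arpa.

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: build "d.c.b.a.<zone>" from a binary IPv4 address.
 * Returns 0 on success, -1 if buf is too small. No string parsing needed. */
int dnsbl_ipv4_query_name(const struct in_addr *addr, const char *zone,
                          char *buf, size_t buflen)
{
    /* s_addr is in network byte order, so o[0] is the first octet. */
    const unsigned char *o = (const unsigned char *)&addr->s_addr;
    int n = snprintf(buf, buflen, "%u.%u.%u.%u.%s",
                     (unsigned)o[3], (unsigned)o[2],
                     (unsigned)o[1], (unsigned)o[0], zone);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: build the nibble-reversed name for an IPv6 address,
 * i.e. 32 hex digits separated by dots, lowest nibble first, then the zone. */
int dnsbl_ipv6_query_name(const struct in6_addr *addr, const char *zone,
                          char *buf, size_t buflen)
{
    static const char hex[] = "0123456789abcdef";
    size_t pos = 0;
    int i, n;

    for (i = 15; i >= 0; i--) {
        if (pos + 4 >= buflen)
            return -1;
        buf[pos++] = hex[addr->s6_addr[i] & 0x0f];  /* low nibble first */
        buf[pos++] = '.';
        buf[pos++] = hex[addr->s6_addr[i] >> 4];    /* then high nibble */
        buf[pos++] = '.';
    }
    n = snprintf(buf + pos, buflen - pos, "%s", zone);
    return (n < 0 || (size_t)n >= buflen - pos) ? -1 : 0;
}

/* Hypothetical helper for RHSBLs: the key is a name, not an address,
 * so it is simply prepended to the zone. */
int rhsbl_query_name(const char *domain, const char *zone,
                     char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s.%s", domain, zone);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this split, the address-based functions never touch sscanf, and the
RHSBL path stays a separate, string-only call.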
> Passing it around in binary format also helps you avoid using sscanf and
> the associated reentrancy problems on many platforms.
I did not know there were reentrancy problems with sscanf. strtok I know.
> The implementation is neat, but it could also do with efficiency being
> in mind, IME (I help run a very large RBL) rbl lookups tend to be a big
> source of latency during request/mail handling and it's worth making the
> effort to go a bit further :)
Yes, I am going to add some caching for recent queries. I thought at first
that the resolver already does this but as far as I can tell, it does not do
> Although the dnsbl_lookup_query() function's output is comprehensive,
> perhaps more useful and efficient would be to supply a framework for
> allowing modules to check DNSBL's in a boolean manner. As-is the code
> scans every registered RBL, even if one flags an address as listed.
> That's super in-efficient for the majority case, and there's no
> application level caching, which tends to be a must for most
> implementations (even if it is only per-request, like Exim's or
> sendmail's implementations for example).
I agree. What I've started can probably be taken much further but I want to
put the basic layers there first. I'll split up the code so it will be easier
to modify later to not query all at once.
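
A minimal sketch of the "stop at the first hit" shape I have in mind, to
show how the split could look. Everything here is hypothetical: the names
are made up, the resolver is passed in as a callback, and fake_lookup() is
only a stand-in so the example runs without doing real DNS.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical resolver callback: returns 1 if `query` is listed, else 0. */
typedef int (*dnsbl_lookup_fn)(const char *query, void *ctx);

/* Returns 1 as soon as any zone lists the (already reversed) address and
 * 0 if none do. *lookups is set to the number of queries actually made. */
int dnsbl_is_listed(const char *reversed_ip, const char *const *zones,
                    size_t nzones, dnsbl_lookup_fn lookup, void *ctx,
                    size_t *lookups)
{
    char query[256];
    size_t i;

    *lookups = 0;
    for (i = 0; i < nzones; i++) {
        int n = snprintf(query, sizeof query, "%s.%s", reversed_ip, zones[i]);
        if (n < 0 || (size_t)n >= sizeof query)
            continue;               /* name too long for the buffer: skip */
        (*lookups)++;
        if (lookup(query, ctx))
            return 1;               /* first positive ends the scan */
    }
    return 0;
}

/* Stand-in resolver for demonstration: "listed" means the query equals
 * the single name passed in ctx. A real one would do a DNS A lookup. */
int fake_lookup(const char *query, void *ctx)
{
    return strcmp(query, (const char *)ctx) == 0;
}
```

A cache would slot in naturally inside the callback, keyed on the query
name, without changing the caller.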
> Part of the lack of boolean-checking reveals another problem, how are
> other modules supposed to know what constitutes a positive for a
> particular RBL?
What constitutes a positive depends entirely on the particular RBL's policy.
Some RBLs are whitelists themselves, so if an IP or domain matches then it
should NOT be blocked.
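
One way to express that, as a sketch only: attach a policy to each
configured zone and let a helper translate "listed / not listed" into a
verdict. The struct, enum, and function names below are hypothetical and
do not exist in the current code.

```c
#include <stddef.h>

/* Hypothetical per-zone policy: what a match in this zone means. */
typedef enum { DNSBL_ZONE_BLACKLIST, DNSBL_ZONE_WHITELIST } dnsbl_zone_type;

typedef struct {
    const char *name;       /* e.g. "dnsbl.example.org" */
    dnsbl_zone_type type;   /* how a match should be interpreted */
} dnsbl_zone;

typedef enum { DNSBL_NO_OPINION, DNSBL_BLOCK, DNSBL_ALLOW } dnsbl_verdict;

/* Translate a raw lookup result for one zone into a verdict.
 * `listed` is nonzero when the zone returned a record for the query. */
dnsbl_verdict dnsbl_interpret(const dnsbl_zone *zone, int listed)
{
    if (!listed)
        return DNSBL_NO_OPINION;    /* a miss says nothing either way */
    return zone->type == DNSBL_ZONE_WHITELIST ? DNSBL_ALLOW : DNSBL_BLOCK;
}
```

Other modules would then only see the verdict and would not need to know
each list's policy themselves.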