Re: [spamdyke-users] SERVFAIL on dns-server-ip-primary does not fail-over

2019-03-13 Thread Quinn Comendant via spamdyke-users
Hi Sam,

Thanks for the thorough reply. 

How does spamdyke decide that "a response is received" (and so "use it and 
stop")? With an NXDOMAIN or SERVFAIL, no usable "response" is received (the 
ANSWER section of the DNS response is empty). If the response is empty, doesn't 
spamdyke then try the next name server? Or can it detect that a response was 
received successfully but was simply empty?

A setting that rotates between name servers would be very helpful. SpamAssassin 
offers this already with its `dns_options rotate` option. Distributing load 
between name servers helps keep each of them within DNSBL query limits 
(e.g., avoiding URIBL_BLOCKED).
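
To make the rotation idea concrete, here is a minimal sketch in Python. It is not a spamdyke feature or SpamAssassin's implementation; the function name `rotated` is hypothetical. It picks a random starting point and keeps the configured order from there, so each run begins with a different server and the load is spread across them.

```python
import random
from typing import List, Sequence


def rotated(servers: Sequence[str], rng: random.Random = random.Random()) -> List[str]:
    """Return the servers in their configured order, starting at a random one.

    Hypothetical helper for illustration only.
    """
    if not servers:
        return []
    start = rng.randrange(len(servers))
    # Keep the relative order, but begin at a randomly chosen position.
    return list(servers[start:]) + list(servers[:start])


if __name__ == "__main__":
    configured = ["127.0.0.1", "10.128.0.9", "169.254.169.254"]
    print(rotated(configured))
```

A resolver that walks the returned list in order would still fail over the same way; only the first server tried changes from run to run.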

Quinn


On 13 Mar 2019 13:52:11, Sam Clippinger via spamdyke-users wrote:
> Sorry, I missed your earlier email.  I'll try to answer both questions here.
> 
> Unless you're setting spamdyke's dns-level option, it should be using 
> the primary servers in order, followed by the secondary servers in 
> order, every time it runs.  If you're just setting the three DNS 
> servers and not using any other dns-* options, the logic should look 
> like this:
>   Total DNS query time is 30 seconds (override with dns-timeout-secs)
>   Max number of DNS queries to primary servers before using 
> secondaries is 1 (override with dns-max-retries-primary)
>   Max number of DNS queries total is 3 (override with 
> dns-max-retries-total)
>   Send query packet to 127.0.0.1, wait 10 seconds for a response 
> (total query time divided by max number of queries)
>   If a response is received, use it and stop.
>   Send query packet to 10.128.0.9, wait 10 seconds for a response
>   If a response is received, use it and stop.
>   The number of queries to primary servers is greater than 1, start 
> using secondaries as well
>   Send query packet to 169.254.169.254, wait 10 seconds for a response
>   If a response is received, use it.  Otherwise exit with no response.
> Randomizing the order of the servers would probably be a good idea 
> (or option).  I think I didn't do that because I was trying to 
> imitate the behavior of the system resolver library, which uses the 
> servers in /etc/resolv.conf in order every time.
> 
> Looking at the code in dns.c, spamdyke treats an empty response as 
> "not found" and doesn't check whether it was due to SERVFAIL or 
> NXDOMAIN.  If memory serves, I did this because there's no real 
> difference between them as far as spamdyke is concerned.  In other 
> words, NXDOMAIN means the domain doesn't exist at all while SERVFAIL 
> means the domain exists but no records can be found (usually because 
> the authoritative servers aren't responding).  Either way, the mail 
> should be rejected with a temporary code so the sender will try again 
> later (hoping the problem will resolve itself in the meantime).  If 
> the problem persists long enough, the message(s) may bounce.  
> Unfortunately there's no DNS code to indicate the server is 
> malfunctioning and shouldn't be used -- spamdyke expects it to stop 
> sending responses when that happens.
> 
> 
> -- Sam Clippinger
> 
> 
> 
> 
>> On Mar 11, 2019, at 6:58 PM, Quinn Comendant via spamdyke-users 
>>  wrote:
>> 
>> We had an incident where both our local caching name servers stopped 
>> working. They returned SERVFAIL (see example below). They were set 
>> as the "dns-server-ip-primary" and our host-provided DNS server was 
>> set as the "dns-server-ip". Because the primaries were failing, I 
>> would expect spamdyke to automatically switch to resolve via the 
>> server set under "dns-server-ip". Instead, spamdyke just rejected 
>> all our mail for a few hours with DENIED_RDNS_MISSING. The 
>> host-provided name server was functioning fine.
>> 
>> This is the config:
>> 
>>   dns-server-ip-primary=127.0.0.1 # Local caching name server
>>   dns-server-ip-primary=10.128.0.9 # Another local caching name server
>>   dns-server-ip=169.254.169.254 # Host-provided name server
>> 
>> This is an example response from a query to either of the primary 
>> DNS servers:
>> 
>>   {q@oak3~} dig @10.128.0.9 apple.com mx
>> 
>>   ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6_10.1 <<>> @10.128.0.9 apple.com mx
>>   ; (1 server found)
>>   ;; global options: +cmd
>>   ;; Got answer:
>>   ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 52266
>>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
>> 
>>   ;; QUESTION SECTION:
>>   ;apple.com. IN  MX
>> 
>>   ;; Query time: 15 msec
>>   ;; SERVER: 10.128.0.9#53(10.128.0.9)
>>   ;; WHEN: Mon Mar 11 05:10:32 2019
>>   ;; MSG SIZE  rcvd: 27
>> 
>> Am I wrong to expect spamdyke to fail over to the non-primary server 
>> on a SERVFAIL?
>> 
>> Quinn
>> ___
>> spamdyke-users mailing list
>> spamdyke-users@spamdyke.org
>> https://spamdyke.org/mailman/listinfo/spamdyke-users

Re: [spamdyke-users] SERVFAIL on dns-server-ip-primary does not fail-over

2019-03-13 Thread Sam Clippinger via spamdyke-users
Sorry, I missed your earlier email.  I'll try to answer both questions here.

Unless you're setting spamdyke's dns-level option, it should be using the 
primary servers in order, followed by the secondary servers in order, every 
time it runs.  If you're just setting the three DNS servers and not using any 
other dns-* options, the logic should look like this:
  Total DNS query time is 30 seconds (override with dns-timeout-secs)
  Max number of DNS queries to primary servers before using secondaries 
    is 1 (override with dns-max-retries-primary)
  Max number of DNS queries total is 3 (override with 
    dns-max-retries-total)
  Send query packet to 127.0.0.1, wait 10 seconds for a response (total 
    query time divided by max number of queries)
  If a response is received, use it and stop.
  Send query packet to 10.128.0.9, wait 10 seconds for a response
  If a response is received, use it and stop.
  The number of queries to primary servers is greater than 1, start using 
    secondaries as well
  Send query packet to 169.254.169.254, wait 10 seconds for a response
  If a response is received, use it.  Otherwise exit with no response.
Randomizing the order of the servers would probably be a good idea (or 
option).  I think I didn't do that because I was trying to imitate the 
behavior of the system resolver library, which uses the servers in 
/etc/resolv.conf in order every time.
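
The sequence above can be sketched roughly as follows. This is an illustrative Python model, not the code in dns.c. The names `resolve` and `send_query` are hypothetical, and the switch from primaries to secondaries is simplified to "primaries first, then secondaries". The point it shows is that any reply, including an empty one, ends the loop, so only a timeout moves on to the next server.

```python
from typing import Callable, Optional, Sequence

# Hypothetical callback: send one query to `server`, wait up to `wait_secs`,
# and return the reply (whatever it contains) or None if nothing arrived.
SendQuery = Callable[[str, float], Optional[str]]


def resolve(primaries: Sequence[str],
            secondaries: Sequence[str],
            send_query: SendQuery,
            timeout_secs: float = 30.0,
            max_queries_total: int = 3) -> Optional[str]:
    # Per-query wait = total query time / max number of queries (30 / 3 = 10).
    wait_secs = timeout_secs / max_queries_total
    # Simplification: primaries in order, then secondaries in order.
    ordered = list(primaries) + list(secondaries)
    for server in ordered[:max_queries_total]:
        reply = send_query(server, wait_secs)
        if reply is not None:
            # Any reply counts, even an empty one such as SERVFAIL.
            return reply
    return None  # every query timed out


if __name__ == "__main__":
    # Simulated servers: the first primary answers, but with SERVFAIL.
    def fake_send(server: str, wait_secs: float) -> Optional[str]:
        return {"127.0.0.1": "SERVFAIL"}.get(server)

    print(resolve(["127.0.0.1", "10.128.0.9"], ["169.254.169.254"], fake_send))
    # prints "SERVFAIL": 169.254.169.254 is never queried
```

Under this model, the incident described in the original report follows directly: a primary that answers quickly with SERVFAIL prevents any fail-over, while a primary that stays silent does not.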

Looking at the code in dns.c, spamdyke treats an empty response as "not found" 
and doesn't check whether it was due to SERVFAIL or NXDOMAIN.  If memory 
serves, I did this because there's no real difference between them as far as 
spamdyke is concerned.  In other words, NXDOMAIN means the domain doesn't exist 
at all while SERVFAIL means the domain exists but no records can be found 
(usually because the authoritative servers aren't responding).  Either way, the 
mail should be rejected with a temporary code so the sender will try again 
later (hoping the problem will resolve itself in the meantime).  If the problem 
persists long enough, the message(s) may bounce.  Unfortunately there's no DNS 
code to indicate the server is malfunctioning and shouldn't be used -- spamdyke 
expects it to stop sending responses when that happens.
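
For reference, the two cases can be told apart from the reply header alone. RFC 1035 puts a 4-bit RCODE in the low bits of the header's flags word, where 2 is "Server failure" and 3 is "Name Error" (NXDOMAIN). The Python sketch below only illustrates that layout. It is not taken from dns.c, the function name `parse_header` is hypothetical, and whether spamdyke should act on the value is a separate design question.

```python
import struct
from typing import Tuple

# RCODE values defined in RFC 1035, section 4.1.1 (subset).
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def parse_header(packet: bytes) -> Tuple[int, str, int]:
    """Return (message id, RCODE name, ANSWER count) from a DNS reply."""
    if len(packet) < 12:
        raise ValueError("DNS header is 12 bytes; packet is too short")
    # Six 16-bit big-endian fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    msg_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", packet[:12])
    rcode = flags & 0x000F  # low 4 bits of the flags word
    return msg_id, RCODE_NAMES.get(rcode, "RCODE %d" % rcode), ancount


if __name__ == "__main__":
    # Header equivalent to the dig output in this thread:
    # id 52266, flags qr rd ra (0x8180) plus RCODE 2, QUERY: 1, ANSWER: 0.
    header = struct.pack("!6H", 52266, 0x8180 | 2, 1, 0, 0, 0)
    print(parse_header(header))  # (52266, 'SERVFAIL', 0)
```

Both SERVFAIL and NXDOMAIN replies carry an ANSWER count of 0, so the answer count alone cannot distinguish them; the RCODE can.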


-- Sam Clippinger




> On Mar 11, 2019, at 6:58 PM, Quinn Comendant via spamdyke-users 
>  wrote:
> 
> We had an incident where both our local caching name servers stopped working. 
> They returned SERVFAIL (see example below). They were set as the 
> "dns-server-ip-primary" and our host-provided DNS server was set as the 
> "dns-server-ip". Because the primaries were failing, I would expect spamdyke 
> to automatically switch to resolve via the server set under "dns-server-ip". 
> Instead, spamdyke just rejected all our mail for a few hours with 
> DENIED_RDNS_MISSING. The host-provided name server was functioning fine.
> 
> This is the config:
> 
>   dns-server-ip-primary=127.0.0.1 # Local caching name server
>   dns-server-ip-primary=10.128.0.9 # Another local caching name server
>   dns-server-ip=169.254.169.254 # Host-provided name server
> 
> This is an example response from a query to either of the primary DNS servers:
> 
>   {q@oak3~} dig @10.128.0.9 apple.com mx
> 
>   ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6_10.1 <<>> @10.128.0.9 apple.com mx
>   ; (1 server found)
>   ;; global options: +cmd
>   ;; Got answer:
>   ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 52266
>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
> 
>   ;; QUESTION SECTION:
>   ;apple.com. IN  MX
> 
>   ;; Query time: 15 msec
>   ;; SERVER: 10.128.0.9#53(10.128.0.9)
>   ;; WHEN: Mon Mar 11 05:10:32 2019
>   ;; MSG SIZE  rcvd: 27
> 
> Am I wrong to expect spamdyke to fail over to the non-primary server on a 
> SERVFAIL?
> 
> Quinn
