> On 18. Oct 2022, at 16:41, Tassilo Philipp <tphil...@potion-studios.com> 
> wrote:
> 
>>> On 21. Nov 2020, at 10:44, Tassilo Philipp <tphil...@potion-studios.com> 
>>> wrote:
>>> 
>>>> FYI, I run into the same issue with a different provider:
>>>> relay.yourmailgateway.de which also has a large number of A records.
>>>> 
>>>> Trying to reproduce and digging deeper now, by adding debug logs etc.
>>> 
>>> Interesting... thanks for checking and having thought of my report. I for 
>>> myself didn't have any issues anymore, however, I barely ever receive any 
>>> mail from sfr. Also, given the random order of IPs in the DNS reply, I 
>>> simply might have had luck if it's in any case related to the IP order. I 
>>> have no evidence for, but when I was having problems, the IP in question 
>>> was among the last ones in the reply.
>>> 
>>> I'm curious what you'll find…
>> 
>> FYI, after digging deeper into this, I figured out that this was an issue 
>> with the DNS forwarders/resolver I was using (unfortunately not under my 
>> control) on this particular mail server: The forwarder is not able to 
>> resolve relay.yourmailgateway.de at all, likely due to the large number of
>> records (52 A + 13 AAAA).
>> 
>> I believe there is a limit in the BIND suite (32), OpenBSD libc (35) and 
>> others, which restricts older gethostbyname() calls, with their struct 
>> hostent results, to that 30-something number. The resolver in use was likely 
>> relying on these old/obsolete libc functions…
>> 
>> But OpenSMTPD, the FCrDNS filter and OpenBSD ASR are all doing fine here, 
>> because they use getaddrinfo() or similar under the hood, with dynamically 
>> allocated struct addrinfo results. That does not impose any such limit and 
>> resolves all 65 A and AAAA records just fine.
> 
> Thanks for the feedback, that sounds like a fitting analysis. So if I follow 
> your thought, the resolver basically truncates the list and what opensmtpd 
> gets to see in the end sometimes misses the entry it tries to verify? Sounds 
> like the culprit indeed.

In theory: yes.

But in my case the forwarder did not just truncate the list; it failed 
completely to resolve anything. Recursive resolvers can always fail 
temporarily, e.g. when no suitable BGP route to a specific name server is 
available.

However, OpenSMTPD should catch such resolver errors in smtpd_getnameinfo_cb() 
in smtp_session.c, log a warning about what went wrong, set fcrdns = -1 and 
skip(?) the FCrDNS check.
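
To make the three possible outcomes concrete, here is a rough Python sketch of 
an FCrDNS check that keeps "resolver failed" separate from "does not match". 
This is not the OpenSMTPD code (which is C); the function names, the constants 
and the injectable resolver arguments are all made up for illustration, and 
the address comparison is purely textual, which is a simplification.

```python
import socket

# Hypothetical tri-state result, mirroring the fcrdns = -1 / 0 / 1 idea.
FCRDNS_ERROR = -1   # a lookup failed, so the result is unknown
FCRDNS_FAIL = 0     # lookups worked, but the peer address is not confirmed
FCRDNS_PASS = 1     # forward-confirmed reverse DNS


def reverse_lookup(ip):
    """PTR lookup via the system resolver; returns the primary host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(name):
    """A/AAAA lookup via getaddrinfo(); returns the set of address strings."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def check_fcrdns(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Return FCRDNS_PASS, FCRDNS_FAIL or FCRDNS_ERROR for the peer address."""
    try:
        name = reverse(ip)
        addrs = forward(name)
    except OSError as exc:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        print(f"warn: fcrdns lookup for {ip} failed: {exc}")
        return FCRDNS_ERROR
    # Textual comparison only; real code would compare binary addresses.
    return FCRDNS_PASS if ip in addrs else FCRDNS_FAIL
```

With a forward lookup that returns a truncated list, the peer address can be 
missing and the check yields FCRDNS_FAIL; with a resolver that fails outright, 
as in my case, it yields FCRDNS_ERROR instead.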

I may be mistaken, but if I understand the logic of filter_check_fcrdns() in 
lka_filter.c correctly, it “silently drops” the error case (fcrdns = -1) when 
the “ret” value is converted to a boolean, and the check just fails instead of 
being skipped. I might have missed something else here, though. It looks like 
a very minor bug to me, but I’ll try to verify this and come up with a diff.
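
A small Python sketch of the kind of collapse I suspect. I have not yet 
verified that filter_check_fcrdns() actually looks like this; the comparison 
used below, the constants and both function names are assumptions for 
illustration only.

```python
# Hypothetical tri-state values: -1 = lookup error, 0 = mismatch, 1 = confirmed.
FCRDNS_ERROR, FCRDNS_FAIL, FCRDNS_PASS = -1, 0, 1


def rejects_collapsed(fcrdns):
    """Boolean view: anything that is not a confirmed pass gets rejected."""
    passed = (fcrdns == FCRDNS_PASS)   # -1 and 0 both become False here
    return not passed


def rejects_tristate(fcrdns):
    """Tri-state view: only a definite mismatch is rejected, errors are skipped."""
    if fcrdns == FCRDNS_ERROR:
        return False                   # unknown result, do not punish the sender
    return fcrdns == FCRDNS_FAIL


for value in (FCRDNS_ERROR, FCRDNS_FAIL, FCRDNS_PASS):
    print(value, rejects_collapsed(value), rejects_tristate(value))
```

The two variants only differ for -1: the collapsed version rejects the 
session, while the tri-state version lets it through.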


> I personally did not observe this issue anymore, unsure why, some update 
> might have fixed it on some upstream resolver or dunno...
> How are you dealing with this, given you don't control the resolver? I guess 
> you just switched it?

For now, I’m still playing around with this, as I have a reproducible case :)
Obviously, yes, I could switch to another resolver.
I may also be able to tweak smtpd.conf and skip fcrdns for specific domains, 
i.e. whitelist them.
