I'm not familiar with the specifics of DNS lookups, but here is some general info that may be useful for understanding performance:
In principle, asyncio or libraries based on it should be the fastest solution, because it uses the operating system's non-blocking I/O facilities: multiple I/O operations are registered at once and then polled to see whether any of them have new data to process. However, Python's asyncio is, in my view, so bloated and cumbersome that it's not really worth using. You'd do much better simply using the select module, which is the underlying functionality of asyncio anyway. It has a much simpler interface and fewer levels of indirection, it is easier to debug, and it avoids the problem of integrating asyncio-based code with code that isn't.

I don't think that outsourcing the I/O to an external program such as dig will improve performance, since, in principle, the I/O is the most expensive part. Python is slow, but not slow enough to be the root cause of the performance degradation here, so you might as well use it if that's convenient for you.

On Thu, Mar 5, 2026 at 2:36 PM Barry <[email protected]> wrote:
>
> > On 5 Mar 2026, at 01:41, Vahid Shaik <[email protected]> wrote:
> >
> > Hi all,
> >
> > I've been working on a Python script to perform bulk DNS lookups (A, MX,
> > TXT records) for a list of ~500 domains to audit SPF/DKIM/DMARC
> > configurations.
> >
> > Currently I'm using `dns.resolver` from dnspython with ThreadPoolExecutor:
> >
> > ```python
> > import dns.resolver
> > from concurrent.futures import ThreadPoolExecutor
> >
> > def check_spf(domain):
> >     try:
> >         answers = dns.resolver.resolve(domain, 'TXT')
> >         return [r.to_text() for r in answers if 'v=spf1' in r.to_text()]
> >     except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer,
> >             dns.resolver.Timeout):
> >         return []
> >
> > with ThreadPoolExecutor(max_workers=20) as pool:
> >     results = dict(zip(domains, pool.map(check_spf, domains)))
> > ```
> >
> > This works but gets slow beyond 200 domains. I've considered:
> >
> > 1. **asyncio + aiodns** — should be faster but I've seen reports of
> > reliability issues with certain record types
>
> This is the way I would go for performance. Threads do not scale to large
> numbers.
> Suggest you test and see if the reports are true.
> Maybe fix any problems you find and PR back a fix.
> You could also look at Twisted, which has async DNS that worked well for me.
> Sorry, I do not have any code to share as it was a commercial project.
>
> Barry
>
> > 2. **socket.getaddrinfo** — built-in but limited to A/AAAA records
> > 3. **subprocess calling dig** — feels hacky
> >
> > For anyone dealing with similar DNS automation tasks, I've been
> > cross-referencing my script results against
> > https://dnsrobot.net/spf-checker to verify accuracy — it checks SPF, DKIM,
> > and DMARC in one shot which is useful for spot-checking.
> >
> > Has anyone benchmarked aiodns vs dnspython for large-scale lookups? Any
> > gotchas with asyncio DNS resolution I should know about?
> >
> > Thanks,
> > Vahid

--
https://mail.python.org/mailman3//lists/python-list.python.org
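P.S. To make the select suggestion concrete, here is a minimal sketch of what that approach could look like for bulk TXT lookups: one non-blocking UDP socket per query, all polled with a single select() loop, using only the standard library. The helper names, the nameserver address, and the timeout are my own illustrative choices, not anything from the thread; a real script would still need to parse the raw replies (e.g. with dnspython's dns.message.from_wire) and handle retries and truncated responses.

```python
import select
import socket
import struct
import time

def build_txt_query(domain, qid=0x1234):
    """Build a wire-format DNS query for TXT records (RFC 1035 format)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT/NSCOUNT/ARCOUNT=0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=16 (TXT), QCLASS=1 (IN)
    return header + qname + struct.pack(">HH", 16, 1)

def bulk_txt_lookup(domains, nameserver="8.8.8.8", timeout=5.0):
    """Send one non-blocking UDP query per domain, then poll with select()."""
    pending = {}  # socket -> domain
    for domain in domains:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.setblocking(False)
        s.sendto(build_txt_query(domain), (nameserver, 53))
        pending[s] = domain
    raw_responses = {}
    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        readable, _, _ = select.select(list(pending), [], [], remaining)
        for s in readable:
            data, _addr = s.recvfrom(4096)
            raw_responses[pending.pop(s)] = data  # raw wire-format reply
            s.close()
    for s, domain in pending.items():  # queries that timed out
        raw_responses[domain] = None
        s.close()
    return raw_responses
```

Note that select() on many platforms is limited to FD_SETSIZE (often 1024) file descriptors, so for very large batches you would either chunk the domains or swap select.select for select.poll/epoll, which the same module also exposes.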
