** Description changed:

- The bug occurs when the use-stale-cache feature is enabled alongside
- DNSSEC validation.
+ [ Impact ]
  
- When a query response is served from cache, dnsmasq immediately returns
- it to the client. However, this bypasses the normal retry mechanism that
- dnsmasq's DNSSEC implementation depends on. Specifically, dnsmasq
- expects clients to retry truncated DNSSEC queries over TCP.
+  * Enabling 'use-stale-cache' alongside DNSSEC validation can cause repeated
+    validation attempts for every cached response if the original DNSKEY
+    response is truncated.
+ 
+  * dnsmasq normally expects clients to retry truncated queries over TCP, but
+    'use-stale-cache' immediately serves a stale response to the client, never
+    triggering a TCP retry. Consequently, DNSSEC validation continuously fails.
+ 
+  * The upstream commit [1] fixes this issue by internally moving to TCP for
+    DNSSEC queries when validating UDP answers.
+ 
+ [1]:
+ https://github.com/imp/dnsmasq/commit/f5cdb007d8845dde8e5053229f47b46b1b756473
+ 
+ [ Test Plan ]
+ 
+  * Execute the following script:
+ 
+ ```
+ dnsport=5353
+ logfile=$(mktemp)
+ 
+ trap '
+     kill $(jobs -p) 2>/dev/null
+     wait $(jobs -p) 2>/dev/null
+ ' EXIT INT TERM
+ 
+ dnsmasq \
+     --no-daemon \
+     --port $dnsport \
+     --server=8.8.8.8 \
+     --no-resolv \
+     --log-queries \
+     --log-facility=- \
+     --use-stale-cache \
+     --dnssec \
+     --max-cache-ttl=1 \
+     --edns-packet-max=512 \
+     
--trust-anchor=.,20326,8,2,E06D44B80B8F1D39A95C0B0D7C65D08458E880409BBC683457104237C7F8EC8D
 \
+     2>&1 | tee $logfile &
+ 
+ sleep 2
+ 
+ echo
+ echo '1. Without TCP retry (+ignore): DNSSEC validation FAILS'
+ dig -p $dnsport @127.0.0.1 cloudflare.com +ignore 2>&1 | grep -E 
'status:|Truncated'
+ grep 'validation result' $logfile | head -1
+ 
+ echo
+ echo '2. With TCP retry: validation succeeds'
+ dig -p $dnsport @127.0.0.1 cloudflare.com 2>&1 | grep -E 'status:|Truncated'
+ grep 'SECURE' $logfile | head -1
+ 
+ echo
+ echo '3. From cache: returns instantly (0ms), background refresh has no TCP 
retry'
+ sleep 3
+ dig -p $dnsport @127.0.0.1 cloudflare.com 2>&1 | grep -E 'Query time|EDE'
+ grep -E 'cached-stale|forwarded' $logfile | tail -2
+ 
+ echo "full dnsmasq log in $logfile"
+ ```
+ 
+  * The most important section is the step "1. Without TCP retry (+ignore) 
...".
+    The validation result should be SECURE (and not TRUNCATED).
+ 
+ [ Where problems could occur ]
+ 
+  * The patch is not trivial and had to be backported to both Jammy and Noble
+    since it didn't apply cleanly.
+ 
+  * Since the patch introduces automatically initiated TCP connections,
+    regressions would likely manifest in TCP connection handling/network 
timeout
+    handling.
+ 
+  * The new internal TCP retries could theoretically hang or time out if an
+    upstream DNS server or intermediate firewall blocks TCP.
+ 
+ [ Other Info ]
+ 
+  * Fix already included in upstream versions >= 2.91.
+ 
+ [ Original Bug Description ]
+ The bug occurs when the use-stale-cache feature is enabled alongside DNSSEC
+ validation.
+ 
+ When a query response is served from cache, dnsmasq immediately returns it to
+ the client. However, this bypasses the normal retry mechanism that dnsmasq's
+ DNSSEC implementation depends on. Specifically, dnsmasq expects clients to
+ retry truncated DNSSEC queries over TCP.
  
  When background cache refresh requires DNSSEC validation and the DNSKEY
- response exceeds 1232 bytes (which is typical for the root DNSKEY), the
- query is truncated. Since the client never retries, validation fails.
- This triggers repeated validation attempts for every cached response. It
- has resulted in fleet-wide query storm that can persist for up to 48
- hours (the TTL of the root DNSKEY).
+ response exceeds 1232 bytes (which is typical for the root DNSKEY), the query
+ is truncated. Since the client never retries, validation fails. This triggers
+ repeated validation attempts for every cached response. It has resulted in
+ fleet-wide query storm that can persist for up to 48 hours (the TTL of the 
root
+ DNSKEY).
  
- This issue is fixed in upstream commit f5cdb00, which performs the TCP
- retry internally without requiring the client to trigger it. This fix is
- included in dnsmasq 2.91 but is not present in version 2.90 currently
- available in Jammy and Focal repositories.
+ This issue is fixed in upstream commit f5cdb00, which performs the TCP retry
+ internally without requiring the client to trigger it. This fix is included in
+ dnsmasq 2.91 but is not present in version 2.90 currently available in Jammy
+ and Focal repositories.

** Changed in: dnsmasq (Ubuntu Noble)
       Status: New => In Progress

** Changed in: dnsmasq (Ubuntu Jammy)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2138412

Title:
  DNSSEC validation with stale cache enabled does not properly retry
  truncated response

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/2138412/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to