On 23-May-22 07:21, Daniel Stenberg via curl-library wrote:
On Fri, 20 May 2022, Dmitry Karpov wrote:

I understand the rationale for keeping a DNS entry in the cache for both addresses, but in my proposal, I suggest using "dual-stack" DNS queries only for dual-stack and IPv6-only modes. This would make IPv4-only requests in IPv6-enabled libcurl builds behave the same way as they do in IPv4-only builds.

I believe that suggestion would basically revert 84d2839740ca7804, so it would need careful consideration.

Maybe we should instead add a variant of CURLOPT_IPRESOLVE that more explicitly means *also applies to name resolving*? We might need to do something about caching/connection reuse too, or at least decide and document exactly how those would work in these situations.

It seems that the cache state is not granular enough, and that a heuristic fix was attempted.  Adding more heuristics is not the right approach; it will make matters worse.

My take:

Only the DNS resolution(s) that IPRESOLVE allows should be done.  The cache should reflect what was done, and the result(s).  Subsequent requests with a different IPRESOLVE setting may hit on the name, but if the resolution for the new request's record type is missing, the missing record type should be resolved, and the cache updated.  That is, the per-hostname cache state for each record type is either unknown (no resolution attempted) or known, with 0 to n addresses of that type.
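
To illustrate the state described above, here is a minimal sketch in C. It is not curl's actual cache layout; the struct and function names are hypothetical. The point is only that "never resolved" is kept distinct from "resolved, zero addresses".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-record-type state for one hostname. */
struct family_state {
    int    resolved;  /* 0 = no resolution attempted, 1 = result is known */
    size_t naddrs;    /* meaningful only when resolved != 0; may be 0 */
};

/* Hypothetical cache entry: one state per record type. */
struct host_entry {
    const char *hostname;
    struct family_state a;     /* A records (IPv4) */
    struct family_state aaaa;  /* AAAA records (IPv6) */
};

/* Record the outcome of a completed lookup for one record type. */
static void record_result(struct family_state *fs, size_t naddrs)
{
    assert(fs != NULL);
    fs->resolved = 1;
    fs->naddrs = naddrs;
}

/* Nonzero when this record type still has to be resolved. */
static int needs_lookup(const struct family_state *fs)
{
    assert(fs != NULL);
    return !fs->resolved;
}
```

With this layout, an entry created by a V4-only request has a.resolved set and aaaa.resolved clear, so a later V6 or unrestricted request knows that only the AAAA lookup is missing.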

If IPRESOLVE is restricted to V4, the DNS request should only be for A records, and the cache entry should reflect A record(s) for the specified host, and that only A record state is known.  E.g. hostname, A, 0-n

If IPRESOLVE is restricted to V6, the same for AAAA.

If IPRESOLVE is unrestricted, then, and only then, the DNS request should be for both A & AAAA, and both status values and record types cached.
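
The three cases above amount to a small mapping from the IPRESOLVE setting to the record types to query. A sketch, with hypothetical names (the enum below only stands in for libcurl's CURL_IPRESOLVE_WHATEVER, CURL_IPRESOLVE_V4 and CURL_IPRESOLVE_V6 values):

```c
#include <assert.h>

/* Hypothetical stand-in for the handle's IPRESOLVE setting. */
enum ip_mode { MODE_ANY, MODE_V4_ONLY, MODE_V6_ONLY };

/* Hypothetical bit flags for the DNS record types to request. */
#define QUERY_A    0x1u
#define QUERY_AAAA 0x2u

/* Return the record types that should be queried for a given setting. */
static unsigned record_types_for(enum ip_mode mode)
{
    switch (mode) {
    case MODE_V4_ONLY: return QUERY_A;               /* A only */
    case MODE_V6_ONLY: return QUERY_AAAA;            /* AAAA only */
    case MODE_ANY:     return QUERY_A | QUERY_AAAA;  /* both */
    }
    assert(!"unknown ip_mode");
    return 0;
}
```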

With the correct state, there is no confusion when consulting the cache for a subsequent request that may have a different IPRESOLVE setting.

If the new request includes V4, and the host has a cache entry marked valid for V4, the entry tells us how many A records exist. If none, don't connect using V4.  If the entry is not valid for V4, do a new lookup and cache the result.

Same for V6.
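
The consult-then-fill step for one record type could look roughly like the sketch below. All names are hypothetical, and the resolver is a callback so the example stays self-contained. One assumption that goes beyond what I wrote above: a lookup that itself fails (returns -1) leaves the state unknown, so it can be retried later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cached state for one record type of one hostname. */
struct fam_state {
    int    known;  /* 0 = never resolved, 1 = count is valid */
    size_t count;  /* number of addresses, possibly 0 */
};

/* Hypothetical resolver callback: returns the number of addresses
 * found (>= 0), or -1 if the lookup itself failed. */
typedef int (*lookup_fn)(const char *host, int want_v6, void *ctx);

/* Return the address count for one record type, resolving only when
 * the cache has no answer yet. Returns -1 if the lookup failed. */
static int cached_count(struct fam_state *st, const char *host,
                        int want_v6, lookup_fn lookup, void *ctx)
{
    assert(st != NULL && lookup != NULL);
    if (!st->known) {
        int n = lookup(host, want_v6, ctx);
        if (n < 0)
            return -1;          /* assumption: leave the state unknown */
        st->known = 1;
        st->count = (size_t)n;  /* 0 is a valid, cacheable result */
    }
    return (int)st->count;
}

/* Stub resolver for illustration: counts its calls through ctx and
 * reports one A record and no AAAA records. */
static int stub_lookup(const char *host, int want_v6, void *ctx)
{
    (void)host;
    ++*(int *)ctx;
    return want_v6 ? 0 : 1;
}
```

Calling cached_count() twice for the same record type invokes the resolver once; a cached count of zero is also served from the cache.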

After consulting/updating the cache, check whether the IPRESOLVE setting for the curl handle produces a hit, i.e. (resolve_allowed(4) && >0 A) || (resolve_allowed(6) && >0 AAAA).  If it doesn't, fail the request.  Otherwise, try to connect using each available address.  (Using whatever preference/parallelism scheme you like - sequential, v4-first, v6-first, Happy Eyeballs... once you have the correct state, you can do the right things.)
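
Written out as C, the hit test is a single expression. This is only a sketch; the flag and function names are hypothetical, and the counts are assumed to come from cache entries that are already valid for the allowed families.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for the families the handle may use. */
#define ALLOW_V4 0x1u
#define ALLOW_V6 0x2u

/* Return 1 if at least one allowed family has an address, else 0.
 * n_a and n_aaaa are the cached A and AAAA address counts. */
static int have_usable_address(unsigned allowed, size_t n_a, size_t n_aaaa)
{
    assert((allowed & ~(ALLOW_V4 | ALLOW_V6)) == 0);
    return ((allowed & ALLOW_V4) && n_a > 0) ||
           ((allowed & ALLOW_V6) && n_aaaa > 0);
}
```

If this returns 0, the request fails; otherwise the connection attempts proceed over the addresses of the allowed families.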

Similarly, for the connection cache, record the protocol type (you can also get this from the peer's address length).  When checking the cache for a reusable connection, only consider cached connections that match the handle's IPRESOLVE constraints.
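
A sketch of that reuse check, again with hypothetical names and a flat array standing in for the real connection cache (which this does not try to model):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags for the families the handle may use. */
#define ALLOW_V4 0x1u
#define ALLOW_V6 0x2u

enum conn_family { CONN_V4, CONN_V6 };

/* Hypothetical cached connection: the family is recorded at connect time. */
struct cached_conn {
    const char      *host;
    enum conn_family family;
    int              in_use;
};

/* Nonzero if a connection of this family satisfies the handle's constraint. */
static int family_allowed(enum conn_family fam, unsigned allowed)
{
    return fam == CONN_V4 ? (allowed & ALLOW_V4) != 0
                          : (allowed & ALLOW_V6) != 0;
}

/* Return the first idle connection to host whose family is allowed,
 * or NULL if there is none. */
static struct cached_conn *find_reusable(struct cached_conn *pool, size_t n,
                                         const char *host, unsigned allowed)
{
    size_t i;
    assert(pool != NULL && host != NULL);
    for (i = 0; i < n; i++) {
        if (!pool[i].in_use &&
            strcmp(pool[i].host, host) == 0 &&
            family_allowed(pool[i].family, allowed))
            return &pool[i];
    }
    return NULL;
}
```

A V4-only handle therefore never picks up an idle IPv6 connection to the same host, and vice versa.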

You can implement the necessary state in several ways.

This solves the original problem and does no extraneous (speculative) DNS lookups.  Thus it also prevents this reporter's issue of the speculative lookups timing out.

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.


