On 25 Jul 2010, at 08:53, Adrian Georgescu wrote:

> These are all valid points. And DNS is not the only thing that causes this
> behavior: any operation can block, like a RADIUS or MySQL query, and the
> result is the same.
>
> The only feasible solution is possible with the new design that will deal
> asynchronously with such events. The proxy will not wait for the DNS answers
> in order to proceed with a new transaction.

That's why reSIProcate uses the ARES library: for asynchronous DNS queries.

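To illustrate the idea only (this is not the ARES API, and not how OpenSIPS or reSIProcate do it), here is a minimal Python sketch in which a lookup that times out does not hold up other lookups. It swaps in asyncio's `loop.getaddrinfo()`, which runs the blocking resolver in a worker thread rather than speaking DNS asynchronously itself. The names `resolve`, `demo`, `fake_lookup` and the `lookup` parameter are hypothetical, and the hostnames are placeholders.

```python
import asyncio
import socket
import time


async def resolve(host, timeout=1.0, lookup=None):
    """Return a sorted list of IP strings for host, or [] on timeout/failure.

    `lookup` is a hypothetical hook so the resolver can be replaced in tests.
    By default the event loop's getaddrinfo() is used; it runs the blocking
    resolver in a worker thread, so the event loop stays free meanwhile.
    """
    loop = asyncio.get_running_loop()
    if lookup is None:
        def lookup(h):
            return loop.getaddrinfo(h, None, type=socket.SOCK_STREAM)
    try:
        infos = await asyncio.wait_for(lookup(host), timeout)
    except (asyncio.TimeoutError, OSError):
        # On timeout the caller moves on; a worker thread that is still
        # resolving is abandoned, not interrupted.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr).
    return sorted({info[4][0] for info in infos})


async def demo():
    async def fake_lookup(host):
        # Stand-in resolver: "slow.example" never answers in time.
        await asyncio.sleep(5 if host == "slow.example" else 0.01)
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 0))]

    start = time.monotonic()
    results = await asyncio.gather(
        resolve("slow.example", timeout=0.2, lookup=fake_lookup),
        resolve("fast.example", timeout=0.2, lookup=fake_lookup),
    )
    return results, time.monotonic() - start


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Both lookups run concurrently, so the slow one costs at most its own timeout and the fast one is answered without waiting for it.
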
/O

>
> Adrian
>
> On Jul 25, 2010, at 2:16 AM, Stanisław Pitucha wrote:
>
>> Hi all,
>>
>> I wanted to collect some ideas on how you solve DNS connectivity
>> problems. I've run into those issues a couple of times already and don't
>> see a perfect solution so far. Maybe I can trigger some discussion:
>>
>> Some background:
>> - opensips blocks the child process while resolving a domain / querying ENUM
>> - standard resolver has minimum timeout = 1s
>> - standard resolver does only one query at a time and can cycle
>>   nameservers, but does not save state
>> I believe these are not real problems - just ugly legacy :) that we can
>> work around.
>>
>> The implication is that if you don't use a caching nameserver on your
>> side and you allow users to use routing based on a domain name (not very
>> hard - do you handle "302"s, record-routes, registration?), you're
>> basically screwed:
>>
>> 1. If you don't cache, any domain which times out will block a child for
>> at least 1s. If you use retries, you block for at least N s, where N =
>> number of nameservers. You can be DoS-ed with ~8 packets per second, in
>> standard configuration.
>>
>> 2. If you cycle N nameservers and one of them is down, you process
>> N-1 packets correctly, then block until timeout on the last one, then
>> process N-1 again, etc. - not good for a high-traffic proxy.
>>
>> 3. If you cache results, you're safe from random failures, but only if
>> you cache timeouts as negative results and keep the state of servers
>> being down, so you don't try to query them again. (nothing apart from
>> `dnsmasq` does that, AFAIK)
>>
>> 4. What solves half of the problem for me is `dnsmasq` - as far as I
>> know it's the only caching DNS server which can query all
>> nameservers in parallel. I get 4 times the needed DNS traffic, but I'm
>> never timing out connections if one of the servers is down. Also some
>> results come from cache, so it's only 2 times the traffic in reality.
>> The problem with `dnsmasq` is that it doesn't cache SRV and NAPTR
>> requests (doesn't cache the timeouts / NX responses for them either),
>> only A/AAAA/PTR/....
>>
>> 5. So even if you have a local caching and backup resolver in
>> `resolv.conf`, minimal timeout, parallel querying from the local cache,
>> saved state of upstream resolvers being down, and all internal
>> traffic routed via IPs... it takes only one person with a custom NAPTR
>> sending you to a custom SRV address which times out to kill all the traffic.
>>
>> So... what's your experience with this? Do you have some better
>> protection in place?
>> I'm considering adding negative caching of DNS timeouts and general
>> caching of SRV and NAPTR records into `dnsmasq` to complete my protection.
>> Do you know of any software which would solve those problems out of the box?
>>
>> Thanks,
>> Stan
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>

---
* Olle E Johansson - [email protected]
* Cell phone +46 70 593 68 51, Office +46 8 96 40 20, Sweden
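
Stan's point 3 and his proposal to cache DNS timeouts as negative results can be sketched roughly as below. This is only an illustration in Python, not `dnsmasq` code: the class `NegativeCache`, the function `cached_lookup`, the `query` callable, and the 30 s TTL are all assumptions made for the example.

```python
import time


class NegativeCache:
    """Remember lookups that timed out so repeats fail fast instead of blocking."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl        # seconds a timeout is remembered (assumed value)
        self.clock = clock    # injectable clock, useful for testing
        self._failed = {}     # (name, rrtype) -> expiry time

    def mark_timeout(self, name, rrtype):
        self._failed[(name.lower(), rrtype)] = self.clock() + self.ttl

    def is_blocked(self, name, rrtype):
        key = (name.lower(), rrtype)
        expiry = self._failed.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._failed[key]  # entry expired, allow a fresh query
            return False
        return True


def cached_lookup(cache, name, rrtype, query):
    """Call query(name, rrtype) unless a recent timeout for it is cached.

    `query` is a hypothetical resolver callable that raises TimeoutError when
    the upstream does not answer. Returns None for a cached or fresh timeout.
    """
    if cache.is_blocked(name, rrtype):
        return None
    try:
        return query(name, rrtype)
    except TimeoutError:
        cache.mark_timeout(name, rrtype)
        return None
```

With such a cache in front of the resolver, a hostile SRV or NAPTR target that never answers costs one timeout per TTL period instead of one timeout per request. Keying on the record type as well as the name means a timed-out SRV query does not block A lookups for the same name.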
