It'll go much quicker if you send a PR to the asyncio github project. Thanks!
--Guido (mobile) On Dec 9, 2015 11:56 AM, "A. Jesse Jiryu Davis" <[email protected]> wrote: > Thanks for your kind words about the crawler chapter. =) I think we didn't > hit this problem there because the crawler only looked up one domain, or a > few of them. On Mac, subsequent calls are cached at the OS layer for a few > minutes, so getaddrinfo is very fast, even though it's serialized by the > getaddrinfo lock. Besides I don't think the crawler has a connection > timeout, so if it *were *looking up hundreds of domains it would wait as > long as necessary for the lock. > > In this bug report, on the other hand, there are hundreds of coroutines > all waiting to look up different domains, and some of those domains take > several seconds to resolve. Resolving hundreds of different domains, plus > the getaddrinfo lock, plus Motor's 20-second connection timeout, all > combine to create this bad behavior. > > I like option #4 also; that gives me the freedom to either cache lookups > in Motor, or to treat "localhost" specially, or at least to prevent a slow > getaddrinfo from appearing to be a connection timeout. I haven't decided > which is the best approach, but #4 allows me flexibility. Would you like me > to write the patch or will you? > > I'm curious about #5 too, but I think that's a longer term project. > > On Wednesday, December 9, 2015 at 12:57:53 PM UTC-5, Guido van Rossum > wrote: >> >> 4. Modify asyncio's getaddrinfo() so that if you pass it something that >> looks like a numerical address (IPv4 or IPv6 syntax, looking exactly like >> what getaddrinfo() returns) it skips calling socket.getaddrinfo(). I've >> wanted this for a while but hadn't run into this queue issue, so this is my >> favorite. >> >> 5. Do the research needed to prove that socket.getaddrinfo() on OS X (or >> perhaps on sufficiently recent versions of OS X) is thread-safe and submit >> a patch that avoids the getaddrinfo lock on those versions of OS X. >> >> FWIW, IIRC my crawler example (which your editing made so much better!) >> calls getaddrinfo() for every connection and I hadn't experienced any >> slowness there. But maybe in the Motor example you're hitting a slow DNS >> server? (I'm guessing there are lots of system configuration parameters >> that may make the system's getaddrinfo() slower or faster, and in your >> setup it may well be slower.) >> >> On Wed, Dec 9, 2015 at 7:05 AM, A. Jesse Jiryu Davis < >> [email protected]> wrote: >> >>> Thanks Guido, this all makes sense. >>> >>> One problem, though, is that even if I call getaddrinfo myself in Motor >>> and wrap a cache around it, I *still* can't use the event >>> loop's create_connection() call. create_connection always >>> calls getaddrinfo, and even though it "should be quick", that doesn't >>> matter in this scenario: the time is spent waiting for the getaddrinfo >>> lock. So I would need one of: >>> >>> 1. Copy the body of create_connection into Motor so I can customize the >>> getaddrinfo call. I especially don't like this because I'd have to call the >>> loop's private _create_connection_transport() from Motor's >>> customized create_connection(). Perhaps we could >>> make _create_connection_transport public, or otherwise make a public API >>> for separating getaddrinfo from actually establishing the connection? >>> >>> 2. Make getaddrinfo customizable in asyncio ( >>> https://github.com/python/asyncio/issues/160). This isn't ideal, since >>> it requires Motor users on Mac / BSD to change configuration for the whole >>> event loop just so Motor's specific create_connection calls behave >>> correctly. >>> >>> 3. Back to the original proposal: add a connection timeout parameter to >>> create_connection. =) >>> >>> On Tuesday, December 8, 2015 at 4:30:04 PM UTC-5, Guido van Rossum wrote: >>> >>>> On Tue, Dec 8, 2015 at 7:13 AM, A. Jesse Jiryu Davis < >>>> [email protected]> wrote: >>>> >>>>> Hi, a Motor user began an interesting discussion on the MongoDB-user >>>>> list: >>>>> >>>>> https://groups.google.com/d/topic/mongodb-user/2oK6C3BrVKI/discussion >>>>> >>>>> The summary is this: he's fetching hundreds of URLs concurrently and >>>>> inserting the results into MongoDB with Motor. Motor throws lots of >>>>> connection-timeout errors. The problem is getaddrinfo: on Mac, Python only >>>>> allows one getaddrinfo call at a time. With hundreds of HTTP fetches in >>>>> progress, there's a long queue waiting for the getaddrinfo lock. Whenever >>>>> Motor wants to grow its connection pool it has to call getaddrinfo on >>>>> "localhost", and it spends so long waiting for that call, it times out and >>>>> thinks it can't reach MongoDB. >>>>> >>>> >>>> If it's really looking up "localhost" over and over, maybe wrap a cache >>>> around getaddrinfo()? >>>> >>>> >>>>> Motor's connection-timeout implementation in asyncio is sort of wrong: >>>>> >>>>> coro = asyncio.open_connection(host, port) >>>>> sock = yield from asyncio.wait_for(coro, timeout) >>>>> >>>>> The timer runs during the call to getaddrinfo, as well as the call to >>>>> the loop's sock_connect(). This isn't the intention: the timeout should >>>>> apply only to the connection. >>>>> >>>>> A philosophical digression: The "connection timeout" is a heuristic. >>>>> "If I've waited N seconds and haven't established the connection, I >>>>> probably never will. Give up." Based on what they know about their own >>>>> networks, users can tweak the connection timeout. In a fast network, a >>>>> server that hasn't responded in 20ms is probably down; but on a global >>>>> network, 10 seconds might be reasonable. Regardless, the heuristic only >>>>> applies to the actual TCP connection. Waiting for getaddrinfo is not >>>>> related; that's up to the operating system. >>>>> >>>>> In a multithreaded client like PyMongo we distinguish the two phases: >>>>> >>>>> for res in socket.getaddrinfo(host, port, family, >>>>> socket.SOCK_STREAM): >>>>> af, socktype, proto, dummy, sa = res >>>>> sock = socket.socket(af, socktype, proto) >>>>> try: >>>>> sock.settimeout(connect_timeout) >>>>> >>>>> # THE TIMEOUT ONLY APPLIES HERE. >>>>> sock.connect(sa) >>>>> sock.settimeout(None) >>>>> return sock >>>>> except socket.error as e: >>>>> # Connection refused, or not established within the >>>>> timeout. >>>>> sock.close() >>>>> >>>>> Here, the call to getaddrinfo isn't timed at all, and each distinct >>>>> attempt to connect on a different address is timed separately. So this >>>>> kind >>>>> of code matches the idea of a "connect timeout" as a heuristic for >>>>> deciding >>>>> whether the server is down. >>>>> >>>>> Two questions: >>>>> >>>>> 1. Should asyncio.open_connection support a connection timeout that >>>>> acts like the blocking version above? That is, a connection timeout that >>>>> does not include getaddrinfo, and restarts for each address we attempt to >>>>> connect to? >>>>> >>>> >>>> Hm, I don't really like adding timeouts to every API. As you describe >>>> everyone has different needs. IMO if you don't want the timeout to cover >>>> the getaddrinfo() call, call getaddrinfo() yourself and pass the host >>>> address into the create_connection() call. That way you also have control >>>> over whether to e.g. implement "happy eyeballs". (It will still call >>>> socket.getaddrinfo(), but it should be quick -- it's not going to a DNS >>>> server or even /etc/hosts to discover that 127.0.0.1 maps to 127.0.0.1.) >>>> >>>> >>>>> >>>>> 2. Why does Python lock around getaddrinfo on Mac and Windows anyway? >>>>> The code comment says these are "systems on which getaddrinfo() is >>>>> believed >>>>> to not be thread-safe". Has this belief ever been confirmed? >>>>> >>>>> >>>>> https://hg.python.org/cpython/file/d2b8354e87f5/Modules/socketmodule.c#l185 >>>>> >>>> >>>> I don't know -- the list of ifdefs seems to indicate this is a generic >>>> BSD issue, which is OS X's heritage. Maybe someone can do an experiment, or >>>> review the source code used by Apple (if it's still open source)? While I >>>> agree that if this really isn't an issue we shouldn't bother with the lock, >>>> I'd also much rather be safe than sorry when it comes to races in core >>>> Python. >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >
