Committed here: https://github.com/python/asyncio/commit/39c135baf73762830148236da622787052efba19

Yury, would you please update the CPython repo when the time is right?

On Wednesday, December 9, 2015 at 6:05:46 PM UTC-5, A. Jesse Jiryu Davis wrote:
>
> Done: https://github.com/python/asyncio/pull/302
>
> On Wednesday, December 9, 2015 at 2:58:07 PM UTC-5, Guido van Rossum wrote:
>>
>> It'll go much quicker if you send a PR to the asyncio github project.
>> Thanks!
>>
>> --Guido (mobile)
>> On Dec 9, 2015 11:56 AM, "A. Jesse Jiryu Davis" <[email protected]> wrote:
>>
>>> Thanks for your kind words about the crawler chapter. =) I think we
>>> didn't hit this problem there because the crawler only looked up one
>>> domain, or a few of them. On Mac, subsequent calls are cached at the OS
>>> layer for a few minutes, so getaddrinfo is very fast, even though it's
>>> serialized by the getaddrinfo lock. Besides, I don't think the crawler
>>> has a connection timeout, so if it *were* looking up hundreds of domains
>>> it would wait as long as necessary for the lock.
>>>
>>> In this bug report, on the other hand, there are hundreds of coroutines
>>> all waiting to look up different domains, and some of those domains take
>>> several seconds to resolve. Resolving hundreds of different domains, plus
>>> the getaddrinfo lock, plus Motor's 20-second connection timeout, all
>>> combine to create this bad behavior.
>>>
>>> I like option #4 also; that gives me the freedom to either cache lookups
>>> in Motor, or to treat "localhost" specially, or at least to prevent a
>>> slow getaddrinfo from appearing to be a connection timeout. I haven't
>>> decided which is the best approach, but #4 allows me flexibility. Would
>>> you like me to write the patch or will you?
>>>
>>> I'm curious about #5 too, but I think that's a longer-term project.
>>>
>>> On Wednesday, December 9, 2015 at 12:57:53 PM UTC-5, Guido van Rossum wrote:
>>>>
>>>> 4.
>>>> Modify asyncio's getaddrinfo() so that if you pass it something that
>>>> looks like a numerical address (IPv4 or IPv6 syntax, looking exactly
>>>> like what getaddrinfo() returns) it skips calling socket.getaddrinfo().
>>>> I've wanted this for a while but hadn't run into this queue issue, so
>>>> this is my favorite.
>>>>
>>>> 5. Do the research needed to prove that socket.getaddrinfo() on OS X
>>>> (or perhaps on sufficiently recent versions of OS X) is thread-safe and
>>>> submit a patch that avoids the getaddrinfo lock on those versions of
>>>> OS X.
>>>>
>>>> FWIW, IIRC my crawler example (which your editing made so much better!)
>>>> calls getaddrinfo() for every connection and I hadn't experienced any
>>>> slowness there. But maybe in the Motor example you're hitting a slow DNS
>>>> server? (I'm guessing there are lots of system configuration parameters
>>>> that may make the system's getaddrinfo() slower or faster, and in your
>>>> setup it may well be slower.)
>>>>
>>>> On Wed, Dec 9, 2015 at 7:05 AM, A. Jesse Jiryu Davis <[email protected]> wrote:
>>>>
>>>>> Thanks Guido, this all makes sense.
>>>>>
>>>>> One problem, though, is that even if I call getaddrinfo myself in
>>>>> Motor and wrap a cache around it, I *still* can't use the event loop's
>>>>> create_connection() call. create_connection always calls getaddrinfo,
>>>>> and even though it "should be quick", that doesn't matter in this
>>>>> scenario: the time is spent waiting for the getaddrinfo lock. So I
>>>>> would need one of:
>>>>>
>>>>> 1. Copy the body of create_connection into Motor so I can customize
>>>>> the getaddrinfo call. I especially don't like this because I'd have to
>>>>> call the loop's private _create_connection_transport() from Motor's
>>>>> customized create_connection(). Perhaps we could make
>>>>> _create_connection_transport public, or otherwise make a public API
>>>>> for separating getaddrinfo from actually establishing the connection?
>>>>>
>>>>> 2.
>>>>> Make getaddrinfo customizable in asyncio
>>>>> (https://github.com/python/asyncio/issues/160). This isn't ideal,
>>>>> since it requires Motor users on Mac / BSD to change configuration for
>>>>> the whole event loop just so Motor's specific create_connection calls
>>>>> behave correctly.
>>>>>
>>>>> 3. Back to the original proposal: add a connection timeout parameter
>>>>> to create_connection. =)
>>>>>
>>>>> On Tuesday, December 8, 2015 at 4:30:04 PM UTC-5, Guido van Rossum wrote:
>>>>>>
>>>>>> On Tue, Dec 8, 2015 at 7:13 AM, A. Jesse Jiryu Davis <[email protected]> wrote:
>>>>>>
>>>>>>> Hi, a Motor user began an interesting discussion on the MongoDB-user
>>>>>>> list:
>>>>>>>
>>>>>>> https://groups.google.com/d/topic/mongodb-user/2oK6C3BrVKI/discussion
>>>>>>>
>>>>>>> The summary is this: he's fetching hundreds of URLs concurrently and
>>>>>>> inserting the results into MongoDB with Motor. Motor throws lots of
>>>>>>> connection-timeout errors. The problem is getaddrinfo: on Mac, Python
>>>>>>> only allows one getaddrinfo call at a time. With hundreds of HTTP
>>>>>>> fetches in progress, there's a long queue waiting for the getaddrinfo
>>>>>>> lock. Whenever Motor wants to grow its connection pool it has to call
>>>>>>> getaddrinfo on "localhost", and it spends so long waiting for that
>>>>>>> call that it times out and thinks it can't reach MongoDB.
>>>>>>
>>>>>> If it's really looking up "localhost" over and over, maybe wrap a
>>>>>> cache around getaddrinfo()?
>>>>>>
>>>>>>> Motor's connection-timeout implementation in asyncio is sort of
>>>>>>> wrong:
>>>>>>>
>>>>>>>     coro = asyncio.open_connection(host, port)
>>>>>>>     sock = yield from asyncio.wait_for(coro, timeout)
>>>>>>>
>>>>>>> The timer runs during the call to getaddrinfo, as well as the call
>>>>>>> to the loop's sock_connect(). This isn't the intention: the timeout
>>>>>>> should apply only to the connection.
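[The separation described above — an untimed lookup followed by a timed connect — can be sketched in modern asyncio syntax roughly as follows. This is a hypothetical helper, not Motor's actual code; the thread's snippets use the era's `yield from` style, but the idea is the same.]

```python
import asyncio
import socket

async def connect_with_timeout(host, port, connect_timeout):
    # Sketch: resolve first, with no timer running, then apply the
    # timeout only to each connection attempt, as PyMongo does.
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    last_exc = None
    for family, socktype, proto, _, addr in infos:
        sock = socket.socket(family, socktype, proto)
        sock.setblocking(False)
        try:
            # The timer restarts for each address we try.
            await asyncio.wait_for(loop.sock_connect(sock, addr),
                                   connect_timeout)
            return sock
        except (OSError, asyncio.TimeoutError) as exc:
            last_exc = exc
            sock.close()
    raise last_exc or OSError("getaddrinfo returned no results")
```

[A slow or queued getaddrinfo still delays the connection here, but it can no longer masquerade as a connection timeout.]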
>>>>>>>
>>>>>>> A philosophical digression: the "connection timeout" is a heuristic.
>>>>>>> "If I've waited N seconds and haven't established the connection, I
>>>>>>> probably never will. Give up." Based on what they know about their
>>>>>>> own networks, users can tweak the connection timeout. In a fast
>>>>>>> network, a server that hasn't responded in 20 ms is probably down;
>>>>>>> but on a global network, 10 seconds might be reasonable. Regardless,
>>>>>>> the heuristic applies only to the actual TCP connection. Waiting for
>>>>>>> getaddrinfo is not related; that's up to the operating system.
>>>>>>>
>>>>>>> In a multithreaded client like PyMongo we distinguish the two phases:
>>>>>>>
>>>>>>>     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
>>>>>>>         af, socktype, proto, dummy, sa = res
>>>>>>>         sock = socket.socket(af, socktype, proto)
>>>>>>>         try:
>>>>>>>             sock.settimeout(connect_timeout)
>>>>>>>
>>>>>>>             # THE TIMEOUT ONLY APPLIES HERE.
>>>>>>>             sock.connect(sa)
>>>>>>>             sock.settimeout(None)
>>>>>>>             return sock
>>>>>>>         except socket.error as e:
>>>>>>>             # Connection refused, or not established within the
>>>>>>>             # timeout.
>>>>>>>             sock.close()
>>>>>>>
>>>>>>> Here, the call to getaddrinfo isn't timed at all, and each distinct
>>>>>>> attempt to connect on a different address is timed separately. So
>>>>>>> this kind of code matches the idea of a "connect timeout" as a
>>>>>>> heuristic for deciding whether the server is down.
>>>>>>>
>>>>>>> Two questions:
>>>>>>>
>>>>>>> 1. Should asyncio.open_connection support a connection timeout that
>>>>>>> acts like the blocking version above? That is, a connection timeout
>>>>>>> that does not include getaddrinfo, and restarts for each address we
>>>>>>> attempt to connect to?
>>>>>>
>>>>>> Hm, I don't really like adding timeouts to every API. As you
>>>>>> describe, everyone has different needs.
>>>>>> IMO if you don't want the timeout to cover the getaddrinfo() call,
>>>>>> call getaddrinfo() yourself and pass the host address into the
>>>>>> create_connection() call. That way you also have control over whether
>>>>>> to e.g. implement "happy eyeballs". (It will still call
>>>>>> socket.getaddrinfo(), but it should be quick -- it's not going to a
>>>>>> DNS server or even /etc/hosts to discover that 127.0.0.1 maps to
>>>>>> 127.0.0.1.)
>>>>>>
>>>>>>> 2. Why does Python lock around getaddrinfo on Mac and Windows
>>>>>>> anyway? The code comment says these are "systems on which
>>>>>>> getaddrinfo() is believed to not be thread-safe". Has this belief
>>>>>>> ever been confirmed?
>>>>>>>
>>>>>>> https://hg.python.org/cpython/file/d2b8354e87f5/Modules/socketmodule.c#l185
>>>>>>
>>>>>> I don't know -- the list of ifdefs seems to indicate this is a
>>>>>> generic BSD issue, which is OS X's heritage. Maybe someone can do an
>>>>>> experiment, or review the source code used by Apple (if it's still
>>>>>> open source)? While I agree that if this really isn't an issue we
>>>>>> shouldn't bother with the lock, I'd also much rather be safe than
>>>>>> sorry when it comes to races in core Python.
>>>>>>
>>>>>> --
>>>>>> --Guido van Rossum (python.org/~guido)
>>>>
>>>> --
>>>> --Guido van Rossum (python.org/~guido)
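[Guido's "wrap a cache around getaddrinfo()" suggestion could be sketched like this. `cached_getaddrinfo` is an invented name for illustration, not Motor's code, and a production cache would expire entries since DNS answers carry TTLs.]

```python
import socket
from functools import lru_cache

@lru_cache(maxsize=256)
def cached_getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    # Only the first lookup for a given key waits behind the
    # getaddrinfo lock; repeat calls are served from the cache.
    # Note: entries here never expire, which ignores DNS TTLs.
    return socket.getaddrinfo(host, port, family, type, proto, flags)
```

[In the bug report's "localhost" scenario, only the first pool-growth connection would ever queue for the lock; every later one would hit the cache.]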

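[The commit linked at the top implements option #4. Its core idea — recognizing that a host string is already a numeric address, so no lookup is needed — can be sketched with the stdlib `ipaddress` module. The committed patch's actual checks differ, and `is_numeric_host` is an invented name.]

```python
from ipaddress import ip_address

def is_numeric_host(host):
    # True for literal addresses like "127.0.0.1" or "::1", which need
    # no DNS resolution and so could bypass socket.getaddrinfo() --
    # and the getaddrinfo lock -- entirely.
    try:
        ip_address(host)
        return True
    except ValueError:
        return False
```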