I've determined that getaddrinfo is thread-safe on Mac OS 10.5+ and 
submitted a patch to disable the getaddrinfo lock on those systems:

http://bugs.python.org/issue25924
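The actual patch is a compile-time #ifdef in Modules/socketmodule.c, but the idea can be sketched in Python (a rough illustration only; the helper name needs_getaddrinfo_lock is invented here):

```python
# Hypothetical sketch: the real fix is a compile-time check in C, but the
# logic amounts to "lock only on Mac OS X releases older than 10.5".
import platform

def needs_getaddrinfo_lock():
    if platform.system() != "Darwin":
        return False  # simplification: only modeling the Mac case here
    version = platform.mac_ver()[0] or "0.0"
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) < (10, 5)
```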

On Wednesday, December 16, 2015 at 7:25:19 PM UTC-5, Guido van Rossum wrote:
>
> Thanks a bundle, Jesse!
>
> On Wed, Dec 16, 2015 at 4:24 PM, A. Jesse Jiryu Davis <
> [email protected]> wrote:
>
>> Committed here:
>>
>>
>> https://github.com/python/asyncio/commit/39c135baf73762830148236da622787052efba19
>>
>> Yury would you please update the CPython repo when the time is right?
>>
>>
>> On Wednesday, December 9, 2015 at 6:05:46 PM UTC-5, A. Jesse Jiryu Davis 
>> wrote:
>>>
>>> Done:
>>>
>>> https://github.com/python/asyncio/pull/302
>>>
>>> On Wednesday, December 9, 2015 at 2:58:07 PM UTC-5, Guido van Rossum 
>>> wrote:
>>>>
>>>> It'll go much quicker if you send a PR to the asyncio github project. 
>>>> Thanks!
>>>>
>>>> --Guido (mobile)
>>>> On Dec 9, 2015 11:56 AM, "A. Jesse Jiryu Davis" <[email protected]> 
>>>> wrote:
>>>>
>>>>> Thanks for your kind words about the crawler chapter. =) I think we 
>>>>> didn't hit this problem there because the crawler only looked up one 
>>>>> domain, or a few of them. On Mac, subsequent calls are cached at the OS 
>>>>> layer for a few minutes, so getaddrinfo is very fast, even though it's 
>>>>> serialized by the getaddrinfo lock. Besides, I don't think the crawler has 
>>>>> a connection timeout, so if it *were* looking up hundreds of domains it 
>>>>> would wait as long as necessary for the lock.
>>>>>
>>>>> In this bug report, on the other hand, there are hundreds of 
>>>>> coroutines all waiting to look up different domains, and some of those 
>>>>> domains take several seconds to resolve. Resolving hundreds of different 
>>>>> domains, plus the getaddrinfo lock, plus Motor's 20-second connection 
>>>>> timeout, all combine to create this bad behavior.
>>>>>
>>>>> I like option #4 also; that gives me the freedom to either cache 
>>>>> lookups in Motor, or to treat "localhost" specially, or at least to 
>>>>> prevent a slow getaddrinfo from appearing to be a connection timeout. I 
>>>>> haven't decided which is the best approach, but #4 allows me flexibility. 
>>>>> Would you like me to write the patch or will you?
>>>>>
>>>>> I'm curious about #5 too, but I think that's a longer term project.
>>>>>
>>>>> On Wednesday, December 9, 2015 at 12:57:53 PM UTC-5, Guido van Rossum 
>>>>> wrote:
>>>>>>
>>>>>> 4. Modify asyncio's getaddrinfo() so that if you pass it something 
>>>>>> that looks like a numerical address (IPv4 or IPv6 syntax, looking 
>>>>>> exactly like what getaddrinfo() returns) it skips calling 
>>>>>> socket.getaddrinfo(). I've wanted this for a while but hadn't run into 
>>>>>> this queue issue, so this is my favorite.
>>>>>>
>>>>>> 5. Do the research needed to prove that socket.getaddrinfo() on OS X 
>>>>>> (or perhaps on sufficiently recent versions of OS X) is thread-safe and 
>>>>>> submit a patch that avoids the getaddrinfo lock on those versions of OS X.
>>>>>>
>>>>>> FWIW, IIRC my crawler example (which your editing made so much 
>>>>>> better!) calls getaddrinfo() for every connection and I hadn't 
>>>>>> experienced any slowness there. But maybe in the Motor example you're 
>>>>>> hitting a slow DNS server? (I'm guessing there are lots of system 
>>>>>> configuration parameters that may make the system's getaddrinfo() 
>>>>>> slower or faster, and in your setup it may well be slower.)
>>>>>>
>>>>>> On Wed, Dec 9, 2015 at 7:05 AM, A. Jesse Jiryu Davis <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Thanks Guido, this all makes sense.
>>>>>>>
>>>>>>> One problem, though, is that even if I call getaddrinfo myself in 
>>>>>>> Motor and wrap a cache around it, I *still* can't use the event 
>>>>>>> loop's create_connection() call. create_connection always 
>>>>>>> calls getaddrinfo, and even though it "should be quick", that doesn't 
>>>>>>> matter in this scenario: the time is spent waiting for the getaddrinfo 
>>>>>>> lock. So I would need one of:
>>>>>>>
>>>>>>> 1. Copy the body of create_connection into Motor so I can customize 
>>>>>>> the getaddrinfo call. I especially don't like this because I'd have to 
>>>>>>> call the loop's private _create_connection_transport() from Motor's 
>>>>>>> customized create_connection(). Perhaps we could make 
>>>>>>> _create_connection_transport public, or otherwise make a public API 
>>>>>>> for separating getaddrinfo from actually establishing the connection?
>>>>>>>
>>>>>>> 2. Make getaddrinfo customizable in asyncio (
>>>>>>> https://github.com/python/asyncio/issues/160). This isn't ideal, 
>>>>>>> since it requires Motor users on Mac / BSD to change configuration for 
>>>>>>> the whole event loop just so Motor's specific create_connection calls 
>>>>>>> behave correctly.
>>>>>>>
>>>>>>> 3. Back to the original proposal: add a connection timeout parameter 
>>>>>>> to create_connection. =)
>>>>>>>
>>>>>>> On Tuesday, December 8, 2015 at 4:30:04 PM UTC-5, Guido van Rossum 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Tue, Dec 8, 2015 at 7:13 AM, A. Jesse Jiryu Davis <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi, a Motor user began an interesting discussion on the 
>>>>>>>>> MongoDB-user list:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://groups.google.com/d/topic/mongodb-user/2oK6C3BrVKI/discussion
>>>>>>>>>
>>>>>>>>> The summary is this: he's fetching hundreds of URLs concurrently 
>>>>>>>>> and inserting the results into MongoDB with Motor. Motor throws lots 
>>>>>>>>> of connection-timeout errors. The problem is getaddrinfo: on Mac, 
>>>>>>>>> Python only allows one getaddrinfo call at a time. With hundreds of 
>>>>>>>>> HTTP fetches in progress, there's a long queue waiting for the 
>>>>>>>>> getaddrinfo lock. Whenever Motor wants to grow its connection pool 
>>>>>>>>> it has to call getaddrinfo on "localhost", and it spends so long 
>>>>>>>>> waiting for that call, it times out and thinks it can't reach MongoDB.
>>>>>>>>>
>>>>>>>>
>>>>>>>> If it's really looking up "localhost" over and over, maybe wrap a 
>>>>>>>> cache around getaddrinfo()?
>>>>>>>>  
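A cache along those lines could be as small as this (a sketch only; real code would need to bound staleness, since OS-level resolution results can change):

```python
import functools
import socket

@functools.lru_cache(maxsize=256)
def cached_getaddrinfo(host, port):
    # The first call pays the getaddrinfo cost (and waits for the lock);
    # repeats for the same (host, port) are answered from the cache.
    return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
```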
>>>>>>>>
>>>>>>>>> Motor's connection-timeout implementation in asyncio is sort of 
>>>>>>>>> wrong:
>>>>>>>>>
>>>>>>>>>     coro = asyncio.open_connection(host, port)
>>>>>>>>>     sock = yield from asyncio.wait_for(coro, timeout)
>>>>>>>>>
>>>>>>>>> The timer runs during the call to getaddrinfo, as well as the call 
>>>>>>>>> to the loop's sock_connect(). This isn't the intention: the timeout 
>>>>>>>>> should apply only to the connection.
>>>>>>>>>
>>>>>>>>> A philosophical digression: The "connection timeout" is a 
>>>>>>>>> heuristic. "If I've waited N seconds and haven't established the 
>>>>>>>>> connection, I probably never will. Give up." Based on what they know 
>>>>>>>>> about their own networks, users can tweak the connection timeout. In 
>>>>>>>>> a fast network, a server that hasn't responded in 20ms is probably 
>>>>>>>>> down; but on a global network, 10 seconds might be reasonable. 
>>>>>>>>> Regardless, the heuristic only applies to the actual TCP connection. 
>>>>>>>>> Waiting for getaddrinfo is not related; that's up to the operating 
>>>>>>>>> system.
>>>>>>>>>
>>>>>>>>> In a multithreaded client like PyMongo we distinguish the two 
>>>>>>>>> phases:
>>>>>>>>>
>>>>>>>>>     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
>>>>>>>>>         af, socktype, proto, dummy, sa = res
>>>>>>>>>         sock = socket.socket(af, socktype, proto)
>>>>>>>>>         try:
>>>>>>>>>             sock.settimeout(connect_timeout)
>>>>>>>>>
>>>>>>>>>             # THE TIMEOUT ONLY APPLIES HERE.
>>>>>>>>>             sock.connect(sa)
>>>>>>>>>             sock.settimeout(None)
>>>>>>>>>             return sock
>>>>>>>>>         except socket.error as e:
>>>>>>>>>             # Connection refused, or not established within the timeout.
>>>>>>>>>             sock.close()
>>>>>>>>>
>>>>>>>>> Here, the call to getaddrinfo isn't timed at all, and each 
>>>>>>>>> distinct attempt to connect on a different address is timed 
>>>>>>>>> separately. So this kind of code matches the idea of a "connect 
>>>>>>>>> timeout" as a heuristic for deciding whether the server is down.
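An asyncio version of the same split might look like this (a sketch using modern async/await syntax rather than the yield from style of the period; the function name connect_with_timeout is invented here):

```python
import asyncio
import socket

async def connect_with_timeout(loop, host, port, timeout):
    # Name resolution is deliberately untimed, as in the blocking code above.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    last_exc = None
    for family, socktype, proto, _, addr in infos:
        sock = socket.socket(family, socktype, proto)
        sock.setblocking(False)
        try:
            # The timer restarts for each candidate address.
            await asyncio.wait_for(loop.sock_connect(sock, addr), timeout)
            return sock
        except (OSError, asyncio.TimeoutError) as exc:
            last_exc = exc
            sock.close()
    raise last_exc
```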
>>>>>>>>>
>>>>>>>>> Two questions:
>>>>>>>>>
>>>>>>>>> 1. Should asyncio.open_connection support a connection timeout 
>>>>>>>>> that acts like the blocking version above? That is, a connection 
>>>>>>>>> timeout that does not include getaddrinfo, and restarts for each 
>>>>>>>>> address we attempt to connect to?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hm, I don't really like adding timeouts to every API. As you 
>>>>>>>> describe, everyone has different needs. IMO if you don't want the 
>>>>>>>> timeout to cover the getaddrinfo() call, call getaddrinfo() yourself 
>>>>>>>> and pass the host address into the create_connection() call. That 
>>>>>>>> way you also have control over whether to e.g. implement "happy 
>>>>>>>> eyeballs". (It will still call socket.getaddrinfo(), but it should 
>>>>>>>> be quick -- it's not going to a DNS server or even /etc/hosts to 
>>>>>>>> discover that 127.0.0.1 maps to 127.0.0.1.)
>>>>>>>>  
>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2. Why does Python lock around getaddrinfo on Mac and Windows 
>>>>>>>>> anyway? The code comment says these are "systems on which 
>>>>>>>>> getaddrinfo() is believed to not be thread-safe". Has this belief 
>>>>>>>>> ever been confirmed?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://hg.python.org/cpython/file/d2b8354e87f5/Modules/socketmodule.c#l185
>>>>>>>>>
>>>>>>>>
>>>>>>>> I don't know -- the list of ifdefs seems to indicate this is a 
>>>>>>>> generic BSD issue, which is OS X's heritage. Maybe someone can do an 
>>>>>>>> experiment, or review the source code used by Apple (if it's still 
>>>>>>>> open source)? While I agree that if this really isn't an issue we 
>>>>>>>> shouldn't bother with the lock, I'd also much rather be safe than 
>>>>>>>> sorry when it comes to races in core Python.
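One cheap version of that experiment (a sketch only; timing can suggest serialization, but not crashing under threads cannot by itself prove thread-safety):

```python
import socket
import threading
import time

def timed_parallel_lookups(host, n):
    # If getaddrinfo is serialized by an interpreter-level lock, wall time
    # grows roughly linearly with n; if calls truly overlap, it stays flat.
    threads = [threading.Thread(target=socket.getaddrinfo, args=(host, 80))
               for _ in range(n)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```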
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> --Guido van Rossum (python.org/~guido)
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> --Guido van Rossum (python.org/~guido)
>>>>>>
>>>>>
>
>
> -- 
> --Guido van Rossum (python.org/~guido)
>
