On Tue, Dec 8, 2015 at 7:13 AM, A. Jesse Jiryu Davis <[email protected]>
wrote:

> Hi, a Motor user began an interesting discussion on the MongoDB-user list:
>
> https://groups.google.com/d/topic/mongodb-user/2oK6C3BrVKI/discussion
>
> The summary is this: he's fetching hundreds of URLs concurrently and
> inserting the results into MongoDB with Motor. Motor throws lots of
> connection-timeout errors. The problem is getaddrinfo: on Mac, Python only
> allows one getaddrinfo call at a time. With hundreds of HTTP fetches in
> progress, there's a long queue waiting for the getaddrinfo lock. Whenever
> Motor wants to grow its connection pool it has to call getaddrinfo on
> "localhost", and it spends so long waiting for that call, it times out and
> thinks it can't reach MongoDB.
>

If it's really looking up "localhost" over and over, maybe wrap a cache
around getaddrinfo()?
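
A minimal sketch of such a cache, assuming the lookups repeat the same
arguments and that a stale answer is acceptable (there is no expiry here).
The names cached_getaddrinfo and _resolve are hypothetical; nothing in Motor
or asyncio provides them.

```python
import functools
import socket


@functools.lru_cache(maxsize=128)
def _resolve(host, port, family, type, proto, flags):
    # Only the first lookup for a given argument tuple reaches the
    # (possibly lock-protected) socket.getaddrinfo() call.
    return tuple(socket.getaddrinfo(host, port, family, type, proto, flags))


def cached_getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    """Same return shape as socket.getaddrinfo(), served from a cache."""
    # Hand back a fresh list so callers cannot mutate the cached tuple.
    return list(_resolve(host, port, family, type, proto, flags))
```

Entries never expire in this sketch, so it only suits names whose addresses
do not change while the process runs, such as "localhost".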


> Motor's connection-timeout implementation in asyncio is sort of wrong:
>
>     coro = asyncio.open_connection(host, port)
>     sock = yield from asyncio.wait_for(coro, timeout)
>
> The timer runs during the call to getaddrinfo, as well as the call to the
> loop's sock_connect(). This isn't the intention: the timeout should apply
> only to the connection.
>
> A philosophical digression: The "connection timeout" is a heuristic. "If
> I've waited N seconds and haven't established the connection, I probably
> never will. Give up." Based on what they know about their own networks,
> users can tweak the connection timeout. In a fast network, a server that
> hasn't responded in 20ms is probably down; but on a global network, 10
> seconds might be reasonable. Regardless, the heuristic only applies to the
> actual TCP connection. Waiting for getaddrinfo is not related; that's up to
> the operating system.
>
> In a multithreaded client like PyMongo we distinguish the two phases:
>
>     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
>         af, socktype, proto, dummy, sa = res
>         sock = socket.socket(af, socktype, proto)
>         try:
>             sock.settimeout(connect_timeout)
>
>             # THE TIMEOUT ONLY APPLIES HERE.
>             sock.connect(sa)
>             sock.settimeout(None)
>             return sock
>         except socket.error as e:
>             # Connection refused, or not established within the timeout.
>             sock.close()
>
> Here, the call to getaddrinfo isn't timed at all, and each distinct
> attempt to connect on a different address is timed separately. So this kind
> of code matches the idea of a "connect timeout" as a heuristic for deciding
> whether the server is down.
>
> Two questions:
>
> 1. Should asyncio.open_connection support a connection timeout that acts
> like the blocking version above? That is, a connection timeout that does
> not include getaddrinfo, and restarts for each address we attempt to
> connect to?
>

Hm, I don't really like adding timeouts to every API. As you describe,
everyone has different needs. IMO, if you don't want the timeout to cover
the getaddrinfo() call, call getaddrinfo() yourself and pass the host
address into the create_connection() call. That way you also have control
over whether to e.g. implement "happy eyeballs". (It will still call
socket.getaddrinfo(), but it should be quick -- it's not going to a DNS
server or even /etc/hosts to discover that 127.0.0.1 maps to 127.0.0.1.)
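
The approach above can be sketched roughly as follows, using the newer
async/await syntax rather than "yield from". The function name
open_connection_timed and its connect_timeout parameter are hypothetical.
The sketch resolves the name first, outside any timer, then gives each
resolved address its own connect timeout.

```python
import asyncio
import socket


async def open_connection_timed(host, port, connect_timeout):
    loop = asyncio.get_running_loop()

    # Name resolution is NOT covered by the timeout.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)

    last_exc = None
    for family, _type, proto, _canonname, sockaddr in infos:
        try:
            # sockaddr[0] is already a numeric address, so the lookup that
            # create_connection() still performs should return quickly.
            # The timer restarts for every address attempted.
            return await asyncio.wait_for(
                asyncio.open_connection(
                    sockaddr[0], sockaddr[1], family=family, proto=proto
                ),
                connect_timeout,
            )
        except (OSError, asyncio.TimeoutError) as exc:
            # Refused, unreachable, or not established within the timeout.
            last_exc = exc

    raise last_exc or OSError("getaddrinfo returned no addresses")
```

Addresses are tried one after another here; a "happy eyeballs" variant
would race them instead.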


>
> 2. Why does Python lock around getaddrinfo on Mac and Windows anyway? The
> code comment says these are "systems on which getaddrinfo() is believed to
> not be thread-safe". Has this belief ever been confirmed?
>
> https://hg.python.org/cpython/file/d2b8354e87f5/Modules/socketmodule.c#l185
>

I don't know -- the list of ifdefs seems to indicate this is a generic BSD
issue, which is OS X's heritage. Maybe someone can do an experiment, or
review the source code used by Apple (if it's still open source)? While I
agree that if this really isn't an issue we shouldn't bother with the lock,
I'd also much rather be safe than sorry when it comes to races in core
Python.

-- 
--Guido van Rossum (python.org/~guido)
