On 4/1/2012 1:41 PM, John Nagle wrote:
On 4/1/2012 9:26 AM, Michael Torrie wrote:
On 03/31/2012 04:58 PM, John Nagle wrote:
Removed all "search" and "domain" entries from /etc/resolv.conf
It's a design bug in glibc. I just submitted a bug report.
http://sourceware.org/bugzilla/show_bug.cgi?id=13935
It only appears if you have a machine with a two-component domain
name ending in ".com" as the actual machine name. Most hosting
services generate some long arbitrary name as the primary name,
but I happen to have a server set up as "companyname.com".
The default rule for looking up domains in glibc is that the
"domain" is everything after the FIRST ".". Failed lookups
are retried with that "domain" appended. The idea, back
in the 1980s, was that if you're on "foo.bigcompany.com",
and look up "bar", it's looked up as "bar.bigcompany.com".
This idea backfires when the actual hostname only
has two components, and the search just appends ".com".
There is a "com.com" domain, and its owners receive this traffic.
They exploit it to send you (where else) to an ad-heavy page.
Try "python.com.com", for example, and you'll get an ad for a
Java database.
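
To make the rule concrete, here is a rough sketch of the lookup order
as described above. It is a simplified model, not glibc's actual code:
the function name glibc_style_candidates is made up for illustration,
and it ignores the "ndots" option and any explicit "search" list.

```python
def glibc_style_candidates(query, hostname):
    """Return the names tried under the 'domain = everything after
    the first dot of the host name' rule (simplified model)."""
    # Implicit search domain: everything after the FIRST "." in the
    # machine's own host name.
    _, _, domain = hostname.partition(".")

    candidates = [query]  # the name as given
    # A trailing dot marks the name as absolute, so nothing is appended.
    if domain and not query.endswith("."):
        # A failed lookup is retried with the "domain" appended.
        candidates.append(query + "." + domain)
    return candidates


# Host with a three-component name: the 1980s use case.
print(glibc_style_candidates("bar", "foo.bigcompany.com"))
# Host with a two-component name: the retry lands on "com.com".
print(glibc_style_candidates("python.com", "companyname.com"))
```

With the host named "companyname.com", the derived "domain" is just
"com", so a failed lookup of "python.com" is retried as
"python.com.com".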
The workaround in Python is to add the AI_CANONNAME flag
to getaddrinfo calls, then check that the returned domain
name matches the one put in.
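
A minimal sketch of that check, assuming a plain string comparison is
what is wanted. The function name resolve_checked and its "resolver"
parameter are hypothetical; the parameter exists only so the check can
be exercised without a network. socket.IPPROTO_TCP is used here for
the protocol argument.

```python
import socket


def resolve_checked(host, port="http", resolver=socket.getaddrinfo):
    """Resolve host and reject the answer if the canonical name
    returned by the resolver differs from the name asked for."""
    results = resolver(host, port, 0, 0,
                       socket.IPPROTO_TCP, socket.AI_CANONNAME)
    # With AI_CANONNAME, the canonical name is in field 3 of the
    # first result tuple.
    canonical = results[0][3].rstrip(".").lower()
    wanted = host.rstrip(".").lower()
    if canonical != wanted:
        raise LookupError("asked for %r but resolver answered for %r"
                          % (host, canonical))
    return results
```

Note that this rejects any name whose canonical name legitimately
differs from the one asked for, such as a name that is a CNAME for
another host.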
That workaround won't work for some domains. For example,
>>> socket.getaddrinfo(s,"http",0,0,socket.SOL_TCP,socket.AI_CANONNAME)
[(2, 1, 6, 'orig-10005.themarker.cotcdn.net', ('208.93.137.80', 80))]
Nor will adding options to /etc/resolv.conf work well, because
that file is overwritten by some system administration programs.
I may have to bring in "dnspython" to get a reliable DNS lookup.
John Nagle
--
http://mail.python.org/mailman/listinfo/python-list