Hi Sam,

Thanks a lot for your reply.
I also suspected DNS to be the root cause of this issue, so I moved to the OpenDNS servers, but I'm still getting the errors. I never suspected that my router might be one of the reasons it's failing, so I will try all your suggestions. I will first buy a new router: mine is 5 or 6 years old and I was already thinking of replacing it anyway. I will also try Google's servers, and probably the InMemoryDnsResolver, since it will allow me to retry the DNS requests if the first one fails.

As for the DNS cache, I thought I had one, but based on the way the application is responding, I suspect it is not being used.

So I will try all of that and provide feedback here in case someone faces the same kind of issue in the future.

JM

2012/7/24, Sam Crawford <[email protected]>:
> UnknownHostException is certainly related to DNS, so I would focus
> investigations there initially. It sounds like the DNS request is
> timing out, and this is not uncommon if your DNS server is a cheap
> home router (I have found that some of these do not handle concurrency
> very well). For example, these are the kinds of things I'd explore:
>
> 1) Are you definitely running a caching nameserver? Is it on your
> local machine? If not, you could explore this.
> 2) Have you tried a different DNS server? e.g. Google's at 8.8.8.8 or
> 8.8.4.4
> 3) Are the requests all for the same hostname(s)? If so, you could
> consider performing the lookups at the start, storing them in an
> InMemoryDnsResolver
> (http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/conn/InMemoryDnsResolver.html)
> and then never rely upon external DNS lookups after that.
>
> Hope this helps,
>
> Sam
>
>
> On 23 July 2012 19:16, Jean-Marc Spaggiari <[email protected]> wrote:
>> Hi,
>>
>> I have an application where I'm trying to read about 30 URLs at a time
>> from a list of 5000 URLs.
>>
>> I have implemented 30 threads to retrieve the content.
>>
>> I'm initialising the HttpClient this way:
>>
>> SchemeRegistry schemeRegistry = new SchemeRegistry();
>> schemeRegistry.register(new Scheme("http", 80,
>>     PlainSocketFactory.getSocketFactory()));
>> schemeRegistry.register(new Scheme("https", 443,
>>     SSLSocketFactory.getSocketFactory()));
>> PoolingClientConnectionManager cm =
>>     new PoolingClientConnectionManager(schemeRegistry);
>> cm.setMaxTotal(200);
>> cm.setDefaultMaxPerRoute(20);
>>
>> HttpParams params = new BasicHttpParams();
>> client = new DefaultHttpClient(cm, params);
>> client.getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT,
>>     new Integer(45000));
>> client.getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT,
>>     new Integer(45000));
>> client.getParams().setParameter(CoreConnectionPNames.TCP_NODELAY, false);
>>
>>
>> However, when I'm running my threads and retrieving the content, I'm
>> often getting an UnknownHostException on a host I know exists.
>>
>> For example, if I have 500 URLs from this host, some of them will be
>> retrieved correctly and some will throw an UnknownHostException.
>>
>> I'm wondering where it might be coming from.
>>
>> Each thread creates an HttpGet method and invokes it using the
>> client created above (only one instance of the client for the entire
>> application).
>>
>> Initially, I thought this was because of the DNS. But since it's the
>> same host that is sometimes working, that means it should be in the
>> cache by now.
>>
>> Would it be better for me to create one client per thread?
>>
>> Thanks for your comments.
>>
>> JM
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
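
For anyone landing on this thread with the same problem, the "look up once at the start, retry if the lookup fails, then stop relying on external DNS" idea discussed above could be sketched roughly as follows. This is only a sketch under assumptions: DnsPreResolver and Lookup are hypothetical names of mine (they are not part of HttpClient), and the attempt count and pause are arbitrary values, not recommendations.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of HttpClient): resolves a host once, with retries, and caches it. */
final class DnsPreResolver {

    /** Abstraction over the real lookup so it can be swapped or faked in tests. */
    interface Lookup {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final Lookup lookup;
    private final int maxAttempts;
    private final long pauseMillis;
    private final Map<String, InetAddress[]> cache = new ConcurrentHashMap<>();

    DnsPreResolver(Lookup lookup, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.lookup = lookup;
        this.maxAttempts = maxAttempts;
        this.pauseMillis = pauseMillis;
    }

    /** Returns the cached addresses, performing the lookup (with retries) on first use. */
    InetAddress[] resolve(String host) throws UnknownHostException, InterruptedException {
        InetAddress[] cached = cache.get(host);
        if (cached != null) {
            return cached;
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                InetAddress[] addresses = lookup.resolve(host);
                cache.put(host, addresses);
                return addresses;
            } catch (UnknownHostException e) {
                last = e; // remember the failure and retry after a short pause
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        // Assumed settings: 3 attempts, 500 ms between attempts.
        DnsPreResolver resolver = new DnsPreResolver(InetAddress::getAllByName, 3, 500);
        for (InetAddress address : resolver.resolve("localhost")) {
            System.out.println(address);
        }
    }
}
```

As far as I know, with HttpClient 4.2 the resolved addresses could then be registered through InMemoryDnsResolver.add(host, addresses), and the resolver handed to the PoolingClientConnectionManager constructor that accepts a DnsResolver, so the 30 worker threads never trigger an external lookup. Please check both against the API docs linked above before relying on them.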
