UnknownHostException is certainly related to DNS, so I would focus investigations there initially. It sounds like some of the DNS requests are timing out under concurrent load, which is not uncommon if your DNS server is a cheap home router (I have found that some of these do not handle concurrency very well). For example, these are the kinds of things I'd explore:
1) Are you definitely running a caching nameserver? Is it on your local machine? If not, it may be worth setting one up.
2) Have you tried a different DNS server, e.g. Google's at 8.8.8.8 or 8.8.4.4?
3) Are the requests all for the same hostname(s)? If so, you could perform the lookups once at the start, store the results in an InMemoryDnsResolver (http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/conn/InMemoryDnsResolver.html) and then never rely upon external DNS lookups after that.

Hope this helps,

Sam

On 23 July 2012 19:16, Jean-Marc Spaggiari <[email protected]> wrote:
> Hi,
>
> I have an application where I'm trying to read about 30 URLs at a time
> from a list of 5000 URLs.
>
> I have implemented 30 threads to retrieve the content.
>
> I'm initialising the HttpClient this way:
>
> SchemeRegistry schemeRegistry = new SchemeRegistry();
> schemeRegistry.register(new Scheme("http", 80,
>     PlainSocketFactory.getSocketFactory()));
> schemeRegistry.register(new Scheme("https", 443,
>     SSLSocketFactory.getSocketFactory()));
> PoolingClientConnectionManager cm =
>     new PoolingClientConnectionManager(schemeRegistry);
> cm.setMaxTotal(200);
> cm.setDefaultMaxPerRoute(20);
>
> HttpParams params = new BasicHttpParams();
> client = new DefaultHttpClient(cm, params);
> client.getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT,
>     new Integer(45000));
> client.getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT,
>     new Integer(45000));
> client.getParams().setParameter(CoreConnectionPNames.TCP_NODELAY, false);
>
> However, when I'm running my threads and retrieving the content, I'm
> often getting an UnknownHostException for a host I know exists.
>
> For example, if I have 500 URLs from this host, some of them will be
> retrieved correctly and some will throw an UnknownHostException.
>
> I'm wondering where it might be coming from.
>
> Each thread creates an HttpGet request and executes it using the
> client created above (only one instance of the client for the entire
> application).
>
> Initially, I thought this was because of DNS. But since the same host
> sometimes works, it should be in the cache by now.
>
> Would it be better for me to create one client per thread?
>
> Thanks for your comments.
>
> JM
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
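
A rough sketch of suggestion 3 in the reply (resolve each hostname once up front, then answer every later lookup from memory), using only the JDK. The class name PreResolvedHosts and its methods preload and resolve are made up for illustration and are not part of HttpClient. The sketch also assumes the set of hostnames is known before the worker threads start.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: resolves hostnames once, then serves lookups from memory. */
final class PreResolvedHosts {
    // Thread-safe map so worker threads can read it concurrently after preload.
    private final Map<String, InetAddress[]> cache = new ConcurrentHashMap<>();

    /**
     * Resolves each host sequentially (one DNS query at a time, so a weak
     * resolver is not hit by many concurrent requests) and stores the result.
     * Returns the hosts that could not be resolved.
     */
    List<String> preload(Iterable<String> hosts) {
        List<String> failed = new ArrayList<>();
        for (String host : hosts) {
            if (cache.containsKey(host)) {
                continue; // already resolved earlier
            }
            try {
                cache.put(host, InetAddress.getAllByName(host));
            } catch (UnknownHostException e) {
                failed.add(host); // caller decides whether to retry later
            }
        }
        return failed;
    }

    /** Answers from memory only; never triggers an external DNS lookup. */
    InetAddress[] resolve(String host) throws UnknownHostException {
        InetAddress[] addresses = cache.get(host);
        if (addresses == null) {
            throw new UnknownHostException(host + " was not pre-resolved");
        }
        return addresses;
    }
}
```

With HttpClient 4.2 the same idea should map onto InMemoryDnsResolver: call add(host, addresses) for each pre-resolved host, then pass the resolver to the PoolingClientConnectionManager constructor that accepts a DnsResolver. That wiring has not been tested against the setup quoted above, so treat it as an assumption.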
