Hello HC Experts,

I would be very grateful for advice regarding my question. I have already spent a lot of time searching the internet, but I still have not found an example that answers it. There are a lot of examples available (also for multithreaded use cases), but they only address the use case of making one(!!) request. I am completely uncertain how to "best" make a series of requests (to the same web server).
I need to develop a simple crawler that crawls some websites for specific information. The basic idea is to download the individual pages of one website (for example www.a.com) sequentially, but to run several of these "sequential" downloaders in parallel threads for different websites (www.b.com and www.c.com).

My current concept/implementation looks like this:

1. Instantiate a ThreadSafeClientConnManager (with a lot of default parameters). This connection manager is shared by all DefaultHttpClient instances.
2. For every(!!) page request (each website has multiple pages), instantiate a new DefaultHttpClient and then call "httpClient.execute(httpGet)" with a newly instantiated HttpGet(url).

I am more and more wondering whether this is the correct usage of DefaultHttpClient and the .execute() method. Am I doing something wrong by instantiating a new DefaultHttpClient for every page request? Or should I rather instantiate only one(!!) DefaultHttpClient and share it across the sequential .execute() calls?

To be honest, what I also have not really understood yet is the cookie management. Do I, as the programmer, have to manage the CookieStore manually, like this:

1. Instantiate the cookie store: httpClient.setCookieStore(new BasicCookieStore());
2. After calling .execute(), "get" the cookie store: savedcookies = httpClient.getCookieStore();
3. Re-inject this cookie store before the next call to the same website (to maintain state): httpClient.setCookieStore(savedcookies);

Or is there some implicit magic that A) creates the cookie store implicitly and B) somehow shares this CookieStore among the HttpClient and/or HttpGet instances?

Thank you very much!!

Jens
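
P.S. In case it helps to make the setup concrete, here is a rough sketch of the threading layout described above, written with plain JDK classes only. It is not my real code: CrawlerSketch, crawlSite, crawlAll and the fetcher function are made-up names, and the fetcher merely stands in for wherever the "httpClient.execute(httpGet)" call would go. Whether that call should use a fresh or a shared DefaultHttpClient is exactly the open question, so the sketch deliberately leaves it out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: one worker per website, pages of a site fetched in order.
class CrawlerSketch {

    // Fetches every page of ONE website sequentially, in the order given.
    // "fetcher" is a placeholder for the real HTTP call.
    static List<String> crawlSite(List<String> pageUrls, Function<String, String> fetcher) {
        List<String> bodies = new ArrayList<>();
        for (String url : pageUrls) {
            bodies.add(fetcher.apply(url));
        }
        return bodies;
    }

    // Submits one sequential crawlSite task per website; the tasks for
    // different websites run in parallel on a fixed-size thread pool.
    static Map<String, List<String>> crawlAll(Map<String, List<String>> pagesBySite,
                                              Function<String, String> fetcher,
                                              int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> site : pagesBySite.entrySet()) {
                List<String> pages = site.getValue();
                pending.put(site.getKey(), pool.submit(() -> crawlSite(pages, fetcher)));
            }
            // Wait for every website's task and collect its page bodies.
            Map<String, List<String>> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> entry : pending.entrySet()) {
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> sites = new LinkedHashMap<>();
        sites.put("www.a.com", Arrays.asList("http://www.a.com/1", "http://www.a.com/2"));
        sites.put("www.b.com", Arrays.asList("http://www.b.com/1"));
        sites.put("www.c.com", Arrays.asList("http://www.c.com/1"));

        // Stub fetcher: no network access, just echoes the URL.
        Function<String, String> stubFetcher = url -> "body of " + url;

        System.out.println(crawlAll(sites, stubFetcher, 3));
    }
}
```

The stub fetcher keeps the sketch runnable without any network access or HttpClient on the classpath; in the real crawler it would be replaced by the actual request code.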
