On Wed, 2010-01-27 at 20:42 +0100, Jens Mueller
[email protected] wrote:
> Hello HC Experts,
> 
> I would be very grateful for advice regarding my question. I have already
> spent a lot of time searching the internet, but I still have not found an
> example that answers my questions. There are a lot of examples available (also
> for the multithreaded use cases), but they only address the use case of making
> one(!!) request. I am completely uncertain how to "best" make a series of
> requests (to the same webserver).
> 
> I need to develop a simple crawler that crawls some websites for specific
> information. The basic idea is to download the individual webpages of a website
> (for example www.a.com) sequentially, but run several of these "sequential"
> downloaders in threads for different websites (www.b.com and www.c.com) in
> parallel.
> 
> My current concept/implementation looks like this:
> 
> 1.  Instantiate a ThreadSafeClientConnManager (with a lot of default
> parameters). This connection manager will be used/shared by all
> DefaultHttpClient instances.
> 2.  For every webpage request (of a website with multiple webpages), I
> instantiate a new DefaultHttpClient and then call the
> "httpClient.execute(httpGet)" method with the instantiated GetMethod(url).
> 
> ==> I am increasingly wondering whether this is the correct usage of the
> DefaultHttpClient and the .execute() method. Am I doing something wrong
> by instantiating a new DefaultHttpClient for every request of a webpage?
> Or should I rather instantiate only one(!!) DefaultHttpClient and then share
> it for the sequential .execute() calls?
> 
> To be honest, what I also have not really understood yet is the cookie
> management. Do I, as the programmer, have to instantiate the CookieStore
> manually
> 1. httpClient.setCookieStore(new BasicCookieStore());
> and then, after calling the .execute() method, "get" the cookie store
> 2. savedcookies = httpClient.getCookieStore()
> and then reinject this cookie store for the next call to the same webpage (to
> maintain state)?
> 3. httpClient.setCookie(savedcookies)
> Or is there some implicit magic that A) creates the cookie store
> implicitly and B) somehow shares this CookieStore among the HttpClients
> and/or HttpGets?
> 
> Thank you very much!!
> Jens

Jens,

Re-use a single HttpClient instance for all execution threads, but create
a separate HttpContext and CookieStore per thread of execution /
individual user, as described by Ken.

Oleg
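
The advice above could be sketched roughly as follows, assuming the
HttpClient 4.0.x API (DefaultHttpClient, ThreadSafeClientConnManager) is on
the classpath. The class name CrawlerSketch, the helper methods, and the
page paths are hypothetical and only for illustration; the host names are
the placeholders from the question. Without a per-thread context,
DefaultHttpClient falls back to its own implicitly created cookie store,
which all requests on that client would then share.

```java
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.CookieStore;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.conn.ClientConnectionManager;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpParams;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;

public class CrawlerSketch {

    // One client for the whole crawler, backed by a thread-safe connection manager.
    static HttpClient newSharedClient() {
        HttpParams params = new BasicHttpParams();
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
        ClientConnectionManager cm = new ThreadSafeClientConnManager(params, registry);
        return new DefaultHttpClient(cm, params);
    }

    // One context per crawled site; its cookie store overrides the client's default store.
    static HttpContext newSiteContext() {
        CookieStore cookies = new BasicCookieStore();
        HttpContext context = new BasicHttpContext();
        context.setAttribute(ClientContext.COOKIE_STORE, cookies);
        return context;
    }

    // Downloads the pages of one site sequentially, reusing the same context
    // so cookies set by an earlier response are sent with later requests.
    static class SiteCrawler implements Runnable {
        private final HttpClient client;
        private final String[] urls;
        private final HttpContext context = newSiteContext();

        SiteCrawler(HttpClient client, String... urls) {
            this.client = client;
            this.urls = urls;
        }

        public void run() {
            for (String url : urls) {
                HttpGet get = new HttpGet(url);
                try {
                    HttpResponse response = client.execute(get, context);
                    HttpEntity entity = response.getEntity();
                    if (entity != null) {
                        // Reading the entity to the end releases the connection to the pool.
                        String body = EntityUtils.toString(entity);
                        System.out.println(url + " -> " + body.length() + " chars");
                    }
                } catch (Exception e) {
                    get.abort(); // make sure the connection is not left checked out
                    System.err.println(url + " failed: " + e);
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = newSharedClient();
        // Page paths are made up for the example.
        Thread b = new Thread(new SiteCrawler(client, "http://www.b.com/", "http://www.b.com/page2"));
        Thread c = new Thread(new SiteCrawler(client, "http://www.c.com/", "http://www.c.com/page2"));
        b.start();
        c.start();
        b.join();
        c.join();
        client.getConnectionManager().shutdown();
    }
}
```

With this layout there is no need to read the cookie store back after
execute() and inject it again: the store attached to the context is updated
in place by each response and consulted for each subsequent request made
with that context.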

