Ah yes, that makes sense.

In my scenario I'm using HttpClient as the client side of a reverse
proxy, so I can't use a single context per server: multiple users
access the backend servers simultaneously, and with one shared context
their cookies would get mixed up.
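
A minimal sketch of the per-user bookkeeping this implies, offered as an
illustration rather than a tested proxy design. It uses the JDK's
java.net.CookieManager and java.net.CookieStore as stand-ins so that it
runs without HttpClient on the classpath. The names PerUserCookieStores,
storeFor and forget, the user ids, and the backend.example URL are all
hypothetical.

```java
import java.net.CookieManager;
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical helper: keeps one cookie store per proxy user so that
 * concurrent users never see each other's cookies.
 * Assumption: java.net.CookieStore stands in for HttpClient's CookieStore.
 */
class PerUserCookieStores {
    private final ConcurrentMap<String, CookieStore> stores =
            new ConcurrentHashMap<String, CookieStore>();

    /** Returns the store for this user, creating an empty one on first use. */
    public CookieStore storeFor(String userId) {
        return stores.computeIfAbsent(userId,
                id -> new CookieManager().getCookieStore());
    }

    /** Drops a user's cookies, e.g. when their proxy session ends. */
    public void forget(String userId) {
        stores.remove(userId);
    }

    public static void main(String[] args) {
        PerUserCookieStores registry = new PerUserCookieStores();
        URI backend = URI.create("http://backend.example/");

        // Two users talk to the same backend but keep separate cookies.
        registry.storeFor("alice").add(backend, new HttpCookie("JSESSIONID", "a1"));
        registry.storeFor("bob").add(backend, new HttpCookie("JSESSIONID", "b2"));

        System.out.println("alice: " + registry.storeFor("alice").get(backend));
        System.out.println("bob:   " + registry.storeFor("bob").get(backend));
    }
}
```

With HttpClient 4.0 the same lookup would presumably hand back a
BasicCookieStore per user, bound to a fresh BasicHttpContext through
ClientContext.COOKIE_STORE before each execute() call, as in Ken's
snippet below.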

Thanks,

Sam


2010/1/27 Ken Krugler <[email protected]>:
> You can create a local context and use that for all requests to the same
> server. This then lets you re-use the same HttpClient, which is how you want
> to handle this (versus creating new instances for each domain).
>
> For example, in Bixo's SimpleHttpFetcher there's this code:
>
>            getter = new HttpGet(new URI(url));
>
>            // Create a local instance of cookie store, and bind to local
>            // context. Without this we get killed w/lots of threads, due to
>            // sync() on single cookie store.
>            HttpContext localContext = new BasicHttpContext();
>            CookieStore cookieStore = new BasicCookieStore();
>            localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
>            response = _httpClient.execute(getter, localContext);
>
> The call to execute the GET request uses the localContext, which is what I
> think Jens wants.
>
> -- Ken
>
>
> On Jan 27, 2010, at 3:22pm, Sam Crawford wrote:
>
>> I could well be mistaken, but my experience suggests that with version
>> 4.0 you need a new HttpClient each time you deal with a different set
>> of cookies. Creating multiple HttpContexts used across a single
>> DefaultHttpClient instance did not seem to be sufficient.
>>
>> That said, I only tried this briefly and didn't spend a huge amount of
>> time investigating it. I keep meaning to do so and to submit a bug if
>> I find a genuinely reproducible issue.
>>
>> Thanks,
>>
>> Sam
>>
>>
>> 2010/1/27 Jens Mueller <[email protected]>:
>>>
>>> Hello HC Experts,
>>>
>>> I would be very grateful for advice regarding my question. I have already
>>> spent a lot of time searching the internet, but I still have not found an
>>> example that answers my questions. There are a lot of examples available
>>> (also for the multithreaded use cases), but they only address the use case
>>> of making one(!!) request. I am completely uncertain how to "best" make a
>>> series of requests (to the same webserver).
>>>
>>> I need to develop a simple crawler that crawls some websites for specific
>>> information. The basic idea is to download the individual webpages of a
>>> website (for example www.a.com) sequentially, but to run several of these
>>> "sequential" downloaders in parallel threads for different websites
>>> (www.b.com and www.c.com).
>>>
>>> My current concept/implementation looks like this:
>>>
>>> 1.  Instantiate a ThreadSafeClientConnManager (with a lot of default
>>> parameters). This connection manager is shared by all DefaultHttpClient
>>> instances.
>>> 2.  For every(!!) webpage request (a website has multiple webpages), I
>>> instantiate a new DefaultHttpClient and then call
>>> "httpClient.execute(httpGet)" with a newly instantiated HttpGet(url).
>>>
>>> ==> I am wondering more and more whether this is the correct usage of
>>> DefaultHttpClient and its .execute() method. Am I doing something wrong
>>> by instantiating a new DefaultHttpClient for every webpage request? Or
>>> should I instantiate only one(!!) DefaultHttpClient and share it across
>>> the sequential .execute() calls?
>>>
>>> To be honest, I also have not really understood cookie management yet.
>>> Do I, as the programmer, have to instantiate the CookieStore manually
>>> 1. httpClient.setCookieStore(new BasicCookieStore());
>>> and then, after calling the .execute() method, "get" the cookie store
>>> 2. savedcookies = httpClient.getCookieStore()
>>> and then re-inject this cookie store for the next call to the same
>>> webpage (to maintain state)?
>>> 3. httpClient.setCookieStore(savedcookies)
>>> Or is there some implicit magic that A) creates the cookie store
>>> implicitly and B) somehow shares this CookieStore among the HttpClients
>>> and/or HttpGets?
>>>
>>> Thank you very much!!
>>> Jens
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
>
