On Sat, 2010-01-30 at 23:05 +0100, Jens Mueller
[email protected] wrote:
> Hello Oleg, hello Ken, hello Sam,
> 
> thank you very much for your help!
> 
> Please allow me to ask one further question. Suppose the DefaultHttpClient
> is used on a "per-website basis" (that is, I create a new instance of
> DefaultHttpClient for downloading a specific website (www.a.com) and then
> create a new DefaultHttpClient for a second website (www.b.com)), and each
> DefaultHttpClient is used with the ThreadSafeClientConnManager. Do I have to
> explicitly shut down the DefaultHttpClient somehow? (The JavaDoc states
> that when the DefaultHttpClient is used with NO explicitly set connection
> manager, getConnectionManager().shutdown() should be called, as it
> implicitly creates a SingleClientConnManager.) But is my assumption correct
> that when I use the TSCCM (with the DefaultHttpClient) I do not have to do
> anything at all to avoid leaking resources (when I no longer require
> the DefaultHttpClient instance)? It seems that HttpClient is a very
> heavyweight object, and maybe there are other resources I have to manually
> "free/shutdown"?

You ought to be using a single instance of DefaultHttpClient /
ThreadSafeClientConnManager per distinct HTTP service (say, one for web
service communication, one for web crawling, and so on). There is no
reason for creating it for each and every target host.
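
As a rough illustration of the "one client per service" advice, here is a
minimal sketch assuming the HttpClient 4.0 API. The class name CrawlerHttp
and its methods are made up for this example. The shutdown() call is an
addition that is not taken from the reply above: it reflects the usual
practice of shutting down the connection manager once the client is no
longer needed, so that pooled connections are released.

```java
import org.apache.http.conn.ClientConnectionManager;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpParams;

// Hypothetical holder: one shared client for all crawling, regardless of
// how many target hosts are visited.
final class CrawlerHttp {

    private static final DefaultHttpClient CLIENT = create();

    private CrawlerHttp() {
    }

    private static DefaultHttpClient create() {
        HttpParams params = new BasicHttpParams();

        // Only plain HTTP is registered here; an "https" scheme would be
        // registered the same way if needed.
        SchemeRegistry schemes = new SchemeRegistry();
        schemes.register(
                new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));

        // The pooling connection manager is what makes it safe to share
        // the client between threads.
        ClientConnectionManager cm =
                new ThreadSafeClientConnManager(params, schemes);
        return new DefaultHttpClient(cm, params);
    }

    // Every worker thread uses this same instance.
    static DefaultHttpClient client() {
        return CLIENT;
    }

    // Call once, when the application no longer needs the client.
    static void shutdown() {
        CLIENT.getConnectionManager().shutdown();
    }
}
```

Worker threads would call CrawlerHttp.client().execute(request, context)
with their own HttpContext, and the application would call
CrawlerHttp.shutdown() once on exit.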


> 
> I very much appreciate your help, and I have started to refactor my
> application. However, I then realized that I need a dedicated UserAgent
> for every website I crawl. Using a shared DefaultHttpClient (one instance
> for the whole application) with dedicated HttpContexts per website/thread
> doesn't work, as I sadly can't set the UserAgent at the HttpContext level.

This is correct, but you can always add a custom protocol interceptor
that overrides the default User-Agent header based on an attribute of
the actual HTTP context, such as the name of the target host or some
other custom value.

http://hc.apache.org/httpcomponents-client/tutorial/html/fundamentals.html#protocol_interceptors
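
A minimal sketch of such an interceptor, assuming the HttpCore 4.0 API.
The class name PerHostUserAgent, the attribute name "crawler.user-agent"
and the register() helper are invented for this example. It first checks
a custom context attribute and then falls back to a lookup by target host
name.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.http.HttpException;
import org.apache.http.HttpHost;
import org.apache.http.HttpRequest;
import org.apache.http.HttpRequestInterceptor;
import org.apache.http.protocol.ExecutionContext;
import org.apache.http.protocol.HTTP;
import org.apache.http.protocol.HttpContext;

// Hypothetical interceptor: picks the User-Agent per request from the
// HTTP context instead of from the client-wide parameters.
final class PerHostUserAgent implements HttpRequestInterceptor {

    // Made-up attribute name; any unique string would do.
    static final String USER_AGENT_ATTR = "crawler.user-agent";

    private final Map<String, String> agentsByHost =
            new ConcurrentHashMap<String, String>();

    // Associates a User-Agent value with a target host name.
    void register(String hostName, String userAgent) {
        agentsByHost.put(hostName.toLowerCase(Locale.ENGLISH), userAgent);
    }

    public void process(HttpRequest request, HttpContext context)
            throws HttpException, IOException {
        String agent = null;

        // 1. An explicit value set on this context wins.
        Object explicit = context.getAttribute(USER_AGENT_ATTR);
        if (explicit instanceof String) {
            agent = (String) explicit;
        }

        // 2. Otherwise look the value up by the target host of the request.
        if (agent == null) {
            Object target =
                    context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
            if (target instanceof HttpHost) {
                String name = ((HttpHost) target).getHostName();
                agent = agentsByHost.get(name.toLowerCase(Locale.ENGLISH));
            }
        }

        // setHeader replaces any User-Agent header that is already present.
        // With no match, the request is left untouched.
        if (agent != null) {
            request.setHeader(HTTP.USER_AGENT, agent);
        }
    }
}
```

The interceptor would be registered once on the shared client with
httpclient.addRequestInterceptor(new PerHostUserAgent()). Each thread can
then either rely on the host lookup or set USER_AGENT_ATTR on its own
HttpContext before calling execute().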

>  The UserAgent only seems to be settable
> at the HttpClient or HttpMethod level. I don't know: would it be a
> reasonable feature request/suggestion to also allow HttpParams to be set at
> the HttpContext level, which would then take precedence over all other
> (already specified) parameters?

An additional lookup for each and every parameter would have a negative
impact on performance. You should use a custom protocol interceptor to
override parameters that are relevant for your application.

Hope this helps

Oleg

