sebb wrote:
On 09/01/2010, Ken Krugler <[email protected]> wrote:
[In the interest of not hijacking Tony's discussion thread, I'm putting this
into a new email.]


Tony Poppleton wrote:

Hi,
Further to the previous mail, I have already implemented my own
AbstractHttpEntity to eliminate a byte[] copy.  And I have seen the NIO
implementations of HttpEntities, however they don't seem to copy using NIO
methods so they won't be any faster than the standard IO implementations.
Anyway, it seems I have to go a level deeper than this class to be able
to do the NIO copy.  Is this the right direction to be digging in?
Thanks,
Tony

Tony

Contrary to a common misconception, NIO is significantly slower than the
classic blocking I/O in terms of raw data throughput. Modern operating
systems and JVMs have become pretty efficient at switching thread contexts.
Connection multiplexing starts paying off only when the number of concurrent
connections exceeds 2000 or direct data streaming from or to a file is used.
 I agree that NIO is often incorrectly viewed as a panacea for all network
performance issues.

 I did want to mention that there are some multi-threading performance
issues which potentially NIO would avoid, for those who are using HttpClient
with 100s of threads.

 For example, during a Bixo crawl with 300 threads, I was doing regular
thread dumps and inspecting the results. A very high percentage (typically >
1/3) were blocked while waiting to get access to the cookie store. By
default there's only one of these per HttpClient.

 This one was fairly easy to work around, by creating a cookie store in the
local context for each request:

            CookieStore cookieStore = new BasicCookieStore();
            localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);

 But I've run into a few other synchronized method/data bottlenecks, which
I'm still working through. For example, at irregular intervals the bulk of
my fetcher threads are blocked on getting the scheme registry, either:

 "pool-1-thread-9478" prio=10 tid=0x8e9ec400 nid=0x1fb waiting for monitor entry [0x8ee2e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.http.conn.scheme.SchemeRegistry.get(SchemeRegistry.java:106)
        - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
        at org.apache.http.client.protocol.RequestAddCookies.process(RequestAddCookies.java:154)
        at org.apache.http.protocol.BasicHttpProcessor.process(BasicHttpProcessor.java:251)
        at org.apache.http.protocol.HttpRequestExecutor.preProcess(HttpRequestExecutor.java:168)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)

 or

 "pool-1-thread-9470" prio=10 tid=0x8e9e7c00 nid=0x1f1 waiting for monitor entry [0x8d986000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.http.conn.scheme.SchemeRegistry.getScheme(SchemeRegistry.java:71)
        - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
        at org.apache.http.impl.conn.DefaultHttpRoutePlanner.determineRoute(DefaultHttpRoutePlanner.java:111)
        at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:619)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:319)

 If anybody (well, OK, Oleg) has input on things I could be doing wrong to
trigger this type of behavior, and/or ways to avoid it, I'm all ears.

Looks like the code could use ConcurrentHashMap instead of LinkedHashMap?
All the methods could then be unsynchronised.

The only method which would be affected by the change in ordering is
getSchemeNames(). The Javadoc for this is a bit unclear (to me) but
the test case shows that insertion order is not important (so I'm not
sure why LinkedHashMap was used originally).
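
A minimal sketch of that idea, for illustration only. LockFreeRegistry is a
hypothetical stand-in written for this example, not the actual SchemeRegistry
source; the method names and the lower-casing of keys are assumptions. It
shows how a ConcurrentHashMap-backed registry lets lookups proceed without a
shared monitor, at the cost of losing insertion order in the name listing:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical registry keyed by a case-insensitive name.
 * No method is synchronized; thread safety comes from ConcurrentHashMap.
 */
final class LockFreeRegistry<V> {

    private final ConcurrentMap<String, V> entries =
            new ConcurrentHashMap<String, V>();

    /** Registers a value and returns the one it replaced, or null. */
    V register(String name, V value) {
        if (name == null || value == null) {
            // ConcurrentHashMap rejects null keys and values, so fail early.
            throw new IllegalArgumentException("name and value must not be null");
        }
        return entries.put(name.toLowerCase(Locale.ROOT), value);
    }

    /** Looks up a value without taking a registry-wide lock. */
    V get(String name) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        return entries.get(name.toLowerCase(Locale.ROOT));
    }

    /** Removes a value and returns it, or null if none was registered. */
    V unregister(String name) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        return entries.remove(name.toLowerCase(Locale.ROOT));
    }

    /** Returns a snapshot of the registered names; the order is unspecified. */
    List<String> names() {
        return new ArrayList<String>(entries.keySet());
    }
}
```

With this shape, 300 fetcher threads calling get() concurrently would not
queue on one object monitor the way the thread dumps above show. The
trade-off is the one already noted: names() no longer reflects insertion
order.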


Good catch, Sebastian!

I am pretty certain ordering does not matter. I am no longer sure why LinkedHashMap was chosen in the first place.

Would you have time for looking into HTTPCLIENT-903?

Oleg
