[In the interest of not hijacking Tony's discussion thread, I'm
putting this into a new email.]
Tony Poppleton wrote:
Hi,
Further to the previous mail, I have already implemented my own
AbstractHttpEntity to eliminate a byte[] copy. And I have seen the
NIO implementations of HttpEntities, however they don't seem to
copy using NIO methods so they won't be any faster than the
standard IO implementations.
Anyway, it seems I have to go a level deeper than this class to be
able to do the NIO copy. Is this the right direction to be digging
in?
Thanks,
Tony
Contrary to a common misconception, NIO is significantly slower than
the classic blocking I/O in terms of raw data throughput. Modern
operating systems and JVMs have become pretty efficient at switching
thread contexts. Connection multiplexing starts paying off only when
the number of concurrent connections exceeds 2000 or direct data
streaming from or to a file is used.
I agree that NIO is often incorrectly viewed as a panacea for all
network performance issues.
I did want to mention that there are some multi-threading performance
issues that NIO could potentially avoid, for those who are using
HttpClient with hundreds of threads.
For example, during a Bixo crawl with 300 threads, I was taking regular
thread dumps and inspecting the results. A very high percentage of the
threads (typically more than 1/3) were blocked while waiting to get
access to the cookie store. By default there's only one of these per
HttpClient.
This one was fairly easy to work around, by creating a cookie store in
the local context for each request:
    CookieStore cookieStore = new BasicCookieStore();
    localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
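The thread-dump inspection described above can also be approximated
in-process. The following is a minimal sketch using only the standard
java.lang.management API; the class name BlockedThreadSampler, the
DemoLock type and the demo() helper are made up for illustration and
are not part of HttpClient or Bixo. It counts the threads that are
currently BLOCKED on a monitor and groups them by the class of the
lock they are waiting for.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes monitor contention from a live thread dump.
final class BlockedThreadSampler {

    // Hypothetical lock type, used only so the demo has a recognizable lock class.
    static final class DemoLock { }

    // Counts threads in state BLOCKED, keyed by the class name of the contended lock.
    static Map<String, Integer> sampleBlockedByLockClass() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        // false, false: skip the (more expensive) owned-monitor and synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            LockInfo lock = info.getLockInfo();
            String key = (lock != null) ? lock.getClassName() : "<unknown>";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Starts `workers` threads that all contend for one monitor, then samples them.
    static Map<String, Integer> demo(int workers) {
        final DemoLock lock = new DemoLock();
        Thread[] threads = new Thread[workers];
        Map<String, Integer> counts;
        synchronized (lock) {
            for (int i = 0; i < workers; i++) {
                threads[i] = new Thread(() -> {
                    synchronized (lock) {
                        // Nothing to do; the thread only needs to contend for the lock.
                    }
                });
                threads[i].setDaemon(true);
                threads[i].start();
            }
            // Wait until every worker is parked on the monitor we are holding.
            for (Thread t : threads) {
                while (t.getState() != Thread.State.BLOCKED) {
                    Thread.yield();
                }
            }
            counts = sampleBlockedByLockClass();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(demo(4));
    }
}
```

Calling sampleBlockedByLockClass() periodically from a monitoring
thread gives a rough picture of which locks the fetcher threads are
queuing on, without having to read full dumps by hand.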
But I've run into a few other synchronized method/data bottlenecks,
which I'm still working through. For example, at irregular intervals
the bulk of my fetcher threads are blocked on getting the scheme
registry, either:
"pool-1-thread-9478" prio=10 tid=0x8e9ec400 nid=0x1fb waiting for monitor entry [0x8ee2e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.http.conn.scheme.SchemeRegistry.get(SchemeRegistry.java:106)
    - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
    at org.apache.http.client.protocol.RequestAddCookies.process(RequestAddCookies.java:154)
    at org.apache.http.protocol.BasicHttpProcessor.process(BasicHttpProcessor.java:251)
    at org.apache.http.protocol.HttpRequestExecutor.preProcess(HttpRequestExecutor.java:168)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
or
"pool-1-thread-9470" prio=10 tid=0x8e9e7c00 nid=0x1f1 waiting for monitor entry [0x8d986000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.http.conn.scheme.SchemeRegistry.getScheme(SchemeRegistry.java:71)
    - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
    at org.apache.http.impl.conn.DefaultHttpRoutePlanner.determineRoute(DefaultHttpRoutePlanner.java:111)
    at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:619)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:319)
If anybody (well, OK, Oleg) has input on things I could be doing wrong
to trigger this type of behavior, and/or ways to avoid it, I'm all ears.
-- Ken
--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g