I'm debugging a distributed crawler, and it appears
that over time all of my download threads get hung up
in the same way. A jrockit thread dump reveals the
following:

"ThreadPool-Crawler DownloadJob PooledThread-1-running" id=436 idx=0x62 tid=-1569055552 prio=5 alive, in native, daemon
    at jrockit/net/SocketNativeIO.read(IIII)I(Native Method)
    at jrockit/net/SocketNativeIO.read(Ljava/io/FileDescriptor;III)I(Unknown Source)[inlined]
    at java/net/AbstractSocketImpl$1.read(II)I(Unknown Source)[optimized]
    ^-- Holding lock: java/net/[EMAIL PROTECTED] lock]
    at jrockit/io/NativeIOInputStream.read(I[BI)I(Unknown Source)[inlined]
    at jrockit/io/NativeIOInputStream.read([BII)I(Unknown Source)[optimized]
    at java/io/BufferedInputStream.fill()V(BufferedInputStream.java:218)[inlined]
    at java/io/BufferedInputStream.read()I(BufferedInputStream.java:235)[optimized]
    ^-- Holding lock: java/io/[EMAIL PROTECTED] lock]
    at org/apache/commons/httpclient/HttpParser.readRawLine(Ljava/io/InputStream;)[B(HttpParser.java:77)[optimized]
    at org/apache/commons/httpclient/HttpParser.readLine(Ljava/io/InputStream;Ljava/lang/String;)Ljava/lang/String;(HttpParser.java:105)[inlined]
    at org/apache/commons/httpclient/HttpConnection.readLine(Ljava/lang/String;)Ljava/lang/String;(HttpConnection.java:1113)[optimized]
    at org/apache/commons/httpclient/MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(Ljava/lang/String;)Ljava/lang/String;(MultiThreadedHttpConnectionManager.java:1373)[optimized]
    at org/apache/commons/httpclient/HttpMethodBase.readStatusLine(Lorg/apache/commons/httpclient/HttpState;Lorg/apache/commons/httpclient/HttpConnection;)V(HttpMethodBase.java:1832)[optimized]
    at org/apache/commons/httpclient/HttpMethodBase.readResponse(Lorg/apache/commons/httpclient/HttpState;Lorg/apache/commons/httpclient/HttpConnection;)V(HttpMethodBase.java:1590)[inlined]
    at org/apache/commons/httpclient/HttpMethodBase.execute(Lorg/apache/commons/httpclient/HttpState;Lorg/apache/commons/httpclient/HttpConnection;)I(HttpMethodBase.java:995)[optimized]
    at org/apache/commons/httpclient/HttpMethodDirector.executeWithRetry(Lorg/apache/commons/httpclient/HttpMethod;)V(HttpMethodDirector.java:395)[optimized]
    at org/apache/commons/httpclient/HttpMethodDirector.executeMethod(Lorg/apache/commons/httpclient/HttpMethod;)V(HttpMethodDirector.java:170)[optimized]
    at org/apache/commons/httpclient/HttpClient.executeMethod(Lorg/apache/commons/httpclient/HostConfiguration;Lorg/apache/commons/httpclient/HttpMethod;Lorg/apache/commons/httpclient/HttpState;)I(HttpClient.java:396)[optimized]
    at org/apache/commons/httpclient/HttpClient.executeMethod(Lorg/apache/commons/httpclient/HttpMethod;)I(HttpClient.java:324)[inlined]
    at crawler/util/FetcherUtil.getContentAsString(Ljava/lang/String;)Ljava/lang/String;(FetcherUtil.java:68)[optimized]
    at crawler/fetch/Downloadable.run()V(Downloadable.java:44)[optimized]
    at services/threadpool/ThreadPoolThread.run()V(ThreadPoolThread.java:83)
    at jrockit/vm/RNI.c2java()V(Native Method)

Sorry about the formatting.
My socket timeout is set to 4 seconds (4000 ms), so I
don't understand why these reads block for many hours
and never return.
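
For reference, the behaviour expected from a socket read timeout can be sketched with plain java.net sockets, without HttpClient. The class name SoTimeoutDemo and the method readTimesOut are made up for this illustration, and the local server that accepts a connection but never sends anything is only an assumption about what a stalled remote host looks like; it says nothing about what JRockit's native read actually does in the dump above.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {

    // Connects to a local server that never writes a byte and reports
    // whether the blocking read gave up with SocketTimeoutException.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            // SO_TIMEOUT bounds each individual read() call, not the whole request.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read();
                return false; // data or EOF arrived, which this server never sends
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned after roughly timeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

If a check like this times out as expected on the same JVM, the timeout mechanism itself works, and the question becomes whether the value set on the connection manager params actually reaches the sockets used by the hung threads.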
Here's my HttpClient initialization code:
static {
    HttpConnectionManagerParams connMgrParams =
        new HttpConnectionManagerParams();
    // Both timeouts are in milliseconds.
    connMgrParams.setConnectionTimeout(4000);
    connMgrParams.setSoTimeout(4000);
    // At most one connection per host, 300 connections overall.
    connMgrParams.setMaxConnectionsPerHost(
        HostConfiguration.ANY_HOST_CONFIGURATION, 1);
    connMgrParams.setMaxTotalConnections(300);
    MultiThreadedHttpConnectionManager connMgr =
        new MultiThreadedHttpConnectionManager();
    connMgr.setParams(connMgrParams);
    httpClient = new HttpClient(connMgr);
}
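
Since the dump shows a read that outlives the configured timeout, one possible guard that does not depend on SO_TIMEOUT at all is a watchdog that closes the socket after a hard deadline. The sketch below uses plain java.net sockets rather than HttpClient, and the names ReadWatchdogDemo and readAbortedByWatchdog are hypothetical. It relies on the documented behaviour of Socket.close(): a thread blocked in I/O on that socket gets a SocketException.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ReadWatchdogDemo {

    // Blocks in read() with no SO_TIMEOUT set and reports whether a
    // watchdog thread unblocked it by closing the socket at the deadline.
    static boolean readAbortedByWatchdog(long deadlineMillis) throws Exception {
        ScheduledExecutorService watchdog =
            Executors.newSingleThreadScheduledExecutor();
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            // Schedule the hard deadline: closing the socket from another
            // thread makes the blocked read fail with SocketException.
            watchdog.schedule(() -> {
                try {
                    client.close();
                } catch (IOException ignored) {
                    // nothing useful to do if close itself fails
                }
            }, deadlineMillis, TimeUnit.MILLISECONDS);
            try {
                client.getInputStream().read(); // the server never writes
                return false;
            } catch (SocketException e) {
                return true; // unblocked by the watchdog's close()
            }
        } finally {
            watchdog.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("aborted by watchdog: " + readAbortedByWatchdog(200));
    }
}
```

This bounds the total time a download thread can spend on one socket, at the cost of an extra scheduler thread. Whether it helps with a read stuck inside JRockit's native layer is untested here.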
I also noticed in the JRockit memory leak tool that
even though I set ignoreCookies to true before
every call, I still accumulate thousands of them. Is
this normal?
Cheers,
George