Hi Schmed,
Thanks for posting this. One quick note - I never use [EMAIL PROTECTED]
when posting, to keep it spam free, so either a bcc or
[EMAIL PROTECTED] in the cc field would be better.
-- Ken
My crawl died with an InterruptedException the other day, and I'm
wondering whether any of you have run into the same problem.
Reviewing the code, it seems like the SocketThread spawned by
org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory.createSocket()
and executed via the
org.apache.commons.httpclient.util.TimeoutController.execute()
method ought to be overriding the Thread.interrupt() method [at
least according to the documentation of execute()].
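For reference, the behaviour that the execute() documentation describes can be sketched roughly as follows. This is a simplified stand-in written against plain java.lang.Thread, not the actual HttpClient source; the class name TimeoutSketch and the method runWithTimeout are made up for illustration.

```java
public class TimeoutSketch {

    /**
     * Starts the task on its own thread, waits up to timeoutMillis
     * (must be greater than 0, since join(0) waits forever) for it to
     * finish, and interrupts it if it is still running.
     * Returns true if the task finished in time, false otherwise.
     */
    public static boolean runWithTimeout(Runnable task, long timeoutMillis) {
        Thread worker = new Thread(task);
        worker.setDaemon(true);
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            // The caller itself was interrupted; restore the flag and fall through.
            Thread.currentThread().interrupt();
        }
        if (worker.isAlive()) {
            // interrupt() is only a request: the task has to notice it,
            // either via a blocking call that throws or via isInterrupted().
            worker.interrupt();
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Runnable quick = () -> { };
        Runnable slow = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // The interrupt surfaces here, inside the blocking call.
            }
        };
        System.out.println(runWithTimeout(quick, 1000));
        System.out.println(runWithTimeout(slow, 100));
    }
}
```

Running main prints true for the quick task and false for the slow one. The point of the sketch is that the interrupt lands wherever the task happens to be blocked at that moment, which need not be code the task's author wrote.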
I'm guessing that this SocketThread didn't terminate within the
timeout value (I should probably research what this might have been
set to), so execute() called Task.interrupt(). SocketThread doesn't
override interrupt(), and when the InterruptedException was thrown,
the thread was in
sun.security.provider.SecureRandom.engineNextBytes() (at line 168 of
SecureRandom.java - wish I had the source). I'm guessing that at
line 168 engineNextBytes() blocks in something like wait() (simply
entering a synchronized block would not throw), so the
Thread.interrupt() results in an InterruptedException being thrown
within SocketThread. Obviously nobody is catching
InterruptedException in the stack trace below:
051111 182157 33 SEVERE null
java.lang.InterruptedException
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:168)
at java.security.SecureRandom.nextBytes(SecureRandom.java:381)
at java.security.SecureRandom.next(SecureRandom.java:403)
at java.util.Random.nextInt(Random.java:191)
at com.sun.net.ssl.internal.ssl.SSLContextImpl.engineInit(DashoA12275)
at javax.net.ssl.SSLContext.init(DashoA12275)
at com.sun.net.ssl.SSLContextSpiWrapper.engineInit(DashoA12275)
at com.sun.net.ssl.SSLContext.init(DashoA12275)
at org.apache.nutch.protocol.httpclient.DummySSLProtocolSocketFactory.createEasySSLContext(DummySSLProtocolSocketFactory.java:45)
at org.apache.nutch.protocol.httpclient.DummySSLProtocolSocketFactory.getSSLContext(DummySSLProtocolSocketFactory.java:55)
at org.apache.nutch.protocol.httpclient.DummySSLProtocolSocketFactory.createSocket(DummySSLProtocolSocketFactory.java:66)
at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:90)
at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:157)
at java.lang.Thread.run(Thread.java:534)
051111 182158 10 SEVERE Fetcher encountered a fatal error
051111 182158 10 SEVERE SEVERE error logged. Exiting fetcher.
051111 182158 10 SEVERE org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:437)
051111 182158 10 SEVERE org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:596)
051111 182158 10 SEVERE net.krugle.nutchdriver.NutchDriver.main(NutchDriver.java:232)
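The mechanism guessed at above (an interrupt arriving while a thread is blocked in wait()) is easy to reproduce in isolation. The following is only a minimal sketch using a private lock object; the class InterruptDemo and its method are hypothetical and have nothing to do with the real SecureRandom internals, whose source was not available.

```java
public class InterruptDemo {

    /**
     * Parks a thread in wait() on a private lock, interrupts it,
     * and reports what the blocked thread observed.
     */
    public static String blockAndInterrupt() {
        final Object lock = new Object();
        final String[] outcome = { "not interrupted" };

        Thread blocked = new Thread(() -> {
            synchronized (lock) {
                try {
                    // Loop guards against spurious wakeups; nobody ever
                    // calls notify(), so only an interrupt ends the wait.
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    outcome[0] = "InterruptedException in wait()";
                }
            }
        });

        blocked.start();
        blocked.interrupt();   // if the flag is set before wait(), wait() throws at once
        try {
            blocked.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(blockAndInterrupt());
    }
}
```

This prints "InterruptedException in wait()". Whether the exception is then handled quietly or logged as a fatal error depends entirely on the code further up the stack.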
Searching the web, it seems like several others have complained
about SecureRandom.java being unable to consistently return
pseudo-random integers in a timely fashion. Several of these people
suggest setting securerandom.source=file:/dev/urandom instead of
file:/dev/random. Here's an example:
http://jira.atlassian.com/browse/CONF-2848
Do any Nutch users have experience using file:/dev/urandom?
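For anyone who wants to check what their own JVM is configured with before changing anything, here is a small sketch. It reads the securerandom.source security property mentioned above and the java.security.egd system property, which can be used to override it on the command line, then times the first nextBytes() call, since that is where seeding takes place. The class name EntropyCheck and both method names are made up for illustration.

```java
import java.security.SecureRandom;
import java.security.Security;

public class EntropyCheck {

    /** Returns the configured seed source, preferring the command-line override. */
    public static String configuredSource() {
        String override = System.getProperty("java.security.egd");
        if (override != null) {
            return override;
        }
        String fromSecurityFile = Security.getProperty("securerandom.source");
        return fromSecurityFile != null ? fromSecurityFile : "(not set)";
    }

    /** Times the first nextBytes() call on a fresh SecureRandom, in milliseconds. */
    public static long timeFirstNextBytes() {
        long start = System.nanoTime();
        new SecureRandom().nextBytes(new byte[20]);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("seed source: " + configuredSource());
        System.out.println("first nextBytes() took " + timeFirstNextBytes() + " ms");
    }
}
```

Comparing a plain run against one started with java -Djava.security.egd=file:/dev/urandom EntropyCheck should show whether seeding is what stalls on a given machine. The numbers will vary with the JVM version and with how much entropy the system has available.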
Thanks,
- Chris
--
------------------------
Chris Schneider
TransPac Software, Inc.
[EMAIL PROTECTED]
------------------------
--
Ken Krugler
Krugle, Inc.
+1 530-470-9200