Ken Krugler wrote:
We're only using the html & text parsers, so I don't think that's the
problem. Plus we're dumping the thread stack when it hangs, and it's always
in the ChunkedInputStream.exhaustInputStream() process (see trace below).
The trace did not make it.
Have you tried protocol-http instead of protocol-httpclient? Is it any
better? What JVM are you running? I get fewer socket hangs in 1.5 than
1.4.
Also, the mapred fetcher has been changed to succeed even when threads
hang. Perhaps we should change the 0.7 fetcher similarly? I think we
should probably go even further, and kill threads which take longer than
a timeout to process a URL. Thread.stop() is theoretically unsafe, but
I've used it in the past for this sort of thing and never traced
subsequent problems back to it...
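
To make the timeout idea concrete, here is a rough sketch of one way to bound the time spent on a single URL without calling Thread.stop(). It is not the fetcher's actual code: the class and method names (FetchTimeoutSketch, newDaemonPool, runWithTimeout) are made up for illustration, and it uses java.util.concurrent plus lambda syntax, so it needs a newer JVM than the ones discussed above. The worker threads are daemons, so a thread that stays hung should not keep the process from finishing.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not part of Nutch: bound the time spent on one task.
class FetchTimeoutSketch {

    // Pool of daemon threads, so a worker that never returns cannot block JVM exit.
    static ExecutorService newDaemonPool(int threads) {
        return Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
    }

    // Returns true if the task finished within timeoutMs, false otherwise.
    static boolean runWithTimeout(ExecutorService pool, Runnable task, long timeoutMs) {
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up on this task; cancel(true) interrupts the worker thread.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The task itself threw an exception.
            return false;
        } catch (InterruptedException e) {
            // The caller was interrupted while waiting; restore the flag and give up.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = newDaemonPool(2);
        // Stand-in for processing one URL; a real task would do the fetch here.
        Runnable slowTask = () -> {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean finished = runWithTimeout(pool, slowTask, 200);
        System.out.println("finished in time: " + finished);
        pool.shutdownNow();
    }
}
```

One caveat: an interrupt does not wake a thread blocked in a plain java.io socket read, so in the hang described above the worker would probably stay stuck until the socket is closed from another thread or a socket read timeout fires. The sketch only lets the caller move on; it does not reclaim the hung thread the way Thread.stop() would.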
Doug