Ken Krugler wrote:
We're only using the html & text parsers, so I don't think that's the problem. Plus we're dumping the thread stack when it hangs, and it's always in the ChunkedInputStream.exhaustInputStream() process (see trace below).

The trace did not make it.

Have you tried protocol-http instead of protocol-httpclient? Is it any better? What JVM are you running? I get fewer socket hangs in 1.5 than 1.4.

Also, the mapred fetcher has been changed to succeed even when threads hang. Perhaps we should change the 0.7 fetcher similarly? I think we should probably go even further, and kill threads which take longer than a timeout to process a URL. Thread.stop() is theoretically unsafe, but I've used it in the past for this sort of thing and never traced subsequent problems back to it...
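
To make the timeout idea concrete, here is a rough sketch. It is not the fetcher's actual code, and it swaps in a different technique from Thread.stop(): the work runs on a worker thread, the caller waits at most a fixed time, and on timeout the worker is interrupted and abandoned. The class name FetchTimeoutSketch and the method runWithTimeout are hypothetical, made up for illustration. It assumes java.util.concurrent (JVM 1.5 or later) and, for the lambdas, Java 8.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of any existing fetcher.
class FetchTimeoutSketch {

    // Runs one task on a daemon worker thread.
    // Returns the task's result, or null if it did not finish within the limit.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fetch-worker");
            t.setDaemon(true); // a hung worker must not keep the JVM alive
            return t;
        });
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests an interrupt; this does not force-stop the thread.
            future.cancel(true);
            return null;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A task that finishes in time returns its value.
        System.out.println(runWithTimeout(() -> "fetched", 2000));
        // A task that overruns the limit is abandoned and yields null.
        System.out.println(runWithTimeout(() -> {
            Thread.sleep(5000);
            return "too late";
        }, 100));
    }
}
```

One caveat: an interrupt typically does not unblock a thread stuck in a blocking socket read, so in that case the worker stays hung and only the caller moves on. That is closer to "succeed even when threads hang" than to really killing the thread.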

Doug
