Hi, I've observed an interesting phenomenon that is not hard to reproduce and that I think should not be happening:
With N fetcher threads configured, inject, say, 2xN URLs of VERY large files plus a few smaller files, then run something that uses org.apache.nutch.crawl.Crawl. The big files will take forever to download, so the fetcher threads will be killed and the process will proceed to the indexing stage. However, you will then see fetcher thread output in the logs, intermixed with the output of the indexer. This suggests that the threads were not terminated properly (or at all?).

Regards,
Arkadi
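
For illustration, the suspected behaviour can be sketched outside Nutch. The following is a minimal standalone Java example, not Nutch code: the class and method names (AbandonedWorkerDemo, startSlowFetcher, fetcherAliveAtIndexing) and the timings are hypothetical, and it assumes only that the caller stops waiting for a slow thread after a timeout without actually stopping it. In that case the thread is still alive when the next stage starts; if it is interrupted and joined first, it is not.

```java
import java.util.concurrent.TimeUnit;

public class AbandonedWorkerDemo {

    /** Starts a thread that simulates a fetcher stuck on a very large download. */
    public static Thread startSlowFetcher() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    // Stand-in for reading the next chunk of a huge file.
                    TimeUnit.MILLISECONDS.sleep(20);
                }
            } catch (InterruptedException e) {
                // Interrupted: stop "fetching" and let the thread end.
            }
        }, "hypothetical-fetcher");
        // Daemon thread, so this demo JVM can exit even when the thread is abandoned.
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Waits for the "fetch" phase up to a timeout, optionally stops the
     * fetcher, and reports whether it is still alive at the point where
     * the next ("indexing") phase would begin.
     */
    public static boolean fetcherAliveAtIndexing(boolean stopProperly) {
        Thread fetcher = startSlowFetcher();
        try {
            fetcher.join(100);          // hypothetical fetch timeout
            if (stopProperly) {
                fetcher.interrupt();    // ask the thread to stop
                fetcher.join(1000);     // and wait until it has
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
        return fetcher.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("abandoned: alive at indexing = " + fetcherAliveAtIndexing(false));
        System.out.println("interrupted: alive at indexing = " + fetcherAliveAtIndexing(true));
    }
}
```

Whether the Nutch fetcher actually abandons its threads this way is exactly the open question above; the sketch only shows that a timed-out thread which is never stopped would keep running, and logging, into the next stage.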

