It would be nice to have a brief summary comparing the web crawling features of LCF with those of Nutch. I don't know the details of Nutch beyond a quick read of the tutorial, but I am wondering whether there are any Nutch web crawling features that are not available in the LCF web crawl connector.
A second question is whether Nutch has any performance or volume advantage over LCF for web crawling, at least roughly speaking; specific performance tests for LCF would eventually be good to have. I would envision people using LCF to crawl a chosen set of web sites rather than the whole web, but the number of sites to be crawled could still be moderately large. At some point we should publish some rough guidelines on the amount of web crawling LCF is intended to support. (Answers could go in the LCF FAQ.) Thanks. -- Jack Krupansky
