Hi everyone,

Is it possible to have the generator produce multiple fetchlists, split by host or by some other user-specified criterion (such as a regex)? When a single large fetchlist is generated, the fetcher runs for too long. It would be nice if the URLs could be placed in separate fetchlists according to some criterion. That would make large crawls easier to analyze, since I would not have to wait for the entire fetch job to finish.
I was reading the documentation at http://wiki.apache.org/nutch/bin/nutch%20generate. The numFetchers and maxNumSegments parameters do talk about generating multiple fetch partitions and segments, and generate.max.count and generate.count.mode allow some configuration. However, I could not tell from this whether it is possible to generate multiple fetchlists. I am currently running in local mode.

Thank you.

Regards,
Sujen Shah
M.S - Computer Science (Class of 2016)
University of Southern California
http://www.linkedin.com/in/sujenshah

