Hi All,

I'm trying to limit bin/nutch generate to 1 million unfetched URLs.
Is it possible to do this with Nutch?
What's the best way to limit the generate command without using the -topN option?


In my crawldb, I have 55 million unfetched URLs... The generated segment
is too big to fetch.

Can you help me please?

Thanks for your help,

Max.

Server description:
cpu: Q6600
ram: 4 GB
disk: 750 GB
bandwidth: 100 Mbps
