Re: Dump all urls from merged index

Markus Jelsma Tue, 07 Jun 2011 13:32:13 -0700

Well, you can dump the crawldb using the bin/nutch readdb command. You'd still 
need to parse the output yourself to get a decent list of URLs.
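
For the parsing step, here is a rough sketch in Python. It assumes the dump
was written as plain text with something like
"bin/nutch readdb crawl/crawldb -dump dump_dir" (both paths are placeholders),
and that each record begins with a line of the form "<url><TAB>Version: <n>".
That layout may differ between Nutch versions, so check a dump file before
relying on the pattern. The function names are illustrative only.

```python
import re

# Assumed record header in the text dump: "<url>\tVersion: <n>".
# Verify this against your own dump output; it is not guaranteed.
RECORD_START = re.compile(r"^(\S+)\tVersion: \d+")


def extract_urls(lines):
    """Yield the URL from every line that looks like a record header."""
    for line in lines:
        match = RECORD_START.match(line)
        if match:
            yield match.group(1)


def dump_to_seed_list(dump_path, seed_path):
    """Write one URL per line, suitable as a seed file for a new crawldb."""
    with open(dump_path, encoding="utf-8", errors="replace") as dump, \
            open(seed_path, "w", encoding="utf-8") as seeds:
        for url in extract_urls(dump):
            seeds.write(url + "\n")
```

Calling dump_to_seed_list("dump_dir/part-00000", "urls/seed.txt") would then
give a file that can be passed to the injector. Note that this lists every URL
the crawldb knows about, fetched or not; filter on the status line if you only
want fetched pages.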


> Hi guys,
> 
> I was wondering if there is a quick method to dump all urls of a merged
> index (ie a production index).
> I want to use them for a 'fresh' seeding of a new crawldb.
