Awesome! Thanks.

On Tue, Jul 28, 2009 at 12:26 PM, reinhard schwab <reinhard.sch...@aon.at> wrote:
> yes, there are tools which you can use to dump the content of crawl db,
> link db and segments.
>
> dump=./crawl/dump
> bin/nutch readdb $crawl/crawldb -dump $dump/crawldb
> bin/nutch readlinkdb $crawl/linkdb -dump $dump/linkdb
> bin/nutch readseg -dump $1 $dump/segments/$1
>
> you will get more info if you call
>
> bin/nutch readdb
> bin/nutch readlinkdb
> bin/nutch readseg
>
> Paul Tomblin wrote:
> > The nutch data files are pretty opaque, and even "strings" can't extract
> > anything except the occasional URL. Is there any code to dump the contents
> > of the various files in a human readable form?

--
http://www.linkedin.com/in/paultomblin
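
For anyone finding this thread later: the quoted snippet uses $crawl without setting it and takes the segment as $1, so here is a rough sketch that wraps the same three commands in one script. The variable names, the defaults, the DRY_RUN switch and the assumed directory layout (crawldb/, linkdb/ and segments/<name>/ under one crawl directory) are my own assumptions, not something from the thread. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: wraps the three dump commands quoted above.
# Assumed (not from the thread): the crawl directory contains
# crawldb/, linkdb/ and segments/<name>/, and the script is run
# from the Nutch install directory so that bin/nutch resolves.

crawl=${CRAWL:-./crawl}        # crawl directory (hypothetical default)
dump=${DUMP:-$crawl/dump}      # where the text dumps are written
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

dump_all() {
    run bin/nutch readdb "$crawl/crawldb" -dump "$dump/crawldb"
    run bin/nutch readlinkdb "$crawl/linkdb" -dump "$dump/linkdb"
    # One readseg call per segment directory, if any exist.
    for seg in "$crawl"/segments/*; do
        [ -d "$seg" ] || continue
        run bin/nutch readseg -dump "$seg" "$dump/segments/$(basename "$seg")"
    done
}

dump_all
```

Setting DRY_RUN=0 in the environment makes it actually call bin/nutch; CRAWL and DUMP override the two directories.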