Awesome!  Thanks.
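
For anyone finding this in the archive: here is a minimal sketch of how the commands quoted below might be wrapped to dump the crawldb, the linkdb and every segment in one go. The function name dump_crawl and the NUTCH variable are my own illustrative names, not part of Nutch. It also assumes the usual layout where segments live under <crawl>/segments/<timestamp>.

```shell
#!/bin/sh
# Sketch only: dump_crawl and NUTCH are hypothetical names, not Nutch features.

# Command used to invoke Nutch. Set NUTCH="echo bin/nutch" to preview the
# commands without running them.
NUTCH=${NUTCH:-bin/nutch}

dump_crawl() {
    crawl=$1                    # crawl directory, e.g. ./crawl
    dump=${2:-$crawl/dump}      # output directory for the text dumps

    # $NUTCH is left unquoted on purpose so "echo bin/nutch" splits into words.
    $NUTCH readdb "$crawl/crawldb" -dump "$dump/crawldb"
    $NUTCH readlinkdb "$crawl/linkdb" -dump "$dump/linkdb"

    # Assumption: each segment is a directory under $crawl/segments.
    for seg in "$crawl"/segments/*; do
        [ -d "$seg" ] || continue
        $NUTCH readseg -dump "$seg" "$dump/segments/$(basename "$seg")"
    done
}
```

Calling it as `dump_crawl ./crawl` from the Nutch install directory should leave the dumps under ./crawl/dump. Each segment gets its own output directory named after the segment.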

On Tue, Jul 28, 2009 at 12:26 PM, reinhard schwab <reinhard.sch...@aon.at> wrote:

> yes, there are tools which you can use to dump the content of crawl db,
> link db and segments.
>
> crawl=./crawl
> dump=$crawl/dump
> bin/nutch readdb $crawl/crawldb -dump $dump/crawldb
> bin/nutch readlinkdb $crawl/linkdb -dump $dump/linkdb
> # $1 is the segment, passed as the first argument to this script
> bin/nutch readseg -dump $1 $dump/segments/$1
>
> you will get more info (the usage message with all options) if you call
> each tool without arguments:
>
> bin/nutch readdb
> bin/nutch readlinkdb
> bin/nutch readseg
>
> Paul Tomblin wrote:
> > The nutch data files are pretty opaque, and even "strings" can't extract
> > anything except the occasional URL.  Is there any code to dump the contents
> > of the various files in a human readable form?
> >
> >
>
>


-- 
http://www.linkedin.com/in/paultomblin
