Hi,
I am trying to dump my linkdb content for analysis using the following
command:
bin/nutch readlinkdb crawl/linkdb -dump readlinkdb_dump
I receive the following output in my shell:
LinkDb dump: starting
LinkDb db: crawl/linkdb/
After that, the readlinkdb_dump folder exists and contains 2 files.
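For context, the dump is written as plain text (with the default output format), so its part files can be read directly; the part file name below is the usual Hadoop naming and may differ on your setup:

less readlinkdb_dump/part-00000

Each key URL should be listed together with its inlinks and their anchor text.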
The parsed HTML files are saved in the "segments" directory.
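If the goal is to get that parsed data out of the segments, the readseg tool can dump a segment to plain text. A minimal sketch, assuming a Nutch 1.x layout; the segment name is a placeholder for one of the timestamped directories under crawl/segments:

bin/nutch readseg -dump crawl/segments/20100409123456 segment_dump

The dump normally includes the fetched content, parse data and parse text; flags such as -nocontent or -noparsetext can be used to leave out the parts you don't need (run bin/nutch readseg with no arguments to see the exact options for your version).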
On Fri, Apr 9, 2010 at 3:40 AM, cefurkan0 cefurkan0 wrote:
> I can successfully crawl web sites with the
> bin/nutch crawl command,
>
> but I also want to save the parsed html files.
>
> How can I do that?
>
> Thanks
>