Hi Bayu,

 
You must enable the protocol-file plugin first. Then make sure the file:// prefix is 
not filtered out by prefix-urlfilter.txt or any other URL filter. Now just inject the 
new URLs and start the crawl.
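
A rough sketch of the first step, assuming you still have the stock Nutch 1.8 
plugin list (yours may differ, so only the protocol-(http|file) part matters), 
would be an override like this in conf/nutch-site.xml:

```xml
<!-- conf/nutch-site.xml: sketch only. Everything after protocol-(http|file)
     is assumed to match the Nutch 1.8 default; keep whatever your setup uses. -->
<property>
  <name>plugin.includes</name>
  <value>protocol-(http|file)|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

If you use the default regex-urlfilter.txt, I believe it ships with a rule 
like -^(file|ftp|mailto): that skips file: URLs, so "file" would have to be 
taken out of that rule as well. A seed entry would then be a line such as 
file:///path/to/share/ (a made-up path) in your seed list.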

 
Cheers


 
-----Original message-----
From:Bayu Widyasanyata <[email protected]>
Sent:Wed 04-06-2014 14:30
Subject:Crawling web and intranet files into single crawldb
To:[email protected]; 
Hi,

I am successfully running Nutch 1.8 and Solr 4.8.1 to fetch and index web
sources (http protocol).
Now I want to add file share data sources (file protocol) to the current
crawldb.

What is the strategy or common practice for handling this situation?

Thank you.

-- 
wassalam,
[bayu]
