See the whole-web crawling section of the tutorial, which gives the steps:
inject URLs,
fetch and parse,
index ....
The same steps apply to crawling a local file system. In the inject step,
give the URLs as file:/<path>, pointing to the directory that holds the files
you want to index. Also be sure to edit the RegexFilter file so the crawler
does not follow the parent directory link (".."). Otherwise it will also fetch
files from the directories above the one you specified.
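
To make the filtering idea concrete, here is a small Python sketch of what the filter rules need to achieve. It is not Nutch code. It assumes the filter file holds regex rules prefixed with "+" (accept) or "-" (reject) and that the first matching rule wins. The directory /home/user/docs and the function name accept are made up for illustration.

```python
import re

# Hypothetical rules, in "+regex / -regex" style; the first matching rule wins.
# The path /home/user/docs is an assumed example, not taken from the thread.
RULES = [
    ("-", r"/\.\./|/\.\.$"),           # reject URLs that step up to a parent directory
    ("+", r"^file:/home/user/docs/"),  # accept files under the directory to index
    ("-", r"."),                       # reject everything else
]


def accept(url, rules=RULES):
    """Return True if the first rule matching the URL is an accept rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched: treat as rejected


if __name__ == "__main__":
    for url in (
        "file:/home/user/docs/report.txt",
        "file:/home/user/docs/../secret.txt",
        "file:/home/user/other.txt",
    ):
        print(url, "->", "fetch" if accept(url) else "skip")
```

The reject rule for ".." comes first on purpose: a URL such as file:/home/user/docs/../secret.txt begins with the accepted prefix, so it must be caught before the accept rule is tried.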
baoshenghua <[EMAIL PROTECTED]> said:
> Hi all,
> I am quite new to the Nutch system; any advice is appreciated.
> I'd like to use Nutch to search a local file system, but it seems
> that the only way to create the index is by crawling the web.
> I have read the online docs and have not found a proper solution yet.
> Can anyone help me? Thanks again!
>
> Yours,
> Shenghua Bao
>
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: InterSystems CACHE
> FREE OODBMS DOWNLOAD - A multidimensional database that combines
> robust object and relational technologies, making it a perfect match
> for Java, C++,COM, XML, ODBC and JDBC. www.intersystems.com/match8
> _______________________________________________
> Nutch-general mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/nutch-general
>