I like using Nutch for its crawlDB, scalability, threading, document parsing, and so on, but crawling itself is not important to me because I index targeted data sources.

Obviously, I'm using it with Solr for indexing and searching documents.

Fabrice

Alexander Aristov wrote:
Nutch is primarily a crawler. I would suggest you take a look at Solr,
which is just an indexer and searcher. You can use its API as well as its
open interfaces.

Best Regards
Alexander Aristov
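
To make the "open interfaces" suggestion concrete, here is a minimal sketch of building the XML add message that Solr's update handler accepts, using only the JDK. It is an illustration, not the method used by anyone in this thread: the class name SolrAddMessage, the field names (id, title, content) and the example URL are assumptions, and the field names must match whatever the target Solr schema defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrAddMessage {

    // Escape the five XML special characters so page text cannot break the message.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build a Solr <add> message containing a single document.
    // Field order follows the iteration order of the map.
    static String buildAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical page; field names are assumptions about the schema.
        Map<String, String> page = new LinkedHashMap<>();
        page.put("id", "http://example.org/page.html");
        page.put("title", "Example page");
        page.put("content", "Text extracted from the page <body>");
        System.out.println(buildAddXml(page));
    }
}
```

The resulting string would be sent as the body of an HTTP POST to the Solr update handler (its URL depends on the installation), followed by a `<commit/>` message so that the document becomes searchable. How the page is fetched and parsed beforehand is left open here.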


2009/8/12 Fabrice Estiévenart <[email protected]>

Hello,

How can I use Nutch Java objects to index one web page (or a very limited
set of web pages) without crawling them?

Do I need to use the crawling tools (such as Injector, Generator, ...), or
can I do it with lower-level objects (Content, ParseResult, ...)?

Thanks for your help,

Fabrice




--
Fabrice Estiévenart, R&D Engineer, CETIC
Tel: +32 (0)71/49.07.28
Web : http://www.cetic.be
