Re: Which Java objects to index a web page ?

Fabrice Estiévenart Wed, 12 Aug 2009 09:00:28 -0700

I like using Nutch for the crawlDB, scalability, threading, document parsing, ... but crawling is not important to me as I index targeted data sources.


Obviously, I'm using it with Solr for indexing and searching documents.


Fabrice

Alexander Aristov wrote:

Nutch is primarily a crawler. I would suggest you take a look at Solr,
which is just an indexer and searcher. You may use its API as well as its
open interfaces.
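
As an illustration of the "open interfaces" route, here is a minimal sketch of building a Solr XML update message (`<add><doc>...</doc></add>`) for a single page in plain Java, with no crawl step. The class name `SolrAddXml`, its helper methods, and the field names `id`, `title` and `content` are assumptions made for this example; the real field names must match the Solr schema in use, and the page text is assumed to have been extracted already by whatever parser you prefer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch or Solr: builds the XML body
// that Solr's XML update handler accepts for adding one document.
final class SolrAddXml {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Turn a field-name -> value map into a single-document <add> message.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // Field names below are assumptions; adapt them to your schema.
        Map<String, String> page = new LinkedHashMap<>();
        page.put("id", "http://www.example.org/");
        page.put("title", "Example page");
        page.put("content", "Text extracted from the page by a parser of your choice.");
        System.out.println(toAddXml(page));
    }
}
```

The resulting string would be sent in an HTTP POST to the update handler of your Solr installation, followed by a `<commit/>` message so that the document becomes searchable. This sketch does not filter control characters that are illegal in XML, so real page text may need extra cleaning first.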

Best Regards
Alexander Aristov


2009/8/12 Fabrice Estiévenart <[email protected]>

Hello,

How can I use Nutch Java objects to index one (or a very limited set of)
web page(s) without crawling them?

Do I need to use the crawling tools (such as Injector, Generator, ...), or
can I do it by means of lower-level objects (Content, ParseResult, ...)?

Thanks for your help,

Fabrice



--
Fabrice Estiévenart, R&D Engineer, CETIC
Tel: +32 (0)71/49.07.28
Web: http://www.cetic.be
