Re: Lucene and Nutch

Chris Hostetter Tue, 10 Jan 2006 16:48:08 -0800

: to index remote HTML files.  Can I use Nutch to crawl for the remote HTML
: files and use the index for the Lucene code I have already written?  Or do
: I have to redo the whole thing using the Nutch API?  I am using boosting
: during the indexing.  I hope Nutch can boost fields, too.  Any help would
: be appreciated.


The best place to start with a question like this is the Nutch
documentation and user community -- between those two information sources,
you should be able to determine what constraints Nutch puts on the
fields of the index it creates, and what flexibility you have to affect
field/document boosts at index time.

With that information in hand, you can make an informed choice about using
Nutch in conjunction with your direct Lucene access code, rewriting your
code to use whatever API Nutch has, or using a third-party crawler to
fetch documents for your Lucene-based code and ignoring Nutch.




-Hoss


