Re: Nutch content with Lucene search

Enis Soztutar Mon, 29 Jan 2007 01:54:27 -0800

Gilbert Groenendijk wrote:

Thank you (and Brian) for your answers. I noticed this too, but I want to get
the content through the Java API with Lucene 2.0. If that is impossible, I will have
to write some extensions for my current code, but I would rather not. I guess the
problem is the unstored property. Is there any config property available for that?


On 1/27/07, Gal Nitzan <[EMAIL PROTECTED]> wrote:



1. Open your index in Luke

2. click on the documents tab

3. click on the next arrow to move to the first document

4. then click on the reconstruct button.

You should see the content field data in the right pane.

HTH

-----Original Message-----
From: Gilbert Groenendijk [mailto:[EMAIL PROTECTED]
Sent: Saturday, January 27, 2007 8:34 PM
To: [email protected]
Subject: Nutch content with Lucene search

Hello,

Today I created a simple index with Nutch from the command line. After that I
copied the index to the machine to use it in a Lucene environment, without
Nutch. Fetching the URL and title works pretty well, but how can I get the
content? If I take a look in Luke, the field content is not stored or
tokenized, but when I look in nutch-default.xml and nutch-site.xml, I have
defined:

<property>
<name>fetcher.store.content</name>
<value>true</value>
<description>If true, fetcher will store content.</description>
</property>

It doesn't seem to work. Any ideas?


--
Gilbert Groenendijk
__________________________________________________

Just change the 72nd line in BasicIndexingFilter in index-basic plugin from

doc.add(new Field("content", parse.getText(), Field.Store.NO, Field.Index.TOKENIZED));

to

doc.add(new Field("content", parse.getText(), Field.Store.YES, Field.Index.TOKENIZED));

and you are done. But remember that you do not need to store the content to search it.
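
To illustrate that last point, here is a toy model in plain Java of the difference between an indexed field and a stored one. It is not Lucene code and does not reflect how Lucene is implemented; the class name StoredVsIndexedSketch and its methods are made up for this sketch. The idea is only that the terms go into an inverted index for searching, while the original text is kept for retrieval only when storing is requested.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of "indexed" vs "stored" fields. Hypothetical names, not Lucene's API.
public class StoredVsIndexedSketch {
    // Inverted index: term -> ids of the documents that contain it (the "indexed" part).
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Original text per document, kept only when requested (the "stored" part).
    private final Map<Integer, String> stored = new HashMap<>();
    private int nextId = 0;

    // Tokenizes the text into the inverted index; keeps the text itself only if store is true.
    public int add(String text, boolean store) {
        int id = nextId++;
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        if (store) {
            stored.put(id, text);
        }
        return id;
    }

    // Searching needs only the inverted index.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    // Retrieval needs the stored text; returns null for documents added with store == false.
    public String get(int id) {
        return stored.get(id);
    }

    public static void main(String[] args) {
        StoredVsIndexedSketch index = new StoredVsIndexedSketch();
        int unstored = index.add("Nutch crawls the web", false);
        int kept = index.add("Lucene searches the index", true);

        System.out.println(index.search("nutch")); // the unstored document is still found
        System.out.println(index.get(unstored));   // but its text cannot be read back
        System.out.println(index.get(kept));       // the stored document's text can
    }
}
```

In this model the first document matches a search for "nutch" even though its text cannot be retrieved, which mirrors what you see with the content field: it is searchable, but reading it back from a hit needs the field to be stored.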
