Gilbert Groenendijk wrote:
Thank you (and Brian) for your answers. I noticed this too, but I want to get the content through the Java API with Lucene 2.0. If that is impossible, I will have
to write some extensions for my current code, but I would rather not. I guess the
problem is the unstored property. Is there any config property available for that?

On 1/27/07, Gal Nitzan <[EMAIL PROTECTED]> wrote:


1. Open your index in Luke

2. click on the documents tab

3. click on the next arrow to move to the first document

4. then click on the reconstruct button.

You should see the content field data in the right pane

HTH

-----Original Message-----
From: Gilbert Groenendijk [mailto:[EMAIL PROTECTED]
Sent: Saturday, January 27, 2007 8:34 PM
To: [email protected]
Subject: Nutch content with Lucene search

Hello,

Today I created a simple index with Nutch from the command line. After that I
copied the index to the machine to use it in a Lucene environment, without
Nutch. Fetching the URL and title works pretty well, but how can I get the
content? If I take a look in Luke, the field content is not stored or
tokenized, but when I look in nutch-default.xml and nutch-site.xml, I have
defined:

<property>
<name>fetcher.store.content</name>
<value>true</value>
<description>If true, fetcher will store content.</description>
</property>

It doesn't seem to work. Any ideas?


--
Gilbert Groenendijk
__________________________________________________




Just change line 72 of BasicIndexingFilter in the index-basic plugin from

doc.add(new Field("content", parse.getText(), Field.Store.NO, Field.Index.TOKENIZED));

to

doc.add(new Field("content", parse.getText(), Field.Store.YES, Field.Index.TOKENIZED));


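and you are done, once you rebuild the plugin and re-run the indexing step (documents already in the index keep their unstored content field). But remember that you do not need to store the content to search it.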
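
To illustrate that last point, here is a minimal toy sketch in plain Java. It is not Lucene or Nutch code: TinyIndex and its add/search/get methods are hypothetical names made up for this example. It only models the idea that the searchable terms and the stored original text are kept separately, so an unstored field can still be searched but cannot be read back.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only (an assumption for illustration), NOT the Lucene API. */
public class StoredVsIndexedDemo {

    /** A tiny in-memory index holding one text field per document. */
    static class TinyIndex {
        // term -> ids of the documents containing it (the "indexed" side)
        private final Map<String, Set<Integer>> postings = new HashMap<>();
        // document id -> original text, or null if it was not stored
        private final List<String> stored = new ArrayList<>();

        /** Indexes the text, optionally keeps the original, returns the new id. */
        int add(String text, boolean store) {
            int id = stored.size();
            stored.add(store ? text : null);
            for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
                }
            }
            return id;
        }

        /** Returns the ids of all documents containing the given single term. */
        Set<Integer> search(String term) {
            return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        }

        /** Returns the original text, or null if the field was not stored. */
        String get(int id) {
            return stored.get(id);
        }
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        int unstored = index.add("Nutch crawls the web", false);
        int kept = index.add("Lucene searches the index", true);

        System.out.println(index.search("nutch")); // prints [0]
        System.out.println(index.get(unstored));   // prints null
        System.out.println(index.get(kept));       // prints Lucene searches the index
    }
}
```

In this model the first document behaves like the content field written with Field.Store.NO: a search for "nutch" finds it, but asking for its text returns null. The second behaves like Field.Store.YES, at the cost of keeping a full copy of the text in the index.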
