Re: [Nutch-general] Nutch content with Lucene search

Brian Whitman Sat, 27 Jan 2007 10:39:04 -0800

On Jan 27, 2007, at 1:34 PM, Gilbert Groenendijk wrote:
> Hello,
>
> Today I created a simple index with Nutch from the command line. After
> that I copied the index to the machine to use it in a plain Lucene
> environment, without Nutch. Fetching the URL and title works pretty
> well, but how can I get the content? If I take a look in Luke, the
> content field is not stored or tokenized, but when I look in
> nutch-default.xml and nutch-site.xml, I have defined:
>
> <property>
>  <name>fetcher.store.content</name>
>  <value>true</value>
>  <description>If true, fetcher will store content.</description>
> </property>
>
> It doesn't seem to work. Any ideas?



I'm pretty sure that setting only stores the content in the WebDB, not
in the Lucene index. The stored content in the WebDB is used for the
cache and the search summary. The WebDB cannot be read directly by
Lucene. You can write Java apps that work with the WebDB API, fetching
the content per page as needed. Or you could use the OpenSearch
servlet to pull out the summaries and cache per URI.
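
As a rough illustration of the OpenSearch route, here is a minimal client sketch in Python (any HTTP client would do). It assumes the Nutch web app is deployed at http://localhost:8080 with the servlet mapped to /opensearch, that it accepts `query` and `hitsPerPage` parameters, and that it answers with an RSS 2.0 feed whose `<item>` elements carry `title`, `link` and `description`. Check all of these against your own installation. The function names and the sample feed are made up for the example.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: base URL of the OpenSearch servlet in a local Nutch web app.
BASE_URL = "http://localhost:8080/opensearch"


def build_query_url(query, hits_per_page=10, base_url=BASE_URL):
    """Build the request URL; parameter names are assumed, not verified."""
    params = {"query": query, "hitsPerPage": hits_per_page}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_results(rss_text):
    """Extract title, link and summary from each <item> of an RSS feed."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return results


def search(query, hits_per_page=10):
    """Query the servlet over HTTP and return the parsed hits."""
    url = build_query_url(query, hits_per_page)
    with urllib.request.urlopen(url) as response:
        return parse_results(response.read())


# Hypothetical response, used only to show what parse_results() returns.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Nutch: example</title>
    <item>
      <title>Example page</title>
      <description>an example summary</description>
      <link>http://example.com/</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for hit in parse_results(SAMPLE):
        print(hit["link"], "-", hit["summary"])
```

This only gives you the summary snippet per hit, not the full page text; for the full content you would still need to go through the Nutch APIs on the Java side.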




