[ 
https://issues.apache.org/jira/browse/NUTCH-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-3071:
-----------------------------------
    Component/s: wiki

> Tutorial for Intranet Document Search outdated
> ----------------------------------------------
>
>                 Key: NUTCH-3071
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3071
>             Project: Nutch
>          Issue Type: Improvement
>          Components: documentation, wiki
>            Reporter: Hiran Chaudhuri
>            Priority: Minor
>
> On the page 
> [https://cwiki.apache.org/confluence/display/NUTCH/IntranetDocumentSearch] 
> the schema.xml file for Solr is claimed to be in the nutch conf directory. At 
> least in the current master branch that is no longer the case.
> Searching for a schema.xml, I found something suitable at 
> src/plugin/indexer-solr/schema.xml, and this file is also mentioned in 
> [https://cwiki.apache.org/confluence/display/NUTCH/NutchTutorial#NutchTutorial-SetupSolrforsearch]
> Maybe the IntranetDocumentSearch page should simply point to the 
> SetupSolrforsearch chapter.
>  
> But even following the SetupSolrforsearch chapter does not fully help. When 
> running the command
> bin/nutch index crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20131108063838/ -filter -normalize -deleteGone
> I get the message
> INFO o.a.n.i.IndexerOutputFormat [pool-5-thread-1] No IndexWriters activated - check your configuration
>  
> So a step that modifies the Nutch configuration files appears to be missing 
> from the tutorial.
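
One possible cause of the "No IndexWriters activated" message quoted above, not confirmed in this issue, is that no indexer plugin matches the plugin.includes property in conf/nutch-site.xml (recent Nutch releases additionally configure writers in conf/index-writers.xml). The Python sketch below is only an illustration of that check: the function names and the sample XML are hypothetical, and the full-match regex semantics are an assumption about how Nutch evaluates plugin.includes.

```python
import re
import xml.etree.ElementTree as ET


def plugin_includes(config_xml):
    """Return the value of the plugin.includes property, or None if absent."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "plugin.includes":
            return prop.findtext("value", default="").strip()
    return None


def enables_indexer(value, plugin_id="indexer-solr"):
    """True if plugin_id fully matches the plugin.includes regular expression.

    Assumption: the whole plugin id must match the pattern.
    """
    return value is not None and re.fullmatch(value, plugin_id) is not None


# Hypothetical nutch-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic</value>
  </property>
</configuration>"""

value = plugin_includes(SAMPLE)
print("indexer-solr enabled:", enables_indexer(value))
```

To check a real installation, the contents of conf/nutch-site.xml could be read from disk and passed to plugin_includes in place of SAMPLE. A result of None would only mean that the site file does not override the property, not that no indexer is configured.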



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
