Hi Sami,

> The schema.xml file there is usable only when using Solr as the search
> server. Are you using Solr?
>

Not yet! Thanks for clarifying it.

Cheers,

Pedro.

> --
> Sami Siren
>
> Pedro Bezunartea López wrote:
>
>> Hi,
>>
>> I've developed a web application in Lucene that searches web pages using a
>> Nutch-generated index. I'd like to highlight the query terms when showing
>> the results, and I understand that the content of the pages needs to be
>> stored, as well as indexed.
>>
>> This is what I've tried so far:
>> 1.- In the file conf/nutch-site.xml, I changed the value of
>> "file.content.ignored" to false.
>> 2.- In the file conf/schema.xml I modified the line:
>> <field name="content" type="text" stored="false" indexed="true"/>
>> to
>> <field name="content" type="text" stored="true" indexed="true"/>
>> 3.- In the source file
>> src/plugin/index-basic/src/java/org/apache/nutch/indexer/basic/BasicIndexingFilter.java,
>> I changed line 116 to:
>> LuceneWriter.addFieldOptions("content", LuceneWriter.STORE.YES,
>> LuceneWriter.INDEX.TOKENIZED, conf)
>>
>> I tried running the command "bin/nutch crawl urls -dir crawl -depth 10
>> -topN 5000" after the first two steps, but the crawl didn't store the
>> contents. I then tried the third step, recompiled Nutch, and ran the crawl
>> command again, to no avail.
>>
>> What am I missing? Any hints, please?
>>
>> TIA,
>>
>> Pedro.
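
For context on why the content field has to be stored and not only indexed: highlighting works on the original page text, and Lucene's Document.get("content") returns null for a field that was indexed without being stored. The following is a minimal, dependency-free sketch of the highlighting step itself, assuming the stored text has already been read from the index. The class HighlightSketch and its highlight method are hypothetical names made up for illustration; they are not part of Nutch or Lucene, and a real application would more likely use Lucene's own highlighter package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Nutch or Lucene. */
final class HighlightSketch {

    /**
     * Wraps every whole-word, case-insensitive occurrence of a query term
     * in <b> tags. storedText stands for the value of the stored "content"
     * field; it is null when the field was indexed but not stored.
     */
    static String highlight(String storedText, String query) {
        if (storedText == null) {
            // Nothing was stored, so there is no text to highlight.
            return "";
        }
        // Split the query on whitespace and quote each term so that regex
        // metacharacters in the query are matched literally.
        List<String> terms = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(Pattern.quote(term));
            }
        }
        if (terms.isEmpty()) {
            return storedText;
        }
        // One alternation pattern, applied in a single pass, so that the
        // inserted <b> tags are never themselves re-matched.
        Pattern pattern = Pattern.compile(
                "\\b(?:" + String.join("|", terms) + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return pattern.matcher(storedText).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        System.out.println(
                highlight("Nutch crawls web pages for the Web index", "web index"));
    }
}
```

The sketch does not escape HTML in the stored text and does not apply the analyzer used at index time, so stemmed or otherwise normalized matches would be missed. It is only meant to show that the highlighting step needs the stored text as input.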