One way I have seen this work is to edit the Solr schema file,
{SOLR_HOME}/conf/schema.xml. Modify the field named "content" so that
its "stored" attribute is set to "true". Something like this:

<field name="content" type="text" stored="true" .....

You will need to re-index pages for this to take effect, either by
emptying Solr and deleting the Nutch crawl directory, or by re-crawling
each page once it has timed out and is due to be fetched again. Pages
crawled after the change will have their content stored automatically.

Hope this helps

Chris

On 20 July 2011 04:41, Kelvin <[email protected]> wrote:

> Dear all,
>
> I have used both Nutch 1.2 and 1.3. Both work fine for crawling and
> indexing. When I search using some keywords, it returns the results,
> showing snippets of the HTML pages that contain the keywords. Is there
> a way to retrieve or access the full original HTML pages that contain
> the keywords?
>
> Thank you for your help.
>
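
Once "content" is stored and the pages have been re-indexed, the field can
be requested back from Solr with the "fl" parameter. Below is a minimal
Python sketch that only builds such a query URL. The host, port and
/solr/select path are assumptions about a default local install, the
helper name build_content_query is made up for illustration, and the
"url" field is assumed to exist in your schema as it does in the schema
shipped with Nutch.

```python
from urllib.parse import urlencode

# Assumption: Solr's select handler on a default local install.
# Adjust host, port and path to match your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_content_query(keywords, rows=10, base_url=SOLR_SELECT_URL):
    """Build a Solr query URL that asks for the stored "content" field.

    Hypothetical helper for illustration; it is not part of Nutch or Solr.
    """
    params = {
        "q": keywords,        # the search keywords
        "fl": "url,content",  # fields to return; "url" is assumed to exist
        "rows": rows,         # number of documents to return
        "wt": "json",         # ask Solr for a JSON response
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Open the printed URL in a browser or fetch it with any HTTP client.
    print(build_content_query("nutch crawl"))
```

If "content" comes back empty for a document, that page was probably
indexed before the schema change and still needs to be re-indexed.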

