You are welcome. Glad it worked. Have fun.

On Wed, Aug 3, 2011 at 4:16 PM, Kiks <[email protected]> wrote:
> That worked thanks to you and lewis.
>
> One thing that came up was I first tried to delete the old
> /apache-solr-3.3.0/example/solr/data/index
> by renaming it and creating a new directory but solr wouldn't start.
>
> After restoring the folder, changing solr schema.xml to
> <field name="content" type="text" stored="true" indexed="true"/>
>
> and then re-running /bin/nutch solrindex... it was OK.
>
> On Wed, Aug 3, 2011 at 2:42 PM, Way Cool <[email protected]> wrote:
>
> > Potentially you need to make two changes:
> > 1. As Lewis suggested, make sure to change the content field in
> > solr/conf/schema.xml as below:
> > <field name="content" type="text" stored="true" indexed="true"/>
> > 2. Append the following as a part of search url:
> > &hl=on&hl.fl=content site url title
> > OR
> > Add the following to solrconfig.xml as a part of browse search component if
> > you are using solr/browse:
> > <str name="hl">on</str>
> > <str name="hl.fl">url site title content</str>
> >
> > You should be able to see something like this when you search in Solr:
> > <lst name="highlighting">
> > <lst name="http://thetechietutorials.blogspot.com/"><arr
> > name="content"><str>, June 15, 2011 A Custom <em>Solr</em> Search Component
> > example - RedirectSearchComponent Currently Apache
> > <em>Solr</em></str></arr></lst>
> > <lst name="http://thetechietutorials.blogspot.com/2011/06/working-example-of-java-annotations.html"><arr
> > name="content"><str>) ▼ June (5) A working example of Java Annotations A
> > Custom <em>Solr</em> Search Component example - Redirect</str></arr></lst>
> > ...
> > </lst>
> >
> > You can also look at my blog about a customized solr browser interface for
> > Nutch data if you are interested. Here is the url:
> >
> > http://thetechietutorials.blogspot.com/2011/07/customized-solr-browser-interface-for.html
> >
> > Thanks.
> >
> > On Wed, Aug 3, 2011 at 12:31 AM, Kiks <[email protected]> wrote:
> >
> > > This question was posted on solr list and not answered because nutch
> > > related...
> > >
> > > The indexed contents of 100 sites were imported to solr from nutch using:
> > >
> > > bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
> > >
> > > now, a solr admin search for 'photography' includes these results:
> > >
> > > <doc>
> > > <float name="score">0.12570743</float>
> > > <float name="boost">1.0440307</float>
> > > <str name="digest">94d97f2806240d18d67cafe9c34f94e1</str>
> > > <str name="id">http://www.galleryhopper.org/</str>
> > > <str name="segment">...</str>
> > > <str name="title">Gallery Hopper: Todd Walker's photography ephemera.
> > > Read, enjoy, share, discard.</str>
> > > <date name="tstamp">...</date>
> > > <str name="url">http://www.galleryhopper.org/</str>
> > > </doc>
> > >
> > > but highlighting options are on the title field not page text.
> > >
> > > My question: Where is the stored parsetext content of the pages? What is the
> > > solr command to send it from nutch with url/id key? The information is
> > > contained in the crawl segments with solr id field matching nutch url.
> > >
> > > Thanks.
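
For anyone scripting the search instead of typing the URL by hand, the "append &hl=on&hl.fl=..." step from the thread could be sketched roughly as below. This is only an illustration under a few assumptions: the helper name build_highlight_url is made up for this example, the query is assumed to go to Solr's standard /select handler, and the base URL is the one used in the solrindex command above. Highlighting on content will only return snippets if the content field is stored="true" in schema.xml, as discussed in the thread.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, fields=("content", "site", "url", "title")):
    """Build a Solr search URL with highlighting enabled on the given fields.

    Hypothetical helper for illustration; assumes the standard /select handler.
    """
    params = {
        "q": query,
        "hl": "on",                   # turn highlighting on
        "hl.fl": " ".join(fields),    # space-separated list of fields to highlight
    }
    # urlencode escapes the spaces in hl.fl so the URL is safe to request
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_url("http://127.0.0.1:8983/solr/", "photography"))
```

The field list defaults to the one suggested in the thread; pass a different tuple if your schema uses other field names.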

