You are welcome. Glad it worked. Have fun.

On Wed, Aug 3, 2011 at 4:16 PM, Kiks <[email protected]> wrote:

> That worked, thanks to you and Lewis.
>
> One thing that came up: I first tried to delete the old index at
> /apache-solr-3.3.0/example/solr/data/index
> by renaming it and creating a new directory, but Solr wouldn't start.
>
> After restoring the folder, changing solr schema.xml to
> <field name="content" type="text" stored="true" indexed="true"/>
>
> and then re-running /bin/nutch solrindex... it was OK.
>
>
>
> On Wed, Aug 3, 2011 at 2:42 PM, Way Cool <[email protected]> wrote:
>
> > Potentially you need to make two changes:
> > 1. As Lewis suggested, make sure to change the content field in
> > solr/conf/schema.xml as below:
> > <field name="content" type="text" stored="true" indexed="true"/>
> > 2. Append the following to the search URL:
> > &hl=on&hl.fl=content site url title
> > OR
> > Add the following to solrconfig.xml as a part of browse search component if
> > you are using solr/browse:
> >  <str name="hl">on</str>
> >  <str name="hl.fl">url site title content</str>
> >
> > You should be able to see something like this when you search in Solr:
> > <lst name="highlighting">
> >   <lst name="http://thetechietutorials.blogspot.com/">
> >     <arr name="content">
> >       <str>, June 15, 2011 A Custom <em>Solr</em> Search Component example - RedirectSearchComponent Currently Apache <em>Solr</em></str>
> >     </arr>
> >   </lst>
> >   <lst name="http://thetechietutorials.blogspot.com/2011/06/working-example-of-java-annotations.html">
> >     <arr name="content">
> >       <str>) ▼  June (5) A working example of Java Annotations A Custom <em>Solr</em> Search Component example - Redirect</str>
> >     </arr>
> >   </lst>
> >   ...
> > </lst>
> >
> > You can also look at my blog about a customized Solr browser interface for
> > Nutch data if you are interested. Here is the URL:
> >
> > http://thetechietutorials.blogspot.com/2011/07/customized-solr-browser-interface-for.html
> >
> > Thanks.
> >
> > On Wed, Aug 3, 2011 at 12:31 AM, Kiks <[email protected]> wrote:
> >
> > > This question was posted on the Solr list but not answered because it is
> > > Nutch-related...
> > >
> > >
> > > The indexed contents of 100 sites were imported to Solr from Nutch using:
> > >
> > > bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
> > >
> > > Now, a Solr admin search for 'photography' includes these results:
> > >
> > >  <doc>
> > >    <float name="score">0.12570743</float>
> > >    <float name="boost">1.0440307</float>
> > >    <str name="digest">94d97f2806240d18d67cafe9c34f94e1</str>
> > >    <str name="id">http://www.galleryhopper.org/</str>
> > >    <str name="segment">...</str>
> > >    <str name="title">Gallery Hopper: Todd Walker's photography ephemera. Read, enjoy, share, discard.</str>
> > >    <date name="tstamp">...</date>
> > >    <str name="url">http://www.galleryhopper.org/</str>
> > >  </doc>
> > >
> > > but the highlighting options apply to the title field, not the page text.
> > >
> > > My question: Where is the stored parsetext content of the pages? What is the
> > > Solr command to send it from Nutch with the url/id key? The information is
> > > contained in the crawl segments, with the Solr id field matching the Nutch url.
> > >
> > > Thanks.
> > >
> >
>
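
For reference, the search URL change suggested in the thread (appending hl=on and hl.fl to the query) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name build_highlight_url is hypothetical, and the "select" path is assumed to be the stock Solr search handler under the http://127.0.0.1:8983/solr/ base URL that appears in the solrindex command.

```python
from urllib.parse import urlencode

# Base URL taken from the solrindex command in the thread; the "select"
# handler path is an assumption about a stock Solr setup.
SOLR_BASE = "http://127.0.0.1:8983/solr/"


def build_highlight_url(query, fields=("content", "site", "url", "title")):
    """Return a Solr search URL with highlighting enabled on `fields`.

    Hypothetical helper for illustration; it only builds the URL and
    does not contact Solr.
    """
    params = {
        "q": query,
        "hl": "on",                 # turn highlighting on
        "hl.fl": " ".join(fields),  # space-separated fields to highlight
    }
    # urlencode percent-encodes values and turns spaces into '+'
    return SOLR_BASE + "select?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_url("photography"))
```

As noted in the thread, the content field also has to be declared with stored="true" in schema.xml and the data re-indexed before snippets for it show up in the highlighting section.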
