On Thursday 10 May 2012 14:35:03 Lewis John Mcgibbney wrote:
> Hi Michael,
> 
> As I'm also not using the most recent stable Solr distribution (3.6.0), I
> can only comment (maybe unwisely) that the most recent version of Solr
> that Nutch supports is probably 3.4.0, as this is the dependency we pull
> in with Ivy. It also looks like Solr and SolrJ are released in parallel,
> so maybe try upgrading your solrj dependency if you wish to use Solr
> 3.6.0...

This should not be a version issue. We happily index from trunk or 1.4 to Solr
versions > 3.0. There must be a schema problem or a badly defined Solr request
handler.
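
A rough way to check both, as a sketch only: the conf directory below assumes
the stock Solr example layout, and looking for a `df` default in
solrconfig.xml is an assumption about where the reference to `text` in the
"undefined field text" error quoted below might come from. The helper names
are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a Solr schema declares a given field, and list the
# default search fields mentioned in the config files.
# CONF_DIR is an assumption (stock Solr example layout); adjust as needed.
CONF_DIR="${CONF_DIR:-example/solr/conf}"

# has_field FILE NAME -> exit 0 if FILE declares <field ... name="NAME" ...>
# (the space after "<field" keeps <fieldType name="..."> from matching)
has_field() {
    grep -q "<field[[:space:]][^>]*name=\"$2\"" "$1"
}

# default_fields DIR -> print the lines that set a default search field
default_fields() {
    grep -n -e '<defaultSearchField>' -e '<str name="df">' \
        "$1/schema.xml" "$1/solrconfig.xml" 2>/dev/null
}

if [ -f "$CONF_DIR/schema.xml" ]; then
    if has_field "$CONF_DIR/schema.xml" text; then
        echo "schema.xml defines a field named 'text'"
    else
        echo "schema.xml has no field named 'text'"
    fi
    default_fields "$CONF_DIR"
fi
```

If the schema has no `text` field but one of the listed defaults points at
`text`, the schema and the request handler disagree, which would fit the
error regardless of the Solr version.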

> 
> If the above is correct, then this is why 3.1.0 works fine when you
> roll back as I would imagine backwards compatibility is always of key
> importance.
> 
> I would be pleased to know that the above is not correct and that
> Nutch is able to index to Solr 3.6.0; however, if not, then maybe we
> should upgrade accordingly in trunk.
> 
> Thanks
> 
> Lewis
> 
> On Thu, May 10, 2012 at 1:56 PM, Michael Erickson <[email protected]> wrote:
> > On May 10, 2012, at 1:42 AM, Markus Jelsma wrote:
> >> Hi,
> >> 
> >> On Thu, 10 May 2012 09:10:04 +0300, Tolga <[email protected]> wrote:
> >>> Hi,
> >>> 
> >>> This will sound like a duplicate, but actually it differs from the
> >>> other one. Please bear with me. Following
> >>> http://wiki.apache.org/nutch/NutchTutorial, I first issued the command
> >>> 
> >>> bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
> >>> 
> >>> Then when I got the message
> >>> 
> >>> Exception in thread "main" java.io.IOException: Job failed!
> >>>    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> >>>    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
> >>>    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
> >>>    at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
> >>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
> >> 
> >> Please include the relevant part of the log. This may be a known issue.
> >> 
> >>> I issued the commands
> >>> 
> >>> bin/nutch crawl urls -dir crawl -depth 3 -topN 5
> >>> 
> >>> and
> >>> 
> >>> bin/nutch solrindex http://127.0.0.1:8983/solr/ crawldb -linkdb crawldb/linkdb crawldb/segments/*
> >>> 
> >>> separately, after which I got no errors. When I browsed to
> >>> http://localhost:8983/solr/admin and attempted a search, I got the
> >>> error
> >>> 
> >>> 
> >>>   HTTP ERROR 400
> >>> 
> >>> Problem accessing /solr/select. Reason:
> >>> 
> >>>    undefined field text
> >> 
> >> But this is a Solr thing: you have no field named text. Resolve this in
> >> Solr or on the Solr mailing list.
> > 
> > I will say that I had similar issues last week when I tried the Nutch
> > tutorial.  I went to the #Solr IRC channel and got no response.  The
> > quick answer was that I had to go back to Solr version 3.1.0 for the
> > instructions in the Nutch tutorial to work.
> > 
> > The longer answer is that following the existing Nutch tutorial gave me
> > two errors.
> > 
> > 1) SolrDeleteDuplicates exception as mentioned by Tolga above.
> > 
> > To fix this I:
> > 
> > 1.a) Stop Solr.
> > 1.b) Delete Solr index.
> > > 1.c) Copy the Nutch-provided schema.xml into the proper Solr directory
> > > (example/solr/conf/).
> > > 1.d) Replace Nutch's solr-solrj-xxx.jar with the appropriate version
> > > from Solr: ( solr/dist/apache-solr-solrj-xxx.jar  -->
> > > nutch/runtime/local/lib/solr-solrj-xxx.jar )
> > > 1.e) Restart Solr.
> > 
> > > The first two steps may only be necessary if you already had Solr
> > > running with its default schema, as I did because I had done the Solr
> > > tutorial first.
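
For reference, steps 1.b to 1.d above could be scripted roughly as follows.
This is a sketch under assumptions, not a tested procedure: the function name
is made up, the index location (example/solr/data/index) and the location of
the Nutch-provided schema (runtime/local/conf/schema.xml) are assumed, and the
jar versions are matched by glob because the thread elides them. Stopping and
restarting Solr (1.a, 1.e) is left to however Solr was launched.

```shell
#!/bin/sh
# Sketch of steps 1.b-1.d. Run it only while Solr is stopped.
# apply_nutch_schema SOLR_DIR NUTCH_DIR  (both arguments are assumptions)
apply_nutch_schema() {
    solr_dir=$1
    nutch_dir=$2

    # 1.b) Delete the existing Solr index (assumed location).
    rm -rf "$solr_dir/example/solr/data/index"

    # 1.c) Copy the Nutch-provided schema.xml into Solr's conf directory.
    cp "$nutch_dir/runtime/local/conf/schema.xml" \
       "$solr_dir/example/solr/conf/schema.xml" || return 1

    # 1.d) Replace Nutch's bundled SolrJ jar with the one shipped in Solr's
    # dist directory. The jar keeps its apache-solr-solrj-* name here.
    rm -f "$nutch_dir"/runtime/local/lib/solr-solrj-*.jar
    cp "$solr_dir"/dist/apache-solr-solrj-*.jar \
       "$nutch_dir/runtime/local/lib/" || return 1
}
```

It would be called as `apply_nutch_schema /path/to/solr /path/to/nutch`, with
both paths pointing at the unpacked distributions.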
> > 
> > 2) The HTTP 400 Error "undefined field text" issue.
> > 
> > > This appears to be the same as:
> > > https://issues.apache.org/jira/browse/SOLR-3416.  Log output from Solr
> > > is here: http://pastebin.com/YWdPnXpv and the Nutch-provided
> > > schema is here: http://pastebin.com/LQDDKC5B
> > 
> > The only way I got this working was to move Solr from version 3.6.0 back
> > to version 3.1.0.
> > 
> > I'm *totally* new to Solr/Nutch, but I might suggest a versioning
> > mismatch?
> > 
> > 
> > Regards,
> > --mike
> > 
> > Michael Erickson
> > [email protected]
-- 
Markus Jelsma - CTO - Openindex
