It's something else:

NUTCH-1016 Strip UTF-8 non-character codepoints and add logging for SolrWriter
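To illustrate the kind of cleanup that issue title describes, here is a minimal sketch of stripping Unicode noncharacter code points from a field value before it is sent to Solr. This is not the NUTCH-1016 patch itself. The class and method names (NonCharStripper, isNonCharacter, strip) are hypothetical, and the sketch assumes only the 66 Unicode noncharacters need removing.

```java
/** Hypothetical helper, not part of Nutch or Solr. */
final class NonCharStripper {
    private NonCharStripper() {}

    /**
     * Returns true for the 66 Unicode noncharacters:
     * U+FDD0..U+FDEF, plus the last two code points of every plane
     * (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ... U+10FFFE, U+10FFFF).
     */
    static boolean isNonCharacter(int cp) {
        return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
    }

    /** Copies the input, dropping every noncharacter code point. */
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Read whole code points so supplementary-plane
            // noncharacters (encoded as surrogate pairs) are caught too.
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (!isNonCharacter(cp)) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

A caller would run each string field through NonCharStripper.strip(value) before adding it to the document. Unpaired surrogates and control characters are left untouched in this sketch.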

On Wednesday 14 December 2011 15:12:23 Lewis John Mcgibbney wrote:
> Hi Remi,
>
> This is a compatibility issue with conflicting versions of Solrj [1]
>
> [1]
> http://lucene.472066.n3.nabble.com/Invalid-version-or-the-data-in-not-in-javabin-format-td1460495.html
>
> On Wed, Dec 14, 2011 at 1:57 PM, remi tassing <[email protected]> wrote:
> > Hello guys,
> >
> > After crawling with Nutch I tried pushing the index to Solr but it
> > doesn't work.
> >
> > I'm using Nutch-1.2. Solr-3.4 & 3.5 don't work but 1.4 works well!
> >
> > $ bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
> > SolrIndexer: starting at 2011-12-14 15:36:15
> > java.io.IOException: Job failed!
> >
> > This is my nutch log:
> > ...
> > 011-12-14 15:37:36,762 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: content dest: content
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: site dest: site
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: title dest: title
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: host dest: host
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: segment dest: segment
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: boost dest: boost
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: digest dest: digest
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: url dest: id
> > 2011-12-14 15:37:36,810 INFO solr.SolrMappingReader - source: url dest: url
> > 2011-12-14 15:37:37,454 WARN mapred.LocalJobRunner - job_local_0001
> > java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
> >     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
> >     at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
> >     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
> >     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
> >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> >     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
> >     at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
> >     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
> >     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
> >     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
> >     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
> >     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
> >     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
> >     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> > 2011-12-14 15:37:38,160 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
> >
> > Remi

--
Markus Jelsma - CTO - Openindex

