Hello Julien,
I picked up today's build from your URL, but the problem persists as reported earlier. Any more ideas on how to tackle this?

Cheers,

On Monday 17 May 2010 15:50:55 Julien Nioche wrote:
> Hi Markus,
>
> This was solved last week and is in the trunk of the SVN repository.
> The nightly build has just been fixed after the move to the TLP, so the
> version you are using does not have the fix yet. Check
> http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ to get the latest
> build or check it out from SVN.
>
> J.
>
> > Hi,
> >
> > I've got a copy of the nutch-2010-05-11_04-34-41 nightly build because I
> > need Tika to parse JPEG images, and that would be in 1.1 as I read
> > somewhere [1].
> >
> > First I fetch only a single HTML page and send it to Solr as I did with
> > 1.0, but it fails now. Here's what Solr thinks of the request:
> >
> > ---------------
> > May 17, 2010 2:25:32 PM org.apache.solr.common.SolrException log
> > SEVERE: org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued copy field id: <URL HERE>
> >     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:260)
> >     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> >     at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:94)
> >     at org.apache.solr.update.processor.SignatureUpdateProcessorFactory$SignatureUpdateProcessor.processAdd(SignatureUpdateProcessorFactory.java:162)
> >     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
> >     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
> >     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> >     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> >     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> >     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> >     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> >     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> >     at java.lang.Thread.run(Thread.java:619)
> > ---------------
> >
> > Well, this obviously is wrong. Although I am still using the old 1.0
> > schema.xml, the id field isn't multiValued in the nightly build's
> > schema.xml file either.
> >
> > Below are Nutch's relevant log lines:
> >
> > ---------------
> > 2010-05-17 14:25:31,776 INFO solr.SolrMappingReader - source: content dest: content
> > 2010-05-17 14:25:31,776 INFO solr.SolrMappingReader - source: site dest: site
> > 2010-05-17 14:25:31,776 INFO solr.SolrMappingReader - source: title dest: title
> > 2010-05-17 14:25:31,776 INFO solr.SolrMappingReader - source: host dest: host
> > 2010-05-17 14:25:31,776 INFO solr.SolrMappingReader - source: segment dest: segment
> > 2010-05-17 14:25:31,777 INFO solr.SolrMappingReader - source: boost dest: boost
> > 2010-05-17 14:25:31,777 INFO solr.SolrMappingReader - source: digest dest: digest
> > 2010-05-17 14:25:31,777 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> > 2010-05-17 14:25:31,777 INFO solr.SolrMappingReader - source: url dest: id
> > 2010-05-17 14:25:31,777 INFO solr.SolrMappingReader - source: url dest: url
> > 2010-05-17 14:25:31,821 INFO collection.CollectionManager - Instantiating CollectionManager
> > 2010-05-17 14:25:31,822 INFO collection.CollectionManager - initializing CollectionManager
> > 2010-05-17 14:25:31,849 INFO collection.CollectionManager - file has1 elements
> > 2010-05-17 14:25:32,474 WARN mapred.LocalJobRunner - job_local_0001
> > org.apache.solr.common.SolrException: Bad Request
> >
> > Bad Request
> >
> > request: http://127.0.0.1:8080/solr/update?wt=javabin&version=1
> >     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
> >     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
> >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> >     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
> >     at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:74)
> >     at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
> >     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
> >     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> > ---------------
> >
> > Because I still use my old 1.0 configuration files, I get the following
> > warning from Nutch, but it doesn't look related to the Solr integration:
> >
> > ---------------
> > 2010-05-17 14:34:11,529 WARN conf.Configuration - DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> > ---------------
> >
> > Did I just stumble upon a regression in 1.1dev and should I file a bug,
> > or could something else be spoiling the fun?
> >
> > [1]: http://lucene.472066.n3.nabble.com/Adding-jpeg-parser-to-nutch-td710135.html
> >
> > Cheers,
> >
> > Markus Jelsma - Technisch Architect - Buyways BV
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350
>

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

