Hello Julien,

I picked up today's build from your URL, but the problem persists as
reported earlier. Any more ideas on how to tackle this?


Cheers,

On Monday 17 May 2010 15:50:55 Julien Nioche wrote:
> Hi Markus,
> 
> This was solved last week and the fix is in the trunk of the SVN repository.
> The nightly build has only just been fixed after the move to the TLP, so the
> version you are using does not have the fix yet. Check
> http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ to get the latest
> build, or check it out from SVN.
> 
> J.
> 
> > Hi,
> >
> >
> > I've got a copy of the nutch-2010-05-11_04-34-41 nightly build because I
> > need Tika to parse JPEG images, and that would be in 1.1 as I read
> > somewhere [1].
> >
> > First I fetch only a single HTML page and send it to Solr as I did with
> > 1.0, but it fails now. Here's what Solr thinks of the request:
> >
> >
> > ---------------
> > May 17, 2010 2:25:32 PM org.apache.solr.common.SolrException log
> > SEVERE: org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued copy field id: <URL HERE>
> >        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:260)
> >        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> >        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:94)
> >        at org.apache.solr.update.processor.SignatureUpdateProcessorFactory$SignatureUpdateProcessor.processAdd(SignatureUpdateProcessorFactory.java:162)
> >        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
> >        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
> >        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
> >        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> >        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> >        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> >        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> >        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> >        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> >        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> >        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> >        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> >        at java.lang.Thread.run(Thread.java:619)
> > ---------------
> >
> >
> > Well, this is obviously wrong. Although I am still using the old 1.0
> > schema.xml, the id field isn't multiValued in the nightly build's
> > schema.xml file either.
> >
> > Below are Nutch's relevant log lines:
> >
> >
> > ---------------
> > 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: content dest: content
> > 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: site dest: site
> > 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: title dest: title
> > 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: host dest: host
> > 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: segment dest: segment
> > 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: boost dest: boost
> > 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: digest dest: digest
> > 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
> > 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: url dest: id
> > 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: url dest: url
> > 2010-05-17 14:25:31,821 INFO  collection.CollectionManager - Instantiating CollectionManager
> > 2010-05-17 14:25:31,822 INFO  collection.CollectionManager - initializing CollectionManager
> > 2010-05-17 14:25:31,849 INFO  collection.CollectionManager - file has1 elements
> > 2010-05-17 14:25:32,474 WARN  mapred.LocalJobRunner - job_local_0001
> > org.apache.solr.common.SolrException: Bad Request
> >
> > Bad Request
> >
> > request: http://127.0.0.1:8080/solr/update?wt=javabin&version=1
> >        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
> >        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
> >        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> >        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
> >        at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:74)
> >        at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
> >        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
> >        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> >        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> > ---------------
> >
> > Because I still use my old 1.0 configuration files, I get the following
> > warning from Nutch, but it doesn't look like it's related to the Solr
> > integration:
> >
> > ---------------
> > 2010-05-17 14:34:11,529 WARN  conf.Configuration - DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> > ---------------
> >
> > Did I just stumble upon a regression in 1.1dev that I should file a bug
> > for, or could something else be spoiling the fun?
> >
> >
> >
> > [1]: http://lucene.472066.n3.nabble.com/Adding-jpeg-parser-to-nutch-td710135.html
> >
> > Cheers,
> >
> > Markus Jelsma - Technisch Architect - Buyways BV
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350
> 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
