Hi Markus,

This was solved last week and the fix is in the trunk of the SVN repository.
The nightly build has only just been fixed after the move to the TLP, so the
version you are using does not have the fix yet. Check
http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ to get the latest
build, or check it out from SVN.

J.
-- 
DigitalPebble Ltd
http://www.digitalpebble.com

On 17 May 2010 14:26, Markus Jelsma <[email protected]> wrote:

> Hi,
>
>
> I've got a copy of the nutch-2010-05-11_04-34-41 nightly build because I need
> Tika to parse JPEG images, and that would be in 1.1 as I read somewhere [1].
>
> First I fetch only a single HTML page and send it to Solr as I did with 1.0,
> but it fails now. Here's what Solr thinks of the request:
>
>
> ---------------
> May 17, 2010 2:25:32 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued copy field id: <URL HERE>
>        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:260)
>        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:94)
>        at org.apache.solr.update.processor.SignatureUpdateProcessorFactory$SignatureUpdateProcessor.processAdd(SignatureUpdateProcessorFactory.java:162)
>        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
>        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>        at java.lang.Thread.run(Thread.java:619)
> ---------------
>
>
> Well, this is obviously wrong. Although I am still using the old 1.0
> schema.xml, the id field isn't multiValued in the nightly build's schema.xml
> file either.
>
> Below are Nutch's relevant log lines:
>
>
> ---------------
> 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: content dest: content
> 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: site dest: site
> 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: title dest: title
> 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: host dest: host
> 2010-05-17 14:25:31,776 INFO  solr.SolrMappingReader - source: segment dest: segment
> 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: boost dest: boost
> 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: digest dest: digest
> 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
> 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: url dest: id
> 2010-05-17 14:25:31,777 INFO  solr.SolrMappingReader - source: url dest: url
> 2010-05-17 14:25:31,821 INFO  collection.CollectionManager - Instantiating CollectionManager
> 2010-05-17 14:25:31,822 INFO  collection.CollectionManager - initializing CollectionManager
> 2010-05-17 14:25:31,849 INFO  collection.CollectionManager - file has1 elements
> 2010-05-17 14:25:32,474 WARN  mapred.LocalJobRunner - job_local_0001
> org.apache.solr.common.SolrException: Bad Request
>
> Bad Request
>
> request: http://127.0.0.1:8080/solr/update?wt=javabin&version=1
>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>        at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:74)
>        at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
>        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> ---------------
>
> Because I still use my old 1.0 configuration files, I get the following
> warning from Nutch, but it doesn't look like it's related to the Solr
> integration:
>
> ---------------
> 2010-05-17 14:34:11,529 WARN  conf.Configuration - DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> ---------------
>
> Did I just stumble upon a regression in 1.1dev, and should I file a bug, or
> could something else be spoiling the fun?
>
>
>
> [1]: http://lucene.472066.n3.nabble.com/Adding-jpeg-parser-to-nutch-td710135.html
>
> Cheers,
>
> Markus Jelsma - Technisch Architect - Buyways BV
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
>
>
