Hi - if you check Solr's logs, you'll see the underlying cause: a problem in 
Lucene docvalues that was fixed in 5.5.
M.
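
A minimal sketch of one way to pull that underlying error out of the Solr log, assuming the log is a plain-text file containing ordinary Java stack traces. The "Caused by:" lines of a trace name the exception behind the generic "possible analysis error" message, so counting them shows what Solr is actually complaining about. The function name root_causes and the file name solr.log below are illustrative only; they are not part of Solr or Nutch.

```python
import re
from collections import Counter

# Matches the "Caused by: <exception class>" part of a Java stack trace line.
# The class name may contain dots and "$" (for nested classes).
CAUSED_BY = re.compile(r"Caused by: ([\w.$]+)")


def root_causes(log_lines):
    """Count the exception classes named in "Caused by:" lines.

    log_lines is any iterable of strings, for example an open log file.
    """
    counts = Counter()
    for line in log_lines:
        match = CAUSED_BY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Usage, with a hypothetical log location: `with open("solr.log", errors="replace") as f: print(root_causes(f).most_common())`. The most frequent class is usually the place to start reading the full trace.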

 
 
-----Original message-----
> From:Kshitij Shukla <[email protected]>
> Sent: Tuesday 9th February 2016 8:00
> To: [email protected]
> Subject: [CIS-CMMI-3] Unable to index id ... possible analysis error
> 
> Hello everyone,
> 
> I have added a set of seeds to crawl using this command:
> 
> ./bin/crawl /largeSeeds 1 http://localhost:8983/solr/ddcd 4
> 
> I encountered an error in the index phase, which says:
> 
> "Error: 
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> Exception writing document id 
> com.angelfire.www:http/rock/babesintoyland/ to the index; possible 
> analysis error."
> 
> In the error (Exception writing document id $ID to the index; 
> possible analysis error.), $ID is one of the following:
> 
> com.angelfire.www:http/rock/babesintoyland/
> com.classicbands.www:http/steppenwolf.html
> net.classiccat.www:http/albeniz_i/
> com.all-reviews.www:http/videos-2/multiplicity.htm
> com.donnathebuffalo.www:http/
> com.musicbizacademy.www:http/
> com.allrightnow.www:http/fws/
> com.blinddivine.www:http/
> edu.mit.shakespeare:http/tempest/full.html
> com.collegetermpapers.www:http/TermPapers/Music/Shostokovich.shtml
> 1.75.62.198:http/www1/sistine/0-Tour.html
> com.musicbizacademy.www:http/
> com.musicbizacademy.www:http/
> com.blinddivine.www:http/
> edu.mit.shakespeare:http/tempest/full.html
> com.collegetermpapers.www:http/TermPapers/Music/Shostokovich.shtml
> 
> The full stack trace of the error is as follows:
> ****************************LOG START**********************
> 16/02/08 20:54:51 INFO mapreduce.Job: Task Id : 
> attempt_1454932871058_0013_m_000003_2, Status : FAILED
> Error: 
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> Exception writing document id 
> com.collegetermpapers.www:http/TermPapers/Music/Shostokovich.shtml to 
> the index; possible analysis error.
>      at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:491)
>      at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>      at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>      at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
>      at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
>      at 
> org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:84)
>      at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:84)
>      at 
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:48)
>      at 
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:43)
>      at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
>      at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>      at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>      at 
> org.apache.nutch.indexer.IndexingJob$IndexerMapper.map(IndexingJob.java:120)
>      at 
> org.apache.nutch.indexer.IndexingJob$IndexerMapper.map(IndexingJob.java:69)
>      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>      at java.security.AccessController.doPrivileged(Native Method)
>      at javax.security.auth.Subject.doAs(Subject.java:422)
>      at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> 
> 16/02/08 20:54:53 INFO mapreduce.Job: Task Id : 
> attempt_1454932871058_0013_m_000000_2, Status : FAILED
> Error: 
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> Exception writing document id 1.75.62.198:http/www1/sistine/0-Tour.html 
> to the index; possible analysis error.
>      at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:491)
>      at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>      at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>      at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
>      at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
>      at 
> org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:84)
>      at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:84)
>      at 
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:48)
>      at 
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:43)
>      at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
>      at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>      at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>      at 
> org.apache.nutch.indexer.IndexingJob$IndexerMapper.map(IndexingJob.java:120)
>      at 
> org.apache.nutch.indexer.IndexingJob$IndexerMapper.map(IndexingJob.java:69)
>      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>      at java.security.AccessController.doPrivileged(Native Method)
>      at javax.security.auth.Subject.doAs(Subject.java:422)
>      at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> 
> 16/02/08 20:54:55 INFO mapreduce.Job:  map 100% reduce 0%
> 16/02/08 20:54:55 INFO mapreduce.Job: Job job_1454932871058_0013 failed 
> with state FAILED due to: Task failed task_1454932871058_0013_m_000004
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 
> 16/02/08 20:54:55 INFO mapreduce.Job: Counters: 14
>      Job Counters
>          Failed map tasks=19
>          Killed map tasks=4
>          Launched map tasks=23
>          Other local map tasks=17
>          Data-local map tasks=2
>          Rack-local map tasks=4
>          Total time spent by all maps in occupied slots (ms)=2182584
>          Total time spent by all reduces in occupied slots (ms)=0
>          Total time spent by all map tasks (ms)=1091292
>          Total vcore-seconds taken by all map tasks=1091292
>          Total megabyte-seconds taken by all map tasks=4469932032
>      Map-Reduce Framework
>          CPU time spent (ms)=0
>          Physical memory (bytes) snapshot=0
>          Virtual memory (bytes) snapshot=0
> 16/02/08 20:54:55 ERROR indexer.IndexingJob: SolrIndexerJob: 
> java.lang.RuntimeException: job failed: name=[1]Indexer, 
> jobid=job_1454932871058_0013
>      at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
>      at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:154)
>      at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:176)
>      at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:202)
>      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>      at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:211)
>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>      at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      at java.lang.reflect.Method.invoke(Method.java:483)
>      at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> 
> Error running:
>    /home/c1/apache-nutch-2.3.1/runtime/deploy/bin/nutch index -D 
> mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D 
> mapred.reduce.tasks.speculative.execution=false -D 
> mapred.map.tasks.speculative.execution=false -D 
> mapred.compress.map.output=true -D 
> solr.server.url=http://ns613.mycyberhosting.com:8983/solr/ddcds -all 
> -crawlId 1
> Failed with exit value 255.
> ****************************LOG END**********************
> 
> Please advise.
> BR
> 
