Markus Jelsma created NUTCH-1601:
------------------------------------
Summary: ElasticSearchIndexer fails to properly delete documents
Key: NUTCH-1601
URL: https://issues.apache.org/jira/browse/NUTCH-1601
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 1.7
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Fix For: 1.8
An exception is thrown because the indexer does not set the index and type
on delete requests. The behavior comes from the original source, so 2.x may
be affected as well.
{code}
java.io.IOException
	at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.makeIOException(ElasticIndexWriter.java:173)
	at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.delete(ElasticIndexWriter.java:168)
	at org.apache.nutch.indexer.IndexWriters.delete(IndexWriters.java:108)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:52)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
	at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:458)
	at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:500)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:203)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:53)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Caused by: org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: index is missing;2: type is missing;
	at org.elasticsearch.action.ValidateActions.addValidationError(ValidateActions.java:29)
	at org.elasticsearch.action.support.replication.ShardReplicationOperationRequest.validate(ShardReplicationOperationRequest.java:126)
	at org.elasticsearch.action.delete.DeleteRequest.validate(DeleteRequest.java:84)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:55)
	at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:83)
	at org.elasticsearch.client.support.AbstractClient.delete(AbstractClient.java:121)
	at org.elasticsearch.action.delete.DeleteRequestBuilder.doExecute(DeleteRequestBuilder.java:147)
	at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:53)
	at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:47)
	at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.delete(ElasticIndexWriter.java:165)
	... 10 more
2013-07-03 11:43:39,957 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
	at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
	at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
{code}
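To illustrate the validation failure in the trace, here is a minimal, self-contained sketch. It is not the Nutch or Elasticsearch code: `DeleteCommand`, `deleteWithIdOnly` and `deleteWithTarget` are hypothetical stand-ins, and the index name "nutch" and type "doc" are placeholder values. The sketch assumes the failing path built a delete with only the document id, which would match the two messages reported ("index is missing", "type is missing"). In the real writer the fix would presumably pass the index and type when building the request, for example through the client's `prepareDelete(index, type, id)`; that is an assumption, since this issue does not include the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteValidationSketch {

    /** Hypothetical stand-in for a delete request; NOT the Elasticsearch class. */
    static final class DeleteCommand {
        final String index;
        final String type;
        final String id;

        DeleteCommand(String index, String type, String id) {
            this.index = index;
            this.type = type;
            this.id = id;
        }

        /** Collects one message per missing field, similar in spirit to the trace. */
        List<String> validate() {
            List<String> errors = new ArrayList<>();
            if (index == null) {
                errors.add("index is missing");
            }
            if (type == null) {
                errors.add("type is missing");
            }
            if (id == null) {
                errors.add("id is missing");
            }
            return errors;
        }
    }

    /** Assumed shape of the buggy path: only the document id is set. */
    static DeleteCommand deleteWithIdOnly(String id) {
        return new DeleteCommand(null, null, id);
    }

    /** Assumed shape of the fixed path: index and type are set alongside the id. */
    static DeleteCommand deleteWithTarget(String index, String type, String id) {
        return new DeleteCommand(index, type, id);
    }

    public static void main(String[] args) {
        // Placeholder URL used as the document id.
        String id = "http://example.com/";

        // prints [index is missing, type is missing]
        System.out.println(deleteWithIdOnly(id).validate());

        // prints []
        System.out.println(deleteWithTarget("nutch", "doc", id).validate());
    }
}
```

The point of the sketch is only that a delete needs the same index and type that were used when the document was added; an id alone does not identify the target.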