[https://issues.apache.org/jira/browse/HDFS-15315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096832#comment-17096832]
Anshuman Singh commented on HDFS-15315:
---------------------------------------
The cluster has 3 datanodes and 1 namenode, and all nodes remain alive. The
problem is always reproducible: when I index 100 documents per second into
Solr, the first 4-5 batches work fine and then indexing starts failing. It
also reproduces if I start by indexing 10,000 documents and committing them
at once.
*These are the logs on the HDFS namenode:*
2020-04-30 21:38:10,081 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:10,485 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:11,287 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:12,890 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:16,094 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:22,497 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
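The backoff between these repeated namenode lines matches the HDFS client polling completeFile() on close. If this has the same root cause as HDFS-11486, one mitigation commonly suggested for this error (a workaround sketch only, not a fix for the underlying EC issue) is to raise the client-side retry count for completing the last block, via hdfs-site.xml on the client (Solr) side:

{code:xml}
<!-- hdfs-site.xml on the HDFS client (Solr) side.
     Default is 5; each retry waits longer before giving up with
     "Unable to close file because the last block does not have
     enough number of replicas." The value 10 here is illustrative. -->
<property>
  <name>dfs.client.block.write.locateFollowingBlock.retries</name>
  <value>10</value>
</property>
{code}

This only buys the namenode more time to see the block as COMPLETE; it does not address why the EC block group stays COMMITTED.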
*Solr stack trace:*
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:670)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:684)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1599)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1594)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:982)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:971)
    at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:348)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:284)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:234)
    ... 68 more
Caused by: java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2519)
    at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2480)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2445)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
    at org.apache.lucene.store.OutputStreamIndexOutput.close(OutputStreamIndexOutput.java:70)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:76)
    at org.apache.lucene.codecs.lucene70.Lucene70DocValuesConsumer.close(Lucene70DocValuesConsumer.java:97)
    at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$ConsumerAndSuffix.close(PerFieldDocValuesFormat.java:92)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.close(PerFieldDocValuesFormat.java:234)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:76)
    at org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:266)
    at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:133)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:470)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:554)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:719)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3204)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3449)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3414)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:701)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:93)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1929)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1905)
    at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:160)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:281)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:188)
    ... 42 more
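For completeness, the EC setup on the affected directory can be sketched with the stock {{hdfs ec}} CLI (available on Hadoop 3.1.1; the directory path is illustrative):

{code}
# Enable the built-in XOR-2-1-1024k policy cluster-wide, then apply it
# to the directory that Solr writes its index into, and verify it.
hdfs ec -enablePolicy -policy XOR-2-1-1024k
hdfs ec -setPolicy -path /XOR21EC -policy XOR-2-1-1024k
hdfs ec -getPolicy -path /XOR21EC
{code}

With XOR-2-1 (2 data + 1 parity cells), every block group needs all 3 datanodes, so a single slow or busy datanode can leave the last block group COMMITTED at close time.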
> IOException on close() when using Erasure Coding
> ------------------------------------------------
>
> Key: HDFS-15315
> URL: https://issues.apache.org/jira/browse/HDFS-15315
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.1.1
> Environment: XOR-2-1-1024k policy on hadoop 3.1.1 with 3 datanodes
> Reporter: Anshuman Singh
> Priority: Major
>
> When an Erasure Coding policy is applied to a directory, the replication
> factor is reported as 1. Solr then fails to index documents with the error
> _java.io.IOException: Unable to close file because the last block does not
> have enough number of replicas._ Indexing works fine without EC (with
> replication factor 3). This appears identical to
> [HDFS-11486|https://issues.apache.org/jira/browse/HDFS-11486].