[ https://issues.apache.org/jira/browse/HDFS-15315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096832#comment-17096832 ]
Anshuman Singh commented on HDFS-15315:
---------------------------------------

There are 3 datanodes and 1 namenode, and all of the nodes remain alive. The issue is always reproducible: when I start indexing 100 documents per second into Solr, it works fine for 4-5 batches and then starts failing. It also reproduces every time if I start with 10000 documents committed at once.

*These are the logs on the HDFS namenode:*

2020-04-30 21:38:10,081 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:10,485 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:11,287 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:12,890 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:16,094 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd
2020-04-30 21:38:22,497 INFO namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_-9223372036854775584_1014 is COMMITTED but not COMPLETE(numNodes= 3 >= minimum = 2) in file /XOR21EC/dummy/core_node2/data/index/_0_Lucene70_0.dvd

*Solr stack trace:*

Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:670)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:684)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1599)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1594)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:982)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:971)
    at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:348)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:284)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:234)
    ... 68 more
Caused by: java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2519)
    at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2480)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2445)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
    at org.apache.lucene.store.OutputStreamIndexOutput.close(OutputStreamIndexOutput.java:70)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:76)
    at org.apache.lucene.codecs.lucene70.Lucene70DocValuesConsumer.close(Lucene70DocValuesConsumer.java:97)
    at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$ConsumerAndSuffix.close(PerFieldDocValuesFormat.java:92)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.close(PerFieldDocValuesFormat.java:234)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:88)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:76)
    at org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:266)
    at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:133)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:470)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:554)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:719)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3204)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3449)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3414)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:701)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:93)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1929)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1905)
    at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:160)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:281)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:188)
    ... 42 more
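For anyone trying to reproduce this outside Solr, here is a minimal standalone sketch (not from the original report) that exercises the same create/write/close path against an EC directory. The policy name XOR-2-1-1024k matches the environment above; the path /XOR21EC/repro, the file count, and the payload are illustrative assumptions:

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class EcCloseRepro {
  public static void main(String[] args) throws Exception {
    // Picks up fs.defaultFS etc. from core-site.xml / hdfs-site.xml on the classpath.
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    Path dir = new Path("/XOR21EC/repro"); // hypothetical test directory
    dfs.mkdirs(dir);
    // Same policy as the reported environment: XOR-2-1-1024k on 3 datanodes.
    dfs.setErasureCodingPolicy(dir, "XOR-2-1-1024k");

    byte[] payload = "dummy index data\n".getBytes(StandardCharsets.UTF_8);
    for (int i = 0; i < 10000; i++) {
      Path file = new Path(dir, "file-" + i);
      try (FSDataOutputStream out = dfs.create(file)) {
        out.write(payload);
      } // close() is where "Unable to close file because the last block
        // does not have enough number of replicas" surfaces
    }
    dfs.close();
  }
}
{code}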
> IOException on close() when using Erasure Coding
> ------------------------------------------------
>
>                 Key: HDFS-15315
>                 URL: https://issues.apache.org/jira/browse/HDFS-15315
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: 3.1.1, hdfs
>    Affects Versions: 3.1.1
>         Environment: XOR-2-1-1024k policy on hadoop 3.1.1 with 3 datanodes
>            Reporter: Anshuman Singh
>            Priority: Major
>
> When using an Erasure Coding policy on a directory, the replication factor is
> set to 1. Solr fails to index documents with the error _java.io.IOException:
> Unable to close file because the last block does not have enough number of
> replicas._ It works fine without EC (with a replication factor of 3). It seems
> to be identical to this issue: https://issues.apache.org/jira/browse/HDFS-11486
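If this is indeed the same client-side timing issue as HDFS-11486, the mitigation discussed there is to let the client retry completeFile() longer before giving up. A hedged sketch: dfs.client.block.write.locateFollowingBlock.retries is the standard HDFS client setting that bounds these retries, but whether raising it actually resolves the EC case is exactly what this issue needs to establish; the value 10 is only an illustrative guess:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class EcCloseWorkaround {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // completeFile() retries back off with a doubling delay, which matches the
    // ~0.4s / 0.8s / 1.6s / 3.2s / 6.4s gaps between the namenode log lines
    // above. Raising the retry count gives close() more time for the last
    // block group to move from COMMITTED to COMPLETE.
    conf.setInt("dfs.client.block.write.locateFollowingBlock.retries", 10); // default: 5
    try (FileSystem fs = FileSystem.get(conf)) {
      // ... run the write workload with this client configuration
    }
  }
}
{code}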