[ https://issues.apache.org/jira/browse/HDFS-16538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takanobu Asanuma resolved HDFS-16538.
-------------------------------------
    Fix Version/s: 3.4.0
                   3.2.4
                   3.3.4
         Assignee: qinyuren
       Resolution: Fixed

> EC decoding failed due to not enough valid inputs
> --------------------------------------------------
>
>                 Key: HDFS-16538
>                 URL: https://issues.apache.org/jira/browse/HDFS-16538
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: qinyuren
>            Assignee: qinyuren
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.2.4, 3.3.4
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We found this error when more than one block read fails in #StripeReader.readStripe(). Our cluster uses the EC policy ec(6+3).
> {code:java}
> Caused by: org.apache.hadoop.HadoopIllegalArgumentException: No enough valid inputs are provided, not recoverable
> 	at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkInputBuffers(ByteBufferDecodingState.java:119)
> 	at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.<init>(ByteBufferDecodingState.java:47)
> 	at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:86)
> 	at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
> 	at org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:462)
> 	at org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
> 	at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:406)
> 	at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:327)
> 	at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:420)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:892)
> 	at java.base/java.io.DataInputStream.read(DataInputStream.java:149)
> 	at java.base/java.io.DataInputStream.read(DataInputStream.java:149)
> {code}
>
> The relevant loop in #StripeReader.readStripe():
> {code:java}
> while (!futures.isEmpty()) {
>   try {
>     StripingChunkReadResult r = StripedBlockUtil
>         .getNextCompletedStripedRead(service, futures, 0);
>     dfsStripedInputStream.updateReadStats(r.getReadStats());
>     DFSClient.LOG.debug("Read task returned: {}, for stripe {}",
>         r, alignedStripe);
>     StripingChunk returnedChunk = alignedStripe.chunks[r.index];
>     Preconditions.checkNotNull(returnedChunk);
>     Preconditions.checkState(returnedChunk.state == StripingChunk.PENDING);
>     if (r.state == StripingChunkReadResult.SUCCESSFUL) {
>       returnedChunk.state = StripingChunk.FETCHED;
>       alignedStripe.fetchedChunksNum++;
>       updateState4SuccessRead(r);
>       if (alignedStripe.fetchedChunksNum == dataBlkNum) {
>         clearFutures();
>         break;
>       }
>     } else {
>       returnedChunk.state = StripingChunk.MISSING;
>       // close the corresponding reader
>       dfsStripedInputStream.closeReader(readerInfos[r.index]);
>       final int missing = alignedStripe.missingChunksNum;
>       alignedStripe.missingChunksNum++;
>       checkMissingBlocks();
>       readDataForDecoding();
>       readParityChunks(alignedStripe.missingChunksNum - missing);
>     }
> {code}
> This error can be triggered by #StatefulStripeReader.decode. The reason is:
> # If more than one *data block* read fails, #readDataForDecoding is called multiple times (once per failure, in the else branch above);
> # each call re-initializes the *decodeInputs array*;
> # the *parity data* previously filled into the *decodeInputs array* by #readParityChunks is thereby reset to null, so the decoder no longer has enough valid inputs. See the sketch below.
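>
> The following is a minimal, self-contained sketch of the failure mode, not the actual org.apache.hadoop.hdfs.StripeReader code: the class name DecodeInputsSketch, the method bodies, and the main() driver are illustrative stand-ins; only the names decodeInputs, readDataForDecoding and readParityChunks come from the report above.
> {code:java}
> /** Illustrative sketch only; not the real StripeReader implementation. */
> public class DecodeInputsSketch {
>   private static final int DATA_BLK_NUM = 6;   // ec(6+3): 6 data units
>   private static final int PARITY_BLK_NUM = 3; // ec(6+3): 3 parity units
>
>   private byte[][] decodeInputs;
>
>   // Called once per failed data-block read.
>   void readDataForDecoding() {
>     // BUG: unconditionally re-allocating the array discards any parity
>     // buffers that readParityChunks() already placed in it.
>     decodeInputs = new byte[DATA_BLK_NUM + PARITY_BLK_NUM][];
>     // ... fill the data positions that were read successfully ...
>   }
>
>   // Fills parity positions of decodeInputs.
>   void readParityChunks(int num) {
>     for (int i = 0; i < num; i++) {
>       decodeInputs[DATA_BLK_NUM + i] = new byte[64 * 1024];
>     }
>   }
>
>   public static void main(String[] args) {
>     DecodeInputsSketch s = new DecodeInputsSketch();
>     s.readDataForDecoding();   // first data block fails
>     s.readParityChunks(1);     // one parity chunk fetched for decoding
>     s.readDataForDecoding();   // second failure: parity buffer is wiped
>     // The decoder would now see a null parity input and fail with
>     // "No enough valid inputs are provided, not recoverable".
>     System.out.println("parity survived: "
>         + (s.decodeInputs[DATA_BLK_NUM] != null)); // prints: false
>   }
> }
> {code}
> One possible guard, under the same simplified model, is to initialize decodeInputs only on the first call, so that later failures do not wipe parity buffers that were already fetched:
> {code:java}
> // Hedged fix sketch: allocate the shared inputs array only on first use.
> void readDataForDecoding() {
>   if (decodeInputs == null) {
>     decodeInputs = new byte[DATA_BLK_NUM + PARITY_BLK_NUM][];
>   }
>   // ... fill the data positions that were read successfully ...
> }
> {code}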