[ 
https://issues.apache.org/jira/browse/HDFS-16544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-16544.
-------------------------------------
    Fix Version/s: 3.4.0
                   3.2.4
                   3.3.4
         Assignee: qinyuren
       Resolution: Fixed

> EC decoding failed due to invalid buffer
> ----------------------------------------
>
>                 Key: HDFS-16544
>                 URL: https://issues.apache.org/jira/browse/HDFS-16544
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: qinyuren
>            Assignee: qinyuren
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.2.4, 3.3.4
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In [HDFS-16538|https://issues.apache.org/jira/browse/HDFS-16538], we 
> found an EC file decoding bug that occurs when more than one data block read fails. 
> We have now found another bug, triggered by #StatefulStripeReader.decode.
> If we read an EC file whose {*}length is more than one stripe{*}, and the file 
> has *one data block* and *the first parity block* corrupted, this error 
> occurs:
> {code:java}
> org.apache.hadoop.HadoopIllegalArgumentException: Invalid buffer found, not allowing null
>     at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkOutputBuffers(ByteBufferDecodingState.java:132)
>     at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.<init>(ByteBufferDecodingState.java:48)
>     at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:86)
>     at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
>     at org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:435)
>     at org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
>     at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:392)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:315)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:408)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:918)
> {code}
>  
> Let's say we use ec(6+3), and data block[0] and the first parity block[6] 
> are corrupted.
>  # The readers for block[0] and block[6] are closed after reading the 
> first stripe of the EC file;
>  # When the client reads the second stripe of the EC file, it triggers 
> #prepareParityChunk for block[6];
>  # decodeInputs[6] is not constructed, because the reader for 
> block[6] was closed.
>  
> {code:java}
> boolean prepareParityChunk(int index) {
>   Preconditions.checkState(index >= dataBlkNum
>       && alignedStripe.chunks[index] == null);
>   if (readerInfos[index] != null && readerInfos[index].shouldSkip) {
>     alignedStripe.chunks[index] = new StripingChunk(StripingChunk.MISSING);
>     // we have failed the block reader before; returning here means
>     // decodeInputs[index] is never constructed and stays null
>     return false;
>   }
>   final int parityIndex = index - dataBlkNum;
>   ByteBuffer buf = dfsStripedInputStream.getParityBuffer().duplicate();
>   buf.position(cellSize * parityIndex);
>   buf.limit(cellSize * parityIndex + (int) alignedStripe.range.spanInBlock);
>   decodeInputs[index] =
>       new ECChunk(buf.slice(), 0, (int) alignedStripe.range.spanInBlock);
>   alignedStripe.chunks[index] =
>       new StripingChunk(decodeInputs[index].getBuffer());
>   return true;
> } {code}
>  
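For context, here is a minimal, self-contained sketch (not the actual Hadoop code; the class and method names below are made up for illustration) of the kind of null-buffer validation the decoder performs. A decode-input slot that was never constructed stays null, and the validation rejects the whole array, matching the "Invalid buffer found, not allowing null" failure above:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the null check performed before decoding:
// any null entry in the buffer array aborts the decode with an exception.
public class NullBufferCheckSketch {
    static void checkBuffers(ByteBuffer[] buffers) {
        for (ByteBuffer buf : buffers) {
            if (buf == null) {
                // Mirrors HadoopIllegalArgumentException("Invalid buffer found, not allowing null")
                throw new IllegalArgumentException("Invalid buffer found, not allowing null");
            }
        }
    }

    public static void main(String[] args) {
        // ec(6+3): 6 data + 3 parity slots. Slot 6 (first parity block) is
        // never filled because its reader was closed after the first stripe.
        ByteBuffer[] decodeInputs = new ByteBuffer[9];
        for (int i = 0; i < decodeInputs.length; i++) {
            if (i == 6) continue; // reader closed, buffer never constructed
            decodeInputs[i] = ByteBuffer.allocate(64);
        }
        boolean rejected = false;
        try {
            checkBuffers(decodeInputs);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println("decode rejected: " + rejected); // prints "decode rejected: true"
    }
}
```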



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
