[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122397#comment-16122397 ]

Andrew Wang commented on HDFS-11882:
------------------------------------

I spent some time digging into this, and I think I understand it better.

The last stripe can be a partial stripe. If the partial stripe happens to have 
enough acked cells, it counts as an acked stripe (i.e., {{numDataBlocks}} 
streamers at that length). The acked stripe count is then multiplied by the 
number of bytes in a full stripe, which can round {{numAckedBytes}} up above 
{{sentBytes}}.
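To make the rounding concrete, here is a small sketch of that arithmetic 
(illustrative names only, not the actual {{DFSStripedOutputStream}} fields), 
assuming RS(6,3) with 1 MiB cells:

```java
// Hypothetical sketch of the acked-bytes rounding described above.
// Assumes RS(6,3) with 1 MiB cells; names are illustrative only.
public class AckedBytesSketch {
    static final int NUM_DATA = 6;
    static final long CELL = 1L << 20;          // 1 MiB per cell
    static final long STRIPE = NUM_DATA * CELL; // data bytes per full stripe

    // If the partial last stripe counts as a whole acked stripe,
    // multiplying by STRIPE rounds the acked length up.
    static long ackedBytes(long ackedStripes) {
        return ackedStripes * STRIPE;
    }

    public static void main(String[] args) {
        long fullStripes = 10;
        long partialCells = 3; // only d1..d3 of the last stripe were written
        long sentBytes = fullStripes * STRIPE + partialCells * CELL;
        // The partial stripe has >= NUM_DATA acked streamers, so it is
        // counted as one more whole stripe:
        long numAckedBytes = ackedBytes(fullStripes + 1);
        // 66 MiB acked overshoots 63 MiB sent, tripping the checkState.
        System.out.println(numAckedBytes > sentBytes); // prints "true"
    }
}
```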

This partial stripe issue only applies to close. IIUC, we pad out the last data 
cell and write all the parity cells. Empty cells are assumed to be zero and 
count toward the minimum durability threshold of {{numDataBlocks}} streamers. 
Besides close, we're always writing full stripes.
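The close-time padding can be sketched like this (an illustrative sketch only, 
not the real {{DFSStripedOutputStream}} logic):

```java
// Illustrative sketch of close-time padding: the tail of the data is
// laid out into cells, the last partial cell is zero-padded, and any
// remaining data cells stay all-zero; parity is then computed over the
// full stripe. Not the actual HDFS implementation.
public class PadSketch {
    static byte[][] padStripe(byte[] tail, int cellSize, int numData) {
        byte[][] cells = new byte[numData][cellSize]; // zero-filled by default
        int cellIndex = 0;
        int off = 0;
        while (off < tail.length) {
            int n = Math.min(cellSize, tail.length - off);
            System.arraycopy(tail, off, cells[cellIndex++], 0, n);
            off += n;
        }
        return cells; // ready for parity encoding over all numData cells
    }

    public static void main(String[] args) {
        // 3 bytes of tail data, 2-byte cells, 6 data cells:
        byte[][] cells = padStripe(new byte[] {1, 2, 3}, 2, 6);
        System.out.println(cells[1][1]); // zero-padded byte, prints "0"
    }
}
```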

To be more concrete, imagine we are doing RS(6,3), and the last stripe looks 
like this:

{noformat}
x = full cell

|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |x |x |  |  |  |x |x |x |
{noformat}

For this partial stripe, 6 cells have been written (three data plus three 
parity), which satisfies the {{numDataBlocks}} threshold.

{noformat}
|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |  |  |  |  |  |x |x |x |
{noformat}

For this partial stripe, only 4 cells have been written (one data plus three 
parity), which fails the {{numDataBlocks}} threshold if we count written cells 
alone.

Also, because five of the data cells are supposed to be empty (and thus assumed 
zero), we only need one written cell to satisfy the durability requirement: 
1 written + 5 empty = 6 = {{numDataBlocks}}. As an example, for a data length 
of one cell, any of these would be fine:

{noformat}
|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |x |  |
{noformat}
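Putting that counting rule together, here is a hypothetical sketch of the 
durability check for the last stripe (assuming RS(6,3); the real check lives 
in {{DFSStripedOutputStream}} and will look different):

```java
// Hypothetical durability check for the last (possibly partial) stripe,
// following the reasoning above. Assumes RS(6,3); not the real HDFS code.
public class LastStripeDurability {
    static final int NUM_DATA = 6; // data cells per stripe in RS(6,3)

    // Naive check: only written (acked) cells count.
    static boolean naiveDurable(int ackedCells) {
        return ackedCells >= NUM_DATA;
    }

    // Corrected check: data cells past the end of the file are assumed
    // zero, so they count toward the threshold for free.
    static boolean durable(int dataCells, int ackedCells) {
        int emptyCells = NUM_DATA - dataCells;
        return ackedCells + emptyCells >= NUM_DATA;
    }

    public static void main(String[] args) {
        System.out.println(naiveDurable(6)); // first diagram: true
        System.out.println(naiveDurable(4)); // second diagram: false
        // With a data length of one cell, five cells are assumed empty,
        // so a single written cell (data or parity) is already enough:
        System.out.println(durable(1, 4));   // second diagram, corrected: true
        System.out.println(durable(1, 1));   // one written cell: true
    }
}
```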

Because this last stripe needs to be handled specially on close, I don't think 
the current proposed patch fully addresses the issue.

We should also try to address this related TODO:

{noformat}
      // TODO we can also succeed if all the failed streamers have not taken
      // the updated block
{noformat}

I'm working on a patch to rework this code, but it's pretty complex, and I 
wanted to post my thinking here first.

> Client fails if acknowledged size is greater than bytes sent
> ------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, 
> HDFS-11882.regressiontest.patch
>
>
> Some erasure coding tests fail with the following exception. The following 
> test was removed by HDFS-11823; however, this type of error can also happen 
> in a real cluster.
> {noformat}
> Running 
> org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure)
>   Time elapsed: 38.831 sec  <<< ERROR!
> java.lang.IllegalStateException: null
>       at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>       at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>       at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>       at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>       at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>       at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
>       at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
