[ 
https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149869#comment-16149869
 ] 

Kai Zheng commented on HDFS-11882:
----------------------------------

Thanks [~andrew.wang] for adding so many comments to the code; they are very 
helpful for understanding the complex logic. Some minor comments below; please 
check whether they make sense.

1. How about {{waitCreatingNewStreams}} => {{waitCreatingStreamers}}, to match 
the existing {{checkStreamerUpdates}}?
2. "Get the acked file length" => "Get the length of the acked bytes in the 
block group"; "A full stripe is acked when at least numDataBlocks streamers 
have that cell" => "... streamers have corresponding cells of the stripe". 
As for "Parity cells are the length of the longest data cells", I didn't quite 
follow it; could you clarify a bit?
{code}
   /**
-   * Get the number of acked stripes. An acked stripe means at least data block
-   * number size cells of the stripe were acked.
+   * Get the acked file length.
+   *
+   * <p>
+   *   A full stripe is acked when at least numDataBlocks streamers have
+   *   that cell, and all previous full stripes are also acked. This enforces
+   *   the constraint that there is at most one partial stripe.
+   * </p>
+   * <p>
+   *   Partial stripes write all parity cells. Empty data cells are not written.
+   *   Parity cells are the length of the longest data cells.
+   *   To be considered acked, a partial stripe needs at least numDataBlocks
+   *   empty or written cells.
+   * </p>
+   * <p>
+   *   Currently, partial stripes can only happen when closing the file at a
+   *   non-stripe boundary, but this could also happen during (currently
+   *   unimplemented) hflush/hsync support.
+   * </p>
    */
-  private long getNumAckedStripes() {
-    int minStripeNum = Integer.MAX_VALUE;
+  private long getAckedLength() {
{code}
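
For illustration, here is a minimal sketch of the rule as the javadoc above describes it. It is not the code from the patch. The class {{AckedLengthSketch}}, both method names, and the per-streamer array of acked byte counts are hypothetical, and it assumes each streamer acks its cells in order. Under that assumption, the number of fully acked stripes follows from the byte count reached by at least numDataBlocks streamers, and a parity cell in a partial stripe is as long as the longest data cell.

```java
import java.util.Arrays;

public class AckedLengthSketch {

  /**
   * Number of full stripes acked by at least numDataBlocks streamers.
   * ackedBytesPerStreamer is a hypothetical input: one acked byte count per
   * streamer (data and parity alike) within the current block group.
   */
  static long ackedFullStripes(long[] ackedBytesPerStreamer,
                               int numDataBlocks, int cellSize) {
    long[] sorted = ackedBytesPerStreamer.clone();
    Arrays.sort(sorted);
    // After an ascending sort, every streamer at or after this index has
    // acked at least this many bytes, and there are numDataBlocks of them.
    long reachedByEnough = sorted[sorted.length - numDataBlocks];
    // Streamers ack cells in order (assumption), so all earlier stripes
    // are covered as well.
    return reachedByEnough / cellSize;
  }

  /** Parity cell length in a partial stripe: the longest data cell. */
  static long parityCellLength(long[] dataCellLengths) {
    long longest = 0;
    for (long len : dataCellLengths) {
      longest = Math.max(longest, len);
    }
    return longest;
  }

  public static void main(String[] args) {
    // Toy layout: 3 data + 2 parity streamers, 4-byte cells.
    long[] acked = {8, 8, 4, 8, 0};
    System.out.println(ackedFullStripes(acked, 3, 4));           // 2
    // Partial stripe with data cells of 4, 2 and 0 (empty) bytes.
    System.out.println(parityCellLength(new long[] {4, 2, 0}));  // 4
  }
}
```

In the toy layout, three of the five streamers have acked 8 bytes, so two full stripes (2 * 3 * 4 = 24 data bytes) count as acked even though one streamer has acked nothing.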

May post more later today.

> Client fails if acknowledged size is greater than bytes sent
> ------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Andrew Wang
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, 
> HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.05.patch, 
> HDFS-11882.regressiontest.patch
>
>
> Some erasure coding tests fail with the following exception. The test below 
> was removed by HDFS-11823; however, this type of error can happen in a 
> real cluster.
> {noformat}
> Running org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure)  Time elapsed: 38.831 sec  <<< ERROR!
> java.lang.IllegalStateException: null
>       at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>       at org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>       at org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>       at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>       at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>       at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>       at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>       at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>       at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
>       at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}


