[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154320#comment-16154320 ]

Andrew Wang commented on HDFS-11882:

Thanks for reviewing Kai! I gave the failed tests a look too and they seem unrelated to this patch. I'm talking with some other HDFS devs here, we're going to put some priority on fixing these since it's really bad right now.

Will commit shortly.

> Client fails if acknowledged size is greater than bytes sent
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding, test
> Reporter: Akira Ajisaka
> Assignee: Andrew Wang
> Priority: Critical
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.05.patch, HDFS-11882.06.patch, HDFS-11882.regressiontest.patch
>
> Some tests of erasure coding fails by the following exception. The following test was removed by HDFS-11823, however, this type of error can happen in real cluster.
> {noformat}
> Running org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure) Time elapsed: 38.831 sec <<< ERROR!
> java.lang.IllegalStateException: null
>   at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151995#comment-16151995 ]

Kai Zheng commented on HDFS-11882:

Thanks Andrew for the update! It LGTM.

I looked at the failed cases and saw quite some not seen elsewhere before, like the one mentioned in HDFS-12388. Not sure where they're from.
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151273#comment-16151273 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 13s | trunk passed |
| +1 | compile | 1m 38s | trunk passed |
| +1 | checkstyle | 0m 43s | trunk passed |
| +1 | mvnsite | 1m 36s | trunk passed |
| +1 | findbugs | 3m 8s | trunk passed |
| +1 | javadoc | 1m 2s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 23s | the patch passed |
| +1 | compile | 1m 30s | the patch passed |
| +1 | javac | 1m 30s | the patch passed |
| +1 | checkstyle | 0m 47s | the patch passed |
| +1 | mvnsite | 1m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 24s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 12s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 101m 53s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 138m 32s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
| | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
| | hadoop.hdfs.TestEncryptedTransfer |
| | hadoop.hdfs.tools.TestStoragePolicyCommands |
| | hadoop.hdfs.TestFileAppendRestart |
| | hadoop.hdfs.tools.TestDFSAdminWithHA |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestReadStripedFileWithDecoding |
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150716#comment-16150716 ]

Kai Zheng commented on HDFS-11882:

Thanks [~andrew.wang] for the explanation. Got it.

I suggest a minor change to the comment text: "Parity cells are of the length of the longest data cells".

Another message to clarify: "Full stripe length can't be greater than file length" => "... greater than the block group length".

In the new function you have two similar {{for}} blocks. It's possible to refactor the code so that both share a single loop, if you save the block numBytes in an array (instead of the ArrayList) and reuse the array.

{code}
+for (int i = 0; i < numAllBlocks; i++) {
+  final StripedDataStreamer streamer = getStripedDataStreamer(i);
+  if (streamer.isHealthy()) {
+    if (streamer.getBlock() != null) {
+      final long numBytes = streamer.getBlock().getNumBytes();
+      if (numBytes == expectedBlockLengths[i]) {
+        numBlocksWithCorrectLength++;
+      }
+    }
+  }
+}
+
{code}
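The array-based refactoring suggested in this comment could look roughly like the following. This is only a minimal sketch and not code from any attached patch: `BlockLengthCheck`, `NO_BLOCK`, and `countBlocksWithLength` are hypothetical names, and it assumes the per-streamer block lengths have already been copied into a plain `long[]`, with a sentinel value for streamers that are unhealthy or have no block.

```java
/** Hypothetical helper for illustration; not part of the HDFS-11882 patch. */
final class BlockLengthCheck {
  /** Assumed sentinel for a streamer that is unhealthy or has no block. */
  static final long NO_BLOCK = -1L;

  private BlockLengthCheck() {
  }

  /**
   * Counts the positions where the actual block length equals the expected
   * one, skipping positions marked with NO_BLOCK. Because the lengths live
   * in an array, the same loop can be reused for every set of expected
   * lengths instead of writing one loop per case.
   */
  static int countBlocksWithLength(long[] actualLengths, long[] expectedLengths) {
    if (actualLengths.length != expectedLengths.length) {
      throw new IllegalArgumentException("length arrays must have the same size");
    }
    int matches = 0;
    for (int i = 0; i < actualLengths.length; i++) {
      if (actualLengths[i] != NO_BLOCK && actualLengths[i] == expectedLengths[i]) {
        matches++;
      }
    }
    return matches;
  }
}
```

The caller would fill `actualLengths` once from the streamers and then call the helper with each candidate `expectedLengths` array.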
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149881#comment-16149881 ]

Andrew Wang commented on HDFS-11882:

Thanks for taking a look Kai, I can update the patch tomorrow once you've had time to fully digest the patch.

bq. About "Parity cells are the length of the longest data cells", didn't quite follow and could you clarify some bit?

When there's a partially written stripe, we might have data lengths [10, 5, 0] and parity lengths [10, 10]. The parity cells are the length of the longest data cell (10). There could be multiple full data cells, so I made it plural.
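The partial-stripe rule described in this comment can be written out as a small calculation. The following is a sketch under stated assumptions rather than the patch's code: `PartialStripeLengths` and both method names are made up for illustration, and the only rule encoded is the one quoted above, that every parity cell is as long as the longest data cell.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the HDFS-11882 patch. */
final class PartialStripeLengths {
  private PartialStripeLengths() {
  }

  /** Length of each parity cell: the length of the longest data cell. */
  static long parityCellLength(long[] dataCellLengths) {
    long longest = 0;
    for (long len : dataCellLengths) {
      longest = Math.max(longest, len);
    }
    return longest;
  }

  /** Data cell lengths followed by numParityBlocks parity cell lengths. */
  static long[] expectedCellLengths(long[] dataCellLengths, int numParityBlocks) {
    int numData = dataCellLengths.length;
    long[] all = Arrays.copyOf(dataCellLengths, numData + numParityBlocks);
    Arrays.fill(all, numData, all.length, parityCellLength(dataCellLengths));
    return all;
  }
}
```

With data lengths [10, 5, 0] and two parity blocks, `expectedCellLengths` returns [10, 5, 0, 10, 10], matching the example in the comment.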
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149869#comment-16149869 ]

Kai Zheng commented on HDFS-11882:

Thanks [~andrew.wang] for adding so many comments in the code, which are very helpful for understanding the complex logic. Some minor comments; please check whether they make sense.

1. How about {{waitCreatingNewStreams}} => {{waitCreatingStreamers}}, like we have checkStreamerUpdates.
2. "Get the acked file length" => "Get the length of the acked bytes in the block group"; "A full stripe is acked when at least numDataBlocks streamers have that cell" => "... streamers have corresponding cells of the stripe". About "Parity cells are the length of the longest data cells": I didn't quite follow, could you clarify a bit?

{code}
   /**
-   * Get the number of acked stripes. An acked stripe means at least data block
-   * number size cells of the stripe were acked.
+   * Get the acked file length.
+   *
+   *
+   * A full stripe is acked when at least numDataBlocks streamers have
+   * that cell, and all previous full stripes are also acked. This enforces
+   * the constraint that there is at most one partial stripe.
+   *
+   *
+   * Partial stripes write all parity cells. Empty data cells are not written.
+   * Parity cells are the length of the longest data cells.
+   * To be considered acked, a partial stripe needs at least numDataBlocks
+   * empty or written cells.
+   *
+   *
+   * Currently, partial stripes can only happen when closing the file at a
+   * non-stripe boundary, but this could also happen during (currently
+   * unimplemented) hflush/hsync support.
+   *
    */
-  private long getNumAckedStripes() {
-    int minStripeNum = Integer.MAX_VALUE;
+  private long getAckedLength() {
{code}

May post more later today.
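The full-stripe rule in the quoted Javadoc ("a full stripe is acked when at least numDataBlocks streamers have that cell, and all previous full stripes are also acked") can be illustrated with a short calculation. This is a simplified sketch, not the implementation of `getAckedLength()` in the patch: `AckedStripeSketch` and its methods are hypothetical, partial stripes are ignored, failed streamers are assumed to be passed in as 0 acked bytes, and each streamer is assumed to ack its cells in order, so holding cell k implies holding every earlier cell.

```java
import java.util.Arrays;

/** Hypothetical sketch for illustration; not the HDFS-11882 implementation. */
final class AckedStripeSketch {
  private AckedStripeSketch() {
  }

  /**
   * Number of full stripes for which at least numDataBlocks streamers have
   * acked the whole cell. Under the in-order assumption this is the
   * numDataBlocks-th largest per-streamer count of fully acked cells.
   */
  static long ackedFullStripes(long[] ackedBytes, int numDataBlocks, long cellSize) {
    if (numDataBlocks < 1 || numDataBlocks > ackedBytes.length || cellSize <= 0) {
      throw new IllegalArgumentException("invalid striping parameters");
    }
    long[] fullCells = new long[ackedBytes.length];
    for (int i = 0; i < ackedBytes.length; i++) {
      fullCells[i] = ackedBytes[i] / cellSize; // whole cells acked by streamer i
    }
    Arrays.sort(fullCells); // ascending order
    return fullCells[fullCells.length - numDataBlocks];
  }

  /** User data bytes covered by the acked full stripes. */
  static long ackedFullStripeLength(long[] ackedBytes, int numDataBlocks, long cellSize) {
    return ackedFullStripes(ackedBytes, numDataBlocks, cellSize) * numDataBlocks * cellSize;
  }
}
```

Taking the k-th largest count, rather than the minimum over all streamers, is what lets the result tolerate streamers that lag or have failed, as long as numDataBlocks of them hold each stripe.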
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148330#comment-16148330 ]

Kai Zheng commented on HDFS-11882:

Travel today. Please expect slow response.
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148329#comment-16148329 ]

Andrew Wang commented on HDFS-11882:

Amazingly, I think these are all flakes.

HDFS-12360 tracks the TestLeaseRecoveryStriped failure. I filed HDFS-12377 to fix the timeouts for TestReadStripedFileWithDecoding.

The various WithFailure tests failed on testBlockTokenExpired, which I can reproduce locally and which is showing up in other precommit runs; it's not the error we're trying to fix here.

TestDataNodeHotswap, TestPread, TestDirectoryScanner, and TestDataNodeVolumeFailureReporting are all existing flaky tests.

I can fix the checkstyle at commit time. Any other review comments?
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146520#comment-16146520 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 33s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 1s | trunk passed |
| +1 | compile | 1m 28s | trunk passed |
| +1 | checkstyle | 0m 42s | trunk passed |
| +1 | mvnsite | 1m 29s | trunk passed |
| +1 | findbugs | 3m 6s | trunk passed |
| +1 | javadoc | 1m 3s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 21s | the patch passed |
| +1 | compile | 1m 26s | the patch passed |
| +1 | javac | 1m 26s | the patch passed |
| -0 | checkstyle | 0m 40s | hadoop-hdfs-project: The patch generated 1 new + 8 unchanged - 0 fixed = 9 total (was 8) |
| +1 | mvnsite | 1m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 15s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 11s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 104m 43s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 139m 34s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
| | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.TestPread |
| | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestReadStripedFileWithDecoding |
| | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11882 |
| JIRA
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134740#comment-16134740 ]

Kai Zheng commented on HDFS-11882:

Calculating the exact number of acked bytes would be perfect. Thanks [~andrew.wang] & [~ajisakaa] for driving this!

{code}
-  private long getNumAckedStripes() {
-    int minStripeNum = Integer.MAX_VALUE;
+  private long getAckedLength() {
{code}

> Client fails if acknowledged size is greater than bytes sent
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding, test
> Reporter: Akira Ajisaka
> Assignee: Andrew Wang
> Priority: Critical
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.regressiontest.patch
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131801#comment-16131801 ]

Akira Ajisaka commented on HDFS-11882:

Umm. The test failure is related:
https://builds.apache.org/job/PreCommit-HDFS-Build/20741/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure150/test1/

Would you check this?

> Client fails if acknowledged size is greater than bytes sent
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding, test
> Reporter: Akira Ajisaka
> Assignee: Andrew Wang
> Priority: Critical
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.regressiontest.patch
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131192#comment-16131192 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 28s | trunk passed |
| +1 | compile | 2m 11s | trunk passed |
| +1 | checkstyle | 0m 47s | trunk passed |
| +1 | mvnsite | 1m 59s | trunk passed |
| +1 | findbugs | 4m 25s | trunk passed |
| +1 | javadoc | 1m 18s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 55s | the patch passed |
| +1 | compile | 2m 7s | the patch passed |
| +1 | javac | 2m 7s | the patch passed |
| +1 | checkstyle | 0m 47s | the patch passed |
| +1 | mvnsite | 1m 59s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 5m 27s | the patch passed |
| +1 | javadoc | 1m 19s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 33s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 105m 58s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 152m 40s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
| | hadoop.hdfs.TestEncryptionZones |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncCheckerTimeout |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| Timed out junit tests | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11882 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12882396/HDFS-11882.04.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 3f87bcd8c838 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / dd7916d |
| Default Java |
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128282#comment-16128282 ]

Akira Ajisaka commented on HDFS-11882:

Thanks [~andrew.wang] for the update! Mostly looks good to me. Minor nit:

{code}
final long partialStripeBlockLength =
    (fullStripeLength / numDataBlocks) + partialCellSize;
{code}

I prefer {{numFullStripes * cellSize}} rather than {{fullStripeLength / numDataBlocks}}.

> Client fails if acknowledged size is greater than bytes sent
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding, test
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Critical
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.regressiontest.patch
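The two expressions in the nit above should give the same block length whenever the full-stripe length is an exact multiple of the stripe width. The sketch below illustrates that under one explicit assumption that is not stated in the thread: that {{fullStripeLength}} equals {{numFullStripes * numDataBlocks * cellSize}}. The class and method names are hypothetical and are not taken from the patch.

```java
/** Hypothetical helper comparing the two ways of writing the block length. */
final class PartialStripeBlockLengthSketch {

  /** Form used in the reviewed snippet: divide the full-stripe length by the data block count. */
  static long viaFullStripeLength(long numFullStripes, int numDataBlocks,
                                  long cellSize, long partialCellSize) {
    // Assumption: fullStripeLength covers only whole stripes of data cells.
    long fullStripeLength = numFullStripes * numDataBlocks * cellSize;
    return (fullStripeLength / numDataBlocks) + partialCellSize;
  }

  /** Form preferred in the review comment: multiply stripes by the cell size directly. */
  static long viaCellSize(long numFullStripes, long cellSize, long partialCellSize) {
    return numFullStripes * cellSize + partialCellSize;
  }

  public static void main(String[] args) {
    long a = viaFullStripeLength(3, 6, 65536L, 100L);
    long b = viaCellSize(3, 65536L, 100L);
    System.out.println(a + " " + b + " equal=" + (a == b));
  }
}
```

Under that assumption the preferred form avoids a division and makes the per-block meaning (full cells plus one partial cell) easier to read.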
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128091#comment-16128091 ]

Andrew Wang commented on HDFS-11882:

I can fix the checkstyle issues in the next patch. Anyone up for a review?

> Client fails if acknowledged size is greater than bytes sent
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding, test
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Critical
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.regressiontest.patch
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122680#comment-16122680 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 35s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 28s | trunk passed |
| +1 | compile | 1m 34s | trunk passed |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 1m 32s | trunk passed |
| -1 | findbugs | 1m 28s | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. |
| -1 | findbugs | 1m 43s | hadoop-hdfs-project/hadoop-hdfs in trunk has 9 extant Findbugs warnings. |
| +1 | javadoc | 1m 2s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 22s | the patch passed |
| +1 | compile | 1m 27s | the patch passed |
| +1 | javac | 1m 27s | the patch passed |
| -0 | checkstyle | 0m 42s | hadoop-hdfs-project: The patch generated 5 new + 8 unchanged - 0 fixed = 13 total (was 8) |
| +1 | mvnsite | 1m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 24s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 13s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 68m 10s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 118m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11882 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12881353/HDFS-11882.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux d4fd13e59a5a 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a32e013 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20651/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| findbugs |
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122621#comment-16122621 ] Kai Zheng commented on HDFS-11882: -- Thanks [~ajisakaa] for the clarifying. I guessed the wrong EC schema :(. bq. If DN9 and 10 are failing, getNumAckedStripes() will return 2, and ackedBytes will become 2 * 10 cells. (The number 10 comes from 10 + 4 schema) The block layouts as follows. So in this case, the returned acked stripes {{2}} is correct, but then the acked bytes calculated as 2 * 10 was wrong, because it didn't reflect the fact that in the 1st stripe actually there were 2 data cells located in D9~10 failed. Am I correct? {noformat} D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13 D14 D D D D D D D D D D P P P P D D D D D D D D P P P P {noformat} If above sounds good, then {{getNumAckedStripes}} should be replaced with something like {{getNumAckedCells}} because the latter will be more accurate. I haven't looked into the provided patches here yet and just wanted to understand the question first: **why acked bytes size is greater than the bytes sent**, like asked by [~zhz] previously. > Client fails if acknowledged size is greater than bytes sent > > > Key: HDFS-11882 > URL: https://issues.apache.org/jira/browse/HDFS-11882 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Critical > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, > HDFS-11882.03.patch, HDFS-11882.regressiontest.patch > > > Some tests of erasure coding fails by the following exception. The following > test was removed by HDFS-11823, however, this type of error can happen in > real cluster. > {noformat} > Running > org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure > Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec > <<< FAILURE! 
- in > org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure > testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure) > Time elapsed: 38.831 sec <<< ERROR! > java.lang.IllegalStateException: null > at > com.google.common.base.Preconditions.checkState(Preconditions.java:129) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
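The block layout in the comment above can be reproduced with a little arithmetic. The following Java sketch is illustrative only and is not taken from any HDFS-11882 patch; the class StripedLayoutSketch and its method names are hypothetical. It assumes cells are placed round-robin across the data blocks and that every stripe, including a trailing partial one, writes one cell to each parity block.

```java
/** Illustrative model of a striped block group layout; not Hadoop code. */
final class StripedLayoutSketch {

  /** Cells on data block `index` (0-based) after `sentCells` cells were written round-robin. */
  static long cellsOnDataBlock(int index, long sentCells, int numDataBlocks) {
    long fullStripes = sentCells / numDataBlocks;
    long cellsInPartialStripe = sentCells % numDataBlocks;
    // Only the first `cellsInPartialStripe` data blocks receive a cell of the partial stripe.
    return fullStripes + (index < cellsInPartialStripe ? 1 : 0);
  }

  /** Cells on each parity block: one per stripe, counting a trailing partial stripe. */
  static long cellsOnParityBlock(long sentCells, int numDataBlocks) {
    return (sentCells + numDataBlocks - 1) / numDataBlocks; // ceiling division
  }

  /** Total cells received by all DNs: the data cells plus the parity cells. */
  static long totalCells(long sentCells, int numDataBlocks, int numParityBlocks) {
    return sentCells + numParityBlocks * cellsOnParityBlock(sentCells, numDataBlocks);
  }

  public static void main(String[] args) {
    // RS(10,4) with 18 data cells sent, as in the layout above.
    for (int i = 0; i < 10; i++) {
      System.out.println("D" + (i + 1) + ": " + cellsOnDataBlock(i, 18, 10));
    }
    System.out.println("each parity block: " + cellsOnParityBlock(18, 10));
    System.out.println("total: " + totalCells(18, 10, 4));
  }
}
```

Under these assumptions, 18 sent cells in a 10 + 4 schema put 2 cells on each of D1~8, 1 cell on each of D9~10, and 2 cells on each parity block, which matches the two rows of the layout.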
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122397#comment-16122397 ]

Andrew Wang commented on HDFS-11882:

I spent some time digging into this, and I think I understand it better.

The last stripe can be a partial stripe. If the partial stripe happens to have enough data cells, it counts as an acked stripe (i.e., {{numDataBlocks}} streamers at that length). The code then multiplies by the number of bytes in a stripe, which can round numAckedBytes up above sentBytes.

This partial stripe issue only applies to close. IIUC, we pad out the last data cell and write all the parity cells. Empty cells are assumed to be zero and count toward the minimum durability threshold of {{numDataBlocks}} streamers. Besides close, we're always writing full stripes.

To be more concrete, imagine we are doing RS(6,3), and the last stripe looks like this:

{noformat}
x = full cell

|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |x |x |  |  |  |x |x |x |
{noformat}

For this partial stripe, 6 cells have data, which satisfies the {{numDataBlocks}} threshold.

{noformat}
|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |  |  |  |  |  |x |x |x |
{noformat}

For this partial stripe, 4 cells have data, which fails the {{numDataBlocks}} threshold.

Also, because there are supposed to be 5 empty cells, we only need one written cell to satisfy the durability requirement. As an example, for a data length of one cell, either of these would be fine:

{noformat}
|d1|d2|d3|d4|d5|d6|p1|p2|p3|
|x |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |x |  |
{noformat}

Because this last stripe needs to be handled specially on close, I don't think the current proposed patch fully addresses the issue. We should also try to address this related TODO:

{noformat}
// TODO we can also succeed if all the failed streamers have not taken
// the updated block
{noformat}

I'm working on a patch to rework this code, but it's pretty complex, and I wanted to post my thinking here first.
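The durability rule described in the comment above can be written down as a small check. This is a minimal sketch of one reading of that comment, not the reworked patch; PartialStripeSketch and both method names are hypothetical, and the rule it encodes (empty data cells count toward the numDataBlocks threshold) is taken only from the prose above.

```java
/** Illustrative model of the close-time check for a partial last stripe; not Hadoop code. */
final class PartialStripeSketch {

  /** Number of data cells in the last stripe that hold data (the last one may be padded). */
  static int nonEmptyDataCells(long lastStripeBytes, int cellSize, int numDataBlocks) {
    long cells = (lastStripeBytes + cellSize - 1) / cellSize; // round up to whole cells
    return (int) Math.min(cells, numDataBlocks);
  }

  /**
   * Empty data cells are implicitly zero, so they count toward the threshold.
   * `ackedWrittenCells` is the number of written cells (data or parity) that were acked.
   */
  static boolean lastStripeIsDurable(int ackedWrittenCells, int nonEmptyDataCells,
                                     int numDataBlocks) {
    int emptyDataCells = numDataBlocks - nonEmptyDataCells;
    return ackedWrittenCells + emptyDataCells >= numDataBlocks;
  }

  public static void main(String[] args) {
    int cellSize = 64 * 1024;
    int nonEmpty = nonEmptyDataCells(cellSize, cellSize, 6); // one cell of data, RS(6,3)
    System.out.println(lastStripeIsDurable(1, nonEmpty, 6));
  }
}
```

For RS(6,3) and a data length of one cell there are 5 empty data cells, so a single acked written cell (d1 or any parity cell) meets the threshold under this model, which is the case shown in the last diagram.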
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121568#comment-16121568 ]

Akira Ajisaka commented on HDFS-11882:

Thanks [~drankye] for the reply.

The schema used is 10 + 4, and the client sent 1 full stripe and 1 partial stripe. When the sent bytes are 18 cells, the DNs receive 26 cells in total (including parity blocks) as follows:

* DN1~8: 2 cells (data blocks)
* DN9, 10: 1 cell (data blocks)
* DN11~14: 2 cells (parity blocks)

If DN9 and 10 are failing, getNumAckedStripes() will return 2, and ackedBytes will become 2 * 10 cells, i.e. 20 cells, which is greater than the 18 cells sent. (The number 10 comes from the *10* + 4 schema.)
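The overcount described above can be checked with a short model. This is a sketch and not the DFSStripedOutputStream code; AckedBytesOvercountSketch and its methods are hypothetical names. It assumes, as stated elsewhere in this thread, that the acked stripe count is the minimum number of acked cells over the streamers that have not failed.

```java
/** Illustrative model of the acked-bytes overcount; not Hadoop code. */
final class AckedBytesOvercountSketch {

  /** Minimum acked cell count over healthy streamers; failed streamers are skipped. */
  static long numAckedStripes(long[] ackedCellsPerStreamer, boolean[] failed) {
    long min = Long.MAX_VALUE;
    for (int i = 0; i < ackedCellsPerStreamer.length; i++) {
      if (!failed[i]) {
        min = Math.min(min, ackedCellsPerStreamer[i]);
      }
    }
    return min == Long.MAX_VALUE ? 0 : min; // no healthy streamer left
  }

  /** Acked data cells when every acked stripe is treated as a full stripe. */
  static long ackedCellsFromStripes(long ackedStripes, int numDataBlocks) {
    return ackedStripes * numDataBlocks;
  }

  /** Per-DN cell counts for RS(10,4) after 18 data cells: DN1~8, DN9~10, DN11~14. */
  static long[] rs10x4After18Cells() {
    return new long[] {2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2};
  }

  public static void main(String[] args) {
    boolean[] failed = new boolean[14];
    failed[8] = true; // DN9
    failed[9] = true; // DN10
    long stripes = numAckedStripes(rs10x4After18Cells(), failed);
    System.out.println(ackedCellsFromStripes(stripes, 10) + " acked cells vs 18 sent");
  }
}
```

With no failures the minimum is 1 stripe, giving 10 acked cells, which stays below the 18 sent. Once DN9 and DN10 are excluded the minimum becomes 2, giving 20 acked cells, which is the condition named in the issue title.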
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121110#comment-16121110 ]

Kai Zheng commented on HDFS-11882:

Hi [~ajisakaa],

I'm trying to understand what you meant.

bq. When sentBytes is 18*64k and the cellSize is 64k,

The schema used should be 6 + 3. It sent 2 full stripes, i.e. 2 x 9 x 64k bytes.

bq. DN1~8 will have two 64k data blocks, DN9~10 will have one 64k data block, and DN11~14 will have two 64k parity blocks.

In the 6 + 3 schema, it only needs to write to 9 DNs. But here it looks like it writes to DN1~14, i.e. 14 DNs. Why? I'm pretty confused here.

bq. In this situation, getNumAckedStripes() will return 2 if DN9 and DN10 are failing.

OK, if it returns 2, that means the number of acked stripes is 2, so the acked bytes should be 2 x 9 x 64k, which should be exactly equal to the sent bytes.

bq. That way, in the testcase ackedBytes will become 20*64k, which is greater than sentBytes.

It looks like 2 extra cells were acked in addition to the 2 full stripes. Since DN9 and DN10 are failing, they shouldn't contribute the 2 extra acked cells.

I'm also trying to understand the root cause: why are the acked bytes greater than the sent bytes? Could you help explain a little bit for me? Thanks!
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119199#comment-16119199 ]

Wei-Chiu Chuang commented on HDFS-11882:

IMO this is an EC must-do item rather than a nice-to-have one.

The 02 patch looks good to me. The test in the patch could be simplified by removing the catch block and letting the exception be thrown. But I understand it is there to keep consistency with the other tests in the same class.
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118204#comment-16118204 ]

Akira Ajisaka commented on HDFS-11882:

Ping [~andrew.wang] and [~zhz]. I could reproduce the failure in the regression test in the 02 patch. Would you review it?
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104229#comment-16104229 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 16s | trunk passed |
| +1 | compile | 1m 53s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 1m 59s | trunk passed |
| -1 | findbugs | 1m 53s | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. |
| -1 | findbugs | 2m 12s | hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. |
| +1 | javadoc | 1m 16s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 55s | the patch passed |
| +1 | compile | 1m 59s | the patch passed |
| +1 | javac | 1m 59s | the patch passed |
| -0 | checkstyle | 0m 47s | hadoop-hdfs-project: The patch generated 2 new + 8 unchanged - 0 fixed = 10 total (was 8) |
| +1 | mvnsite | 1m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 15s | the patch passed |
| +1 | javadoc | 1m 10s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 21s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 69m 41s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 112m 1s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDecommission |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11882 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879135/HDFS-11882.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 175ae5d78318 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e3c7300 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20446/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| findbugs |
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073758#comment-16073758 ]

Hadoop QA commented on HDFS-11882:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 32s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 14m 15s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 36s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| -1 | findbugs | 1m 47s | hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. |
| +1 | javadoc | 0m 53s | trunk passed |
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| -0 | checkstyle | 0m 38s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | mvnsite | 0m 54s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 51s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
| -1 | unit | 92m 55s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 120m 37s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11882 |
| JIRA Patch URL |
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073598#comment-16073598 ]

Akira Ajisaka commented on HDFS-11882:

For now I added the test case to TestDFSStripedOutputStreamWithFailure. However, that is not good, because the test case will also be run in the existing subclasses. I'm not sure where to add it.

I'm now thinking of adding a subclass that ignores the test case and modifying the other existing subclasses to extend that subclass. That way we can easily add test cases to TestDFSStripedOutputStream without any effect on the existing subclasses.
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064748#comment-16064748 ]

Akira Ajisaka commented on HDFS-11882:
--------------------------------------

bq. With 6+3 scheme, numAllBlocks is 9, and the logic takes the minimum number of acked stripes among all 9 streamers.

When 3 DNs are failing, the logic takes the minimum number of acked stripes among 6 streamers. Anyway, I'll add a unit test to reproduce this situation.

> Client fails if acknowledged size is greater than bytes sent
> -------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Critical
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-11882.01.patch
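The difference between taking the minimum over all 9 streamers and over only the 6 healthy ones can be illustrated with a small model. This is a minimal sketch and not the actual DFSStripedOutputStream code: the class name, the method `minAckedStripes`, the tiny cell size, and the acked byte counts are all hypothetical, chosen only to show that excluding failed streamers can raise the computed number of acked stripes. Whether this is how the reported failure actually arises is what the planned unit test is meant to establish.

```java
/** Simplified model for illustration only; not taken from the HDFS source. */
public class AckedStripesSketch {

  /**
   * Returns the minimum number of fully acked cells among the healthy
   * streamers. ackedBytes[i] is the byte count acked by streamer i and
   * healthy[i] tells whether streamer i is still considered healthy.
   * Returns 0 when no streamer is healthy.
   */
  static long minAckedStripes(long[] ackedBytes, boolean[] healthy, int cellSize) {
    long min = Long.MAX_VALUE;
    boolean anyHealthy = false;
    for (int i = 0; i < ackedBytes.length; i++) {
      if (!healthy[i]) {
        continue; // failed streamers do not take part in the minimum
      }
      anyHealthy = true;
      min = Math.min(min, ackedBytes[i] / cellSize);
    }
    return anyHealthy ? min : 0;
  }

  public static void main(String[] args) {
    int cellSize = 4; // hypothetical, tiny on purpose
    // 6 data streamers acked 3 cells each, 3 parity streamers acked 2 cells each.
    long[] acked = {12, 12, 12, 12, 12, 12, 8, 8, 8};

    boolean[] allHealthy = {true, true, true, true, true, true, true, true, true};
    boolean[] threeFailed = {true, true, true, true, true, true, false, false, false};

    System.out.println(minAckedStripes(acked, allHealthy, cellSize));  // 2
    System.out.println(minAckedStripes(acked, threeFailed, cellSize)); // 3
  }
}
```

In this model the lagging streamers hold the minimum down while they are healthy; once they are marked failed, the minimum jumps to the value reported by the remaining six.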
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063755#comment-16063755 ]

Andrew Wang commented on HDFS-11882:
------------------------------------

I added the nice-to-have label to it, thanks Zhe.

> Client fails if acknowledged size is greater than bytes sent
> -------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Critical
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-11882.01.patch
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063592#comment-16063592 ]

Zhe Zhang commented on HDFS-11882:
----------------------------------

Sorry I missed this ping earlier.

bq. Although it says that "at least data block number size cells", the method doesn't check this.

The current logic in {{getNumAckedStripes}} actually seems to get the number of acked full stripes. With a 6+3 scheme, {{numAllBlocks}} is 9, and the logic takes the minimum number of acked stripes among all 9 streamers. So I don't really understand why {{ackedBytes}} could be greater than {{sentBytes}}. Agreed with [~andrew.wang] that a unit test reproducing the behavior would be ideal.

Also, I don't have time to continue working on HDFS-9079 any more, but I feel it is needed for the complex EC writer logic. [~andrew.wang] [~drankye] It would be great if you could take a look and see if it's worth pursuing.

> Client fails if acknowledged size is greater than bytes sent
> -------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Critical
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-11882.01.patch
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041804#comment-16041804 ]

Andrew Wang commented on HDFS-11882:
------------------------------------

Thanks for working on this Akira. I looked at this part of the code again, and there are things that could be improved and things that I don't understand.

{code}
  /**
   * Get the number of acked stripes. An acked stripe means at least data block
   * number size cells of the stripe were acked.
   */
  private long getNumAckedStripes() {
{code}

Although the javadoc says "at least data block number size cells", the method doesn't check this. I think this is okay since the callers validate that there are enough healthy streamers, but at a minimum the javadoc should be updated, or additional checks added.

[Walter's intent though was to only count full stripes|https://issues.apache.org/jira/browse/HDFS-9342?focusedCommentId=15072472=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15072472], but I don't understand why this is correct. When closing a file at a non-stripe boundary, the last stripe is necessarily not a full stripe. In this situation, shouldn't the length be (as getNumAckedStripes alludes) determined by the stripe that has at least numDataBlocks cells? What's the effect of truncating the file length to the last full stripe? Does this truncate the file? When updatePipeline is called while closing the file, I don't see any logic to rewrite that last partial stripe.

I'm hoping some of these questions can be investigated via unit test. Ping [~walter.k.su] / [~zhz] for inputs.

> Client fails if acknowledged size is greater than bytes sent
> -------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>         Attachments: HDFS-11882.01.patch
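The non-stripe-boundary question above is at heart an arithmetic one, and a worked example may make it concrete. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, the 64 KiB cell size and the 1,000,000-byte file length are made-up values, and nothing here is taken from DFSStripedOutputStream. It only shows how far "bytes covered by full stripes" can fall short of the bytes actually written; it says nothing about what HDFS does with the remainder.

```java
/** Illustrative arithmetic only; not taken from the HDFS source. */
public class PartialStripeSketch {

  /** Bytes covered by complete stripes when 'written' bytes of user data exist. */
  static long fullStripeBytes(long written, int numDataBlocks, int cellSize) {
    long stripeSize = (long) numDataBlocks * cellSize;
    return (written / stripeSize) * stripeSize;
  }

  /** Number of data cells (full or partial) occupied in the last, incomplete stripe. */
  static int dataCellsInLastStripe(long written, int numDataBlocks, int cellSize) {
    long stripeSize = (long) numDataBlocks * cellSize;
    long remainder = written % stripeSize;
    return (int) ((remainder + cellSize - 1) / cellSize); // ceiling division
  }

  public static void main(String[] args) {
    int numDataBlocks = 6;       // 6+3 scheme, data blocks only
    int cellSize = 64 * 1024;    // hypothetical cell size
    long written = 1_000_000L;   // hypothetical file length, not stripe aligned

    long covered = fullStripeBytes(written, numDataBlocks, cellSize);
    System.out.println(covered);            // 786432 (2 full stripes)
    System.out.println(written - covered);  // 213568 bytes in the partial stripe
    System.out.println(dataCellsInLastStripe(written, numDataBlocks, cellSize)); // 4
  }
}
```

With these numbers the last stripe holds only 4 of 6 data cells, so a length computed purely from full stripes would be 213,568 bytes short of what the client wrote.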
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028143#comment-16028143 ]

Akira Ajisaka commented on HDFS-11882:
--------------------------------------

Thanks [~tasanuma0829] for the review! Updated the title and description.

> Client fails if acknowledged size is greater than bytes sent
> -------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>         Attachments: HDFS-11882.01.patch