[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046968#comment-17046968 ]

Ahmed Hussein commented on HDFS-15147:
--------------------------------------

Thanks [~kihwal] for committing the patches.

> LazyPersistTestCase wait logic is error-prone
> ---------------------------------------------
>
>                 Key: HDFS-15147
>                 URL: https://issues.apache.org/jira/browse/HDFS-15147
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Minor
>             Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
>         Attachments: HDFS-15147-branch-2.10.001.patch, HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in the test cases:
> * The wait periods for a status change are too long, reaching 10 seconds in some cases.
> * triggerBlockReport() only triggers a full block report (FBR) from the DataNode (DN) with index 0. This is counterintuitive because the JUnit tests restart the DN assuming that the restarted DN will send an FBR. However, this never happens because the DN gets a new index after the restart.
> {code:java}
> protected final void triggerBlockReport()
>     throws IOException, InterruptedException {
>   // Trigger block report to NN
>   DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>   Thread.sleep(10 * 1000);
> }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will eventually reduce the runtime and make the test cases more stable.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
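The two problems in the quoted triggerBlockReport() (only the DN at index 0 is triggered, and the method always sleeps for a fixed 10 seconds) can be illustrated with a minimal, self-contained sketch. This is not the code from the attached patches. {{ReportingNode}}, {{BlockReportWait}}, {{waitFor}} and {{triggerAllAndWait}} are hypothetical names invented for illustration, and the sketch assumes the test can express "the NameNode has processed the reports" as a boolean check. Hadoop's own test utilities offer a comparable polling helper ({{GenericTestUtils.waitFor}}), which a real test would more likely use.

```java
import java.util.List;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a DataNode handle; this is not a Hadoop type. */
interface ReportingNode {
  void triggerBlockReport();
}

final class BlockReportWait {
  private BlockReportWait() {}

  /**
   * Polls the check every intervalMs milliseconds and returns as soon as it
   * is true, instead of always sleeping for a fixed period.
   * Throws TimeoutException if the check is still false after timeoutMs.
   */
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    // System.nanoTime() is monotonic, so wall-clock adjustments do not matter.
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  /**
   * Triggers a report on every node in the current list (not just index 0,
   * which may no longer refer to the restarted node), then waits for the
   * caller-supplied condition.
   */
  static void triggerAllAndWait(List<? extends ReportingNode> nodes,
      BooleanSupplier reportsProcessed, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    for (ReportingNode node : nodes) {
      node.triggerBlockReport();
    }
    waitFor(reportsProcessed, intervalMs, timeoutMs);
  }
}
```

With this shape, a test that previously paid the full 10 seconds returns as soon as the condition holds, and a condition that is never met surfaces as a TimeoutException rather than as a later assertion failure.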
[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046783#comment-17046783 ]

Kihwal Lee commented on HDFS-15147:
-----------------------------------

It has been committed to trunk through branch-2.10. Thanks for working on the patch, Ahmed. Thanks for the review, [~elgoiri].
[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046720#comment-17046720 ]

Kihwal Lee commented on HDFS-15147:
-----------------------------------

+1 for the 3.2 patch. One minor nit is that {{FSNamesystem}} already has the following:
{code:java}
import static org.apache.hadoop.util.Time.monotonicNow;
{code}
So you could have simply called {{monotonicNow()}} in the patch. But this isn't critical, and in some sense the patch as submitted is more consistent with the trunk patch.

> LazyPersistTestCase wait logic is error-prone
> ---------------------------------------------
>
>                 Key: HDFS-15147
>                 URL: https://issues.apache.org/jira/browse/HDFS-15147
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HDFS-15147-branch-2.10.001.patch, HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, HDFS-15147.003.patch
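The nit above is about Java's static import: once a static method is imported by name, it can be called unqualified. The sketch below shows the same mechanism using only the JDK, since Hadoop's {{Time}} class is not available in a standalone example. {{ElapsedTimer}} is a hypothetical class invented for illustration, and {{System.nanoTime()}} stands in here for a monotonic clock source.

```java
// Static import: nanoTime() can now be called without the "System." prefix,
// in the same way FSNamesystem can call monotonicNow() unqualified.
import static java.lang.System.nanoTime;

/** Hypothetical helper that measures elapsed time on a monotonic clock. */
final class ElapsedTimer {
  // nanoTime() has an arbitrary origin; only differences are meaningful.
  private final long startNanos = nanoTime();

  /** Returns the whole milliseconds elapsed since this timer was created. */
  long elapsedMillis() {
    return (nanoTime() - startNanos) / 1_000_000L;
  }
}
```

Whether to call the method qualified or unqualified is a style choice; both forms compile to the same call.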
[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046024#comment-17046024 ]

Hadoop QA commented on HDFS-15147:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 16m 44s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || branch-3.2 Compile Tests ||
| 0 | mvndep | 3m 0s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 43s | branch-3.2 passed |
| +1 | compile | 15m 36s | branch-3.2 passed |
| +1 | checkstyle | 3m 1s | branch-3.2 passed |
| +1 | mvnsite | 2m 33s | branch-3.2 passed |
| +1 | shadedclient | 19m 10s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 23s | branch-3.2 passed |
| +1 | javadoc | 2m 26s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 49s | the patch passed |
| +1 | compile | 14m 54s | the patch passed |
| +1 | javac | 14m 54s | root generated 0 new + 1324 unchanged - 1 fixed = 1324 total (was 1325) |
| +1 | checkstyle | 2m 30s | root: The patch generated 0 new + 448 unchanged - 14 fixed = 448 total (was 462) |
| +1 | mvnsite | 2m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 16s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 6s | the patch passed |
| +1 | javadoc | 1m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 51s | hadoop-common in the patch passed. |
| -1 | unit | 113m 49s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 42s | The patch does not generate ASF License warnings. |
| | | 252m 46s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | HDFS-15147 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994701/HDFS-15147-branch-3.2.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 054342709d65 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045879#comment-17045879 ]

Ahmed Hussein commented on HDFS-15147:
--------------------------------------

Thanks [~kihwal]! I uploaded a new patch for branch-3.2.
[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone
[ https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045646#comment-17045646 ]

Kihwal Lee commented on HDFS-15147:
-----------------------------------

I committed this to trunk, but it couldn't be cherry-picked to branch-3.2. Besides a couple of minor conflicts in FSNamesystem, compilation fails with:
{noformat}
[ERROR] /home/kihwal/devel/apache/hadoop2/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java:[159,35] cannot find symbol
[ERROR] symbol: method seconds(int)
[ERROR] location: class org.junit.rules.Timeout
{noformat}
Trunk has JUnit 4.12 and branch-3.2 has 4.11. {{seconds(int)}} must be a new method in 4.12. [~ahussein], can you make a patch for branch-3.2?

> LazyPersistTestCase wait logic is error-prone
> ---------------------------------------------
>
>                 Key: HDFS-15147
>                 URL: https://issues.apache.org/jira/browse/HDFS-15147
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Minor
>         Attachments: HDFS-15147-branch-2.10.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, HDFS-15147.003.patch