[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046968#comment-17046968
 ] 

Ahmed Hussein commented on HDFS-15147:
--

Thanks [~kihwal] for committing the patches.

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>   throws IOException, InterruptedException {
> // Trigger block report to NN
> DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
> Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046783#comment-17046783
 ] 

Kihwal Lee commented on HDFS-15147:
---

It has been committed to trunk to branch-2.10. Thanks for working on the patch, 
Amed. Thanks for the review, [~elgoiri].

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>   throws IOException, InterruptedException {
> // Trigger block report to NN
> DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
> Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046720#comment-17046720
 ] 

Kihwal Lee commented on HDFS-15147:
---

+1 for the 3.2 patch. One minor nit is that {{FSNamesystem}} already has the 
following:

{code:java}
import static org.apache.hadoop.util.Time.monotonicNow;
{code}

So you could have simply call {{monotonicNow()}} in the patch.  But this isn't 
critical and in some sense it is more consistent with the trunk patch.

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>   throws IOException, InterruptedException {
> // Trigger block report to NN
> DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
> Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-26 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046024#comment-17046024
 ] 

Hadoop QA commented on HDFS-15147:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
43s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
36s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
26s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
54s{color} | {color:green} root generated 0 new + 1324 unchanged - 1 fixed = 
1324 total (was 1325) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
30s{color} | {color:green} root: The patch generated 0 new + 448 unchanged - 14 
fixed = 448 total (was 462) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
51s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}252m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | HDFS-15147 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12994701/HDFS-15147-branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 054342709d65 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-26 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045879#comment-17045879
 ] 

Ahmed Hussein commented on HDFS-15147:
--

Thanks [~kihwal]!
I uploaded a new patch for branch-3.2

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>   throws IOException, InterruptedException {
> // Trigger block report to NN
> DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
> Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-26 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045646#comment-17045646
 ] 

Kihwal Lee commented on HDFS-15147:
---

I committed this to trunk, but it couldn't be cherry-picked to branch-3.2.  
Besides a couple of minor conflicts in FSNamesystem, 
{noformat}
[ERROR] /home/kihwal/devel/apache/hadoop2/hadoop-hdfs-project/hadoop-hdfs/src/
 
test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java:[159,35]
 cannot find symbol
[ERROR] symbol: method seconds(int)
[ERROR] location: class org.junit.rules.Timeout
 {noformat}

trunk has junit 4.12 and branch-3.2 has 4.11. {{seconds(int)}} must be a new 
method in 4.12. [~ahussein], can you make a patch for branch-3.2? 

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: HDFS-15147-branch-2.10.001.patch, HDFS-15147.001.patch, 
> HDFS-15147.002.patch, HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>   throws IOException, InterruptedException {
> // Trigger block report to NN
> DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
> Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org