[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905455#comment-16905455
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang] for your filing new JIRA, I think this JIRA should be 
closed now and let us tracking at HDFS-14725, what do you think?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905454#comment-16905454
 ] 

He Xiaoqiao commented on HDFS-12914:


The status of this JIRA will set to resolved if no more objects. Backport for 
other branch-2.* will tracking at HDFS-14725. Please give your suggestions if 
focus on this issue. Thanks.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905378#comment-16905378
 ] 

He Xiaoqiao commented on HDFS-12914:


[~jojochuang] Thanks for your continue tracking this JIRA and sorry for 
following up late, I would like to try backport this patch to branch-2 after 
HDFS-14723 commit. I will assign this JIRA to myself if you do not mind.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905368#comment-16905368
 ] 

Chao Sun commented on HDFS-12914:
-

Thanks [~jojochuang]. I added setBlockManagerForTesting() in HDFS-13898 - 
interesting that it also got reverted along with HDFS-12914.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905338#comment-16905338
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


This is becoming a mess, my bad. [~Jim_Brennan] thanks a lot for letting me 
know.

HDFS-13898 added a helper method to use a helper method 
(BlockManager#setBlockManagerForTesting()) added in the branch-2 backport. 

 

Here's what I propose:

(1) File a new Jira to add the missing helper method. I don't want to revert 
HDFS-13898 because ultimately we want to cherry pick HDFS-12914 into branch-2, 
and we still need that missing helper method.

(2) resolve this Jira since this is already a mess here.

(3) I'll file a new Jira to backport HDFS-12914 to branch-2, later.

 

[~csun] FYI.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-12 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905289#comment-16905289
 ] 

Jim Brennan commented on HDFS-12914:


[~jojochuang], the revert of the commit for branch-2 appears to have broken the 
build:

{noformat}
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
/Users/jbrennan02/git/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NameNodeAdapter.java:[226,23]
 cannot find symbol
  symbol:   method 
setBlockManagerForTesting(org.apache.hadoop.hdfs.server.blockmanagement.BlockManager)
  location: class org.apache.hadoop.hdfs.server.namenode.FSNamesystem
[INFO] 1 error
{noformat}
When I revert this commit, I can build:
{noformat}
commit 585b6de63721f3ea8057677676038a6f8f2c33f5 (HEAD -> branch-2, 
apache-hadoop/branch-2)
Author: Wei-Chiu Chuang 
Date:   Fri Aug 9 16:59:27 2019 -0700

Revert "HDFS-12914. Block report leases cause missing blocks until next 
report. Contributed by Santosh Marella, He Xiaoqiao."

This reverts commit 567e1178d88ccfc258ce2ade4f8af66cc5a4daa7.
{noformat}


> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-08 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903430#comment-16903430
 ] 

Konstantin Shvachko commented on HDFS-12914:


Confirmed reverting the patch fixes {{TestSafeMode}}. [~jojochuang] please take 
a look.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902478#comment-16902478
 ] 

Erik Krogen commented on HDFS-12914:


[~jojochuang] I think this is breaking 
{{TestSafeMode.testInitializeReplQueuesEarly}} in branch-2 and branch-2.9. It 
seems to be failing consistently for me after this patch and succeeding before.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901639#comment-16901639
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
23s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
24s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m 36s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:1 |
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestRollingUpgrade |
| Timed out junit tests | org.apache.hadoop.hdfs.TestPread |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:b93746a0168 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976878/HDFS-12914.branch-2.8.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4a6f8f62fab2 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901544#comment-16901544
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m  
5s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
39s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
42s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
37s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_222. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_222. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
41s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 40s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:b93746a |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976873/HDFS-12914.branch-2.8.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3ec22f8c62ea 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / 8e302a0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| Multi-JDK versions |  /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-06 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901495#comment-16901495
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Doesn't reproduce any more. Pushed the patch to branch-2, branch-2.9

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.patch, 
> HDFS-12914.branch-3.0.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-04 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899610#comment-16899610
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang] for pushing this backport branch-2, Just check the latest 
Jenkins report, May be it is unrelated with this patch, Please help to double 
check, Thanks. 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-03 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899527#comment-16899527
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


One of the test failure in my branch-2 backport seems legit. 
TestSafeMode.testInitializeReplQueuesEarly timed out consecutively in my local 
tree. I'm going to look at this further.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899365#comment-16899365
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}101m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:da675796017 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976599/HDFS-12914.branch-2.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 549fa929141c 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 9aae72a 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899346#comment-16899346
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-12914 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976598/HDFS-12914.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27389/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.002.patch, 
> HDFS-12914.005.patch, HDFS-12914.006.patch, HDFS-12914.007.patch, 
> HDFS-12914.008.patch, HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.patch, 
> HDFS-12914.branch-3.0.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898694#comment-16898694
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
24s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 14 new + 379 unchanged - 0 fixed = 393 total (was 379) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 29s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:da67579 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976476/HDFS-12914.branch-2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896727#comment-16896727
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
20s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
49s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
45s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_212. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_212. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 12 new + 379 unchanged - 0 fixed = 391 total (was 379) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
43s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:da67579 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976282/HDFS-12914.branch-2.000.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f60912f60537 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 77d1aa9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-30 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896645#comment-16896645
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Submitted a branch-2 patch for precommit check. This is a critical bug fix so I 
think it worths a branch-2 version.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881260#comment-16881260
 ] 

Weiwei Yang commented on HDFS-12914:


Pushed to branch-3.0, thank you [~hexiaoqiao] for the quick response and 
providing the patch.

I'll let you guys decide whether this should be fixed on 2.x versions, so keep 
this one open for now.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881255#comment-16881255
 ] 

Weiwei Yang commented on HDFS-12914:


I also think it should not related to this patch, LGTM now. I am committing 
this to branch-3.0 shortly.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881242#comment-16881242
 ] 

He Xiaoqiao commented on HDFS-12914:


It looks check the result at the same time.:) I will give my +1 for  
[^HDFS-12914.branch-3.0.patch] to branch-3.0. I believe it is time to re-apply 
to branch-3.0. Thanks.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881241#comment-16881241
 ] 

He Xiaoqiao commented on HDFS-12914:


[~cheersyang],Thanks Weiwei for your helps. Just check the failed unit test. it 
is unrelated with this patch in my side. Please help to double check. Thanks 
again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881239#comment-16881239
 ] 

Weiwei Yang commented on HDFS-12914:


Seems UT failures were time outs, not related to the patch. [~smarella], 
[~hexiaoqiao], can you please take a look, let me know if we are good or not. 
Thanks.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881041#comment-16881041
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 3s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:e402791 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973522/HDFS-12914.branch-3.0.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ccb24e73b8c4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.0 / 9ab56b2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27181/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27181/testReport/ |
| Max. process+thread count | 4661 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-08 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880921#comment-16880921
 ] 

Weiwei Yang commented on HDFS-12914:


Sorry for the late response, I was on vacation.

I just reverted 974dd2b4b6103374969fd7cfeb2cee50d4112c6a, to unblock others. 
Now the patch for branch-3.0 should apply. Just triggered the jenkins job 
manually. Cc [~jojochuang] [~hexiaoqiao]

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-03 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877585#comment-16877585
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 10s{color} 
| {color:red} HDFS-12914 does not apply to branch-3.0. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973522/HDFS-12914.branch-3.0.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27133/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-03 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877571#comment-16877571
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~cheersyang] for your report and [~starphin] for your double-check.
Sorry for late response. the root cause of this issue about building broken:
a. HDFS-12914 cherry-pick from branch-3.1 commit to branch-3.0.
b. HDFS-11673 changes visible scope of method #processReport from private to 
default/protect and commit to branch-3.1 and later but not back-port to 
branch-3.0 and earlier version.
c. I don't check build result after [~jojochuang] help to commit.

It is safe to change the visible scope BlockManager#processReport to 
protect/default.
I just submit another patch for branch-3.0 and include changing 
BlockManager#processReport visible. cc [~jojochuang] Please help to take 
another review.
Some commit notes for branch-3.0 if anyone would like to take a review and help 
to commit:
1. Please revert HDFS-12914 patch;
2. Re-apply [^HDFS-12914.branch-3.0.patch] to branch-3.0.
Thanks [~cheersyang],[~starphin],[~jojochuang] again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-02 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877439#comment-16877439
 ] 

star commented on HDFS-12914:
-

Yes, It is broken in branch-3.0 with 'private' indicator for 'processReport'. 
There's no such issue for other banch HDFS-11673, in which 'private' indicator 
is removed.
{quote} Collection processReport(
      final DatanodeStorageInfo storageInfo,
      final BlockListAsLongs report,
      BlockReportContext context) throws IOException {{quote}
I am not sure whether branch-3.0 should be covered for this issue. 

[~jojochuang], what do you think? 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-07-02 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877054#comment-16877054
 ] 

Weiwei Yang commented on HDFS-12914:


Hi [~jojochuang]

It seems branch-3.0 build is broken after this commit.

{code}

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
(default-testCompile) on project hadoop-hdfs: Compilation failure
[ERROR] 
/Users/wyang/gitbox-hadoop/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java:[100,46]
 
processReport(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo,org.apache.hadoop.hdfs.protocol.BlockListAsLongs,org.apache.hadoop.hdfs.server.protocol.BlockReportContext)
 has private access in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager

{code}

could you please take a look?

Thanks

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871473#comment-16871473
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 38s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 379 unchanged - 0 fixed = 382 total (was 379) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}114m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:da67579 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972735/HDFS-12914.branch-2.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e5ae54fda930 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-24 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871221#comment-16871221
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Given that this is a relatively critical issue, we should get this on branch-2 
as well.
Attached a branch-2 patch for precommit check

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-24 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870867#comment-16870867
 ] 

He Xiaoqiao commented on HDFS-12914:


[~jojochuang] Thanks for your reviews and commit. It seems that has committed 
to all branch-3.*, is it time to set this JIRA status to {{resolved}}?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870775#comment-16870775
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


+1 for the branch-3.1 002 patch. Other than the TestDiskBalancer test, failed 
tests don't reproduce for me.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870714#comment-16870714
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Looks like HDFS-12487 breaks the getBlockToCopy test.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870712#comment-16870712
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Filed  HDFS-14598 for the findbugs warning.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-23 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870558#comment-16870558
 ] 

He Xiaoqiao commented on HDFS-12914:


I found failed unit tests also failed before patch. Any changes for branch-3.1 
recently?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870323#comment-16870323
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
56s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
58s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in branch-3.1 has 1 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:080e9d0 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972583/HDFS-12914.branch-3.1.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5833cb5814b0 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 529d095 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27038/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-22 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870289#comment-16870289
 ] 

He Xiaoqiao commented on HDFS-12914:


submit [^HDFS-12914.branch-3.1.002.patch] for branch-3.1 and pending Jenkins.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, 
> HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869092#comment-16869092
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} HDFS-12914 does not apply to branch-3.1. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972354/HDFS-12914.branch-3.1.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27022/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-20 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868612#comment-16868612
 ] 

He Xiaoqiao commented on HDFS-12914:


[~jojochuang] Submit [^HDFS-12914.branch-3.1.001.patch] for branch-3.1, Pending 
trigger Jenkins. Please take a review. Thanks.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.1.001.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-19 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867908#comment-16867908
 ] 

He Xiaoqiao commented on HDFS-12914:


[~jojochuang] Thanks for pushing this forward, I would like to upload patch 
based on branch-3.1 later. Thanks again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-19 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867870#comment-16867870
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


[~hexiaoqiao] any chance you can offer a branch-3.1 patch?
The trunk patch applies to branch-3.2 but there are more conflicts in 
branch-3.1 it seems.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-17 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866082#comment-16866082
 ] 

Hudson commented on HDFS-12914:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16762 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16762/])
Revert "HDFS-12914. Addendum patch. Block report leases cause missing (weichiu: 
rev a50c35bb81105936dc0129b81f913e7307e306fc)
* (delete) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java
Revert "HDFS-12914. Block report leases cause missing blocks until next 
(weichiu: rev 7314185c4a313842115e18b5f42d118392cee929)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
HDFS-12914. Block report leases cause missing blocks until next report. 
(weichiu: rev 6822193ee6d6ac8b08822fa76c89e1dd61c5ddca)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java


> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-17 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866067#comment-16866067
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Thank you [~hexiaoqiao] The utfix LGTM

I am going to revert the commits from trunk and merge them the commits into 
one. Easier to track backports that way.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865082#comment-16865082
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 16s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971904/HDFS-12914.utfix.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f2a6828c4620 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e70aeb4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26966/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26966/testReport/ |
| Max. process+thread count | 3449 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865075#comment-16865075
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971904/HDFS-12914.utfix.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 814561c550e9 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e70aeb4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26965/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26965/testReport/ |
| Max. process+thread count | 4375 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-16 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865025#comment-16865025
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang], Please feel free to decide if rebase and re-patch 
[^HDFS-12914.009.patch] to trunk or apply [^HDFS-12914.utfix.patch] only to fix 
unit test checkstyle. Thanks again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864572#comment-16864572
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Suggest to rewrite the last two parameters as 
{code:java}
HeartbeatResponse hbResponse = rpcServer.sendHeartbeat(
dnRegistration, storages, 0, 0, 0, 0, 0, null, true, 
SlowPeerReports.EMPTY_REPORT, SlowDiskReports.EMPTY_REPORT);

{code}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864571#comment-16864571
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks for your quick feedback and report, I will check this test fail.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864567#comment-16864567
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


That is correct, and so I pushed an addendum patch to include the test file in 
trunk.

For branch-3.2, I squashed the two commits into one.

Also, please note that the tests fail in my IntelliJ:
{noformat}
ava.lang.IllegalArgumentException: Argument for @Nonnull parameter 'slowPeers' 
of org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.sendHeartbeat must 
not be null

at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.$$$reportNull$$$0(NameNodeRpcServer.java)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockReportLease.testCheckBlockReportLease(TestBlockReportLease.java:91){noformat}
I guess we need to file another Jira to fix the tests.

 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864562#comment-16864562
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang], it seems to miss #TestBlockReportLease file which commit 
to branch trunk. Please help to double check.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864547#comment-16864547
 ] 

Hadoop QA commented on HDFS-12914:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
4s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
52s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m  
6s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:63396be |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971847/HDFS-12914.branch-3.2.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9b3f5cd02460 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 335aebb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26960/testReport/ |
| Max. process+thread count | 2848 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26960/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Block report leases cause missing blocks until next report
> --
>
>   

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864418#comment-16864418
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
43s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 24s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}140m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:63396be |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971832/HDFS-12914.branch-3.2.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 28ef5c5510ba 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 335aebb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26958/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26958/testReport/ |
| Max. process+thread count | 3967 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864396#comment-16864396
 ] 

Hudson commented on HDFS-12914:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16747 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16747/])
HDFS-12914. Addendum patch. Block report leases cause missing blocks (weichiu: 
rev cdc5de6448e429d6cb523b8a61bed8b1cb2fc263)
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java


> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864349#comment-16864349
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


I'm sorry I forgot to add the test code in to git commit. Doing that now.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864317#comment-16864317
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Pushed v008 to trunk. Branch-3.2 doesn't compile because of HDFS-13898. I'm 
attaching an updated patch.
The only difference is the addition of 
{{FSNamesystem#setBlockManagerForTesting()}}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.branch-3.2.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864303#comment-16864303
 ] 

Hudson commented on HDFS-12914:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16746 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16746/])
HDFS-12914. Block report leases cause missing blocks until next report. 
(weichiu: rev ae4143a529d74d94f205ca627c31360abfa11bfa)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java


> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-12 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862328#comment-16862328
 ] 

He Xiaoqiao commented on HDFS-12914:


[~elgoiri],[~jojochuang] any furthermore comments about [^HDFS-12914.008.patch] 
to push forward this issue?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-11 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860702#comment-16860702
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~elgoiri]. [~jojochuang] do you mind give another reviews about  
[^HDFS-12914.008.patch]. Thanks again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860391#comment-16860391
 ] 

Íñigo Goiri commented on HDFS-12914:


[^HDFS-12914.008.patch] LGTM.
+1

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860321#comment-16860321
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971361/HDFS-12914.008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux eaf1d106511b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d6d95d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26931/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26931/testReport/ |
| Max. process+thread count | 2941 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26931/console |
| 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-10 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860216#comment-16860216
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~elgoiri], [^HDFS-12914.008.patch] fix fail about check style report, 
also update following suggestions above. Thanks again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860176#comment-16860176
 ] 

Íñigo Goiri commented on HDFS-12914:


[^HDFS-12914.007.patch] has a couple failed check style issues.
Given we have to fix those, I'll go again with my minor comments :)
* For UnregisteredNodeException, I was thinking on just a debug message.
* For the {{checkBlockReportLease()}}, we can save two calls by returning right 
away. I would definitely go for that:
{code}
public boolean checkBlockReportLease(BlockReportContext context,
final DatanodeID nodeID) throws UnregisteredNodeException {
  if (context == null) {
return true;
  }
  DatanodeDescriptor node = datanodeManager.getDatanode(nodeID);
  final long startTime = Time.monotonicNow();
  return blockReportLeaseManager.checkLease(
 node, startTime, context.getLeaseId());
}
{code}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859357#comment-16859357
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang],[~starphin] for reviews.
check the failed test and run at local. TestDataNodeErasureCodingMetrics 
passed, and TestBPOfferService#testTrySendErrorReportWhenStandbyNNTimesOut meet 
OutOfMemoryError, and After tune heap size high it pass too. I do not think it 
is related with [^HDFS-12914.007.patch]. Please help to double check and take 
another kindly reviews. Thanks again.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859351#comment-16859351
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


+1

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859347#comment-16859347
 ] 

star commented on HDFS-12914:
-

lgtm. Thanks [~hexiaoqiao] for anwsering. Failing test maybe checked. 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859304#comment-16859304
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 32s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 163 unchanged - 0 fixed = 165 total (was 163) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971249/HDFS-12914.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux dc446bd2144a 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9deac3b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26925/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26925/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859273#comment-16859273
 ] 

He Xiaoqiao commented on HDFS-12914:


{quote}No need to register again just in case of full block lease id 
expiration.{quote}
Correct. UnregisteredNodeException is only send when do not retrieve the 
Datanode instance rather than lease expire. Actually, we keep the logic when 
lease expire and do not send back any exceptions. FYI.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859272#comment-16859272
 ] 

star commented on HDFS-12914:
-

Yes client refers to datanode. DN get REGISTER command after the startup of NN 
and make fullblock report every 6 hours. It's ok to send other rpc request to 
NameNode after its first REGISTER command. No need to register again just in 
case of full block lease id expiration. Feel free to ignore this if doesn't 
make sense to you.
{code:java}
I think `client` you mentioned is Datanode, right? If one datanode not register 
and send some other RPC request to NameNode, it should get 
RegisterCommand.REGISTER and try to re-register. It seems a normal flow.
Thanks Íñigo Goiri and star again. HDFS-12914.007.patch please take another 
kindly review.
{code}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859264#comment-16859264
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~elgoiri],[~starphin] for your reviews.
To [~elgoiri],
{quote}checkBlockReportLease() could check for context == null at the beginning 
and return true there right away; then the final return would be just the 
checkLease().{quote}
IIUC, the following code logic is same as you said. Actually, I change it based 
on HadoopQA checkstyle suggestions.:)
{code:java}
+return context == null || blockReportLeaseManager.checkLease(
+node, startTime, context.getLeaseId());
{code}
{quote}When NameNodeRpcServer catches the UnregisteredNodeException we probably 
want to log that.{quote}
We could meet UnregisteredNodeException when invoke DatanodeManager#getDatanode 
which has logged before throw exception. Maybe we do not need duplicated logs. 
Please help to double check.
{quote}We could use a lambda for runBlockOp().{quote}
+1, update in [^HDFS-12914.007.patch].
{quote}User assertNotNull() instead of assertTrue(datanodeCommand != null); 
actually can we check for the actual command?{quote}
+1, update in [^HDFS-12914.007.patch].

To [~starphin],
Thanks for your unit test attach, I am sorry that I do not understand that 
clearly. 
{quote}Following codes bypass lease expiration checking logic by removing valid 
lease id. Better to keep it as it is in running time.{quote}
In my unit test, I just want to reprod the scenario [~smarella] mentioned. 
remove valid lease id analog to lease expire between process storages of one 
datanode. I think It's OK in unit test when remove it manually. If I am wrong 
please correct me.
{quote}Do we really need to response with a RegisterCommand.REGISTER command to 
client? It's a somewhat heavy command.{quote}
I think `client` you mentioned is Datanode, right? If one datanode not register 
and send some other RPC request to NameNode, it should get 
RegisterCommand.REGISTER and try to re-register. It seems a normal flow.
Thanks [~elgoiri] and [~starphin] again. [^HDFS-12914.007.patch] please take 
another kindly review.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859250#comment-16859250
 ] 

star commented on HDFS-12914:
-

Few comments about your unit tests. 

Following codes bypass lease expiration checking logic by removing valid lease 
id. Better to keep it as it is in running time.
{code:java}
// Remove full block report lease about dn
  spyBlockManager.getBlockReportLeaseManager()
  .removeLease(datanodeDescriptor);
{code}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-08 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859246#comment-16859246
 ] 

star commented on HDFS-12914:
-

[~hexiaoqiao], I also write a unit test for this issue, mostly similar to 
yours. Pasted here just for ref.

Other than the test code, a piece of code changed. BlockManager#processReport 
will throw IOException to indicate an invalid lease id. Client will get the 
exception.
{code:java}
if (context != null) {
  if (!blockReportLeaseManager.checkLease(node, startTime,
context.getLeaseId())) {
throw new IOException("Invalid block report lease id 
'"+context.getLeaseId()+"'");
  }
}{code}
{code:java}
@Test
public void testDelayedBlockReport() throws IOException{
  FSNamesystem namesystem = cluster.getNameNode(0).getNamesystem();

  BlockManager testBlockManager = Mockito.spy(namesystem.getBlockManager());

  Mockito.doAnswer(new Answer() {
@Override
public Boolean answer(InvocationOnMock invocationOnMock) throws Throwable {
  //sleep 1000 ms to delay processing of current report
  Thread.sleep(1000);
  return (Boolean)invocationOnMock.callRealMethod();
}
  }).when(testBlockManager).processReport(
  Mockito.any(DatanodeID.class), Mockito.any(DatanodeStorage.class),
  Mockito.any(BlockListAsLongs.class),
  Mockito.any(BlockReportContext.class));
  namesystem.setBlockManagerForTesting(testBlockManager);

  String bpid = namesystem.getBlockPoolId();
  DataNode dn = cluster.getDataNodes().get(0);
  DatanodeRegistration dnReg = dn.getDNRegistrationForBP(bpid);

  namesystem.readLock();
  long leaseId = testBlockManager.requestBlockReportLeaseId(dnReg);
  namesystem.readUnlock();

  Map report = cluster.getBlockReport(bpid, 
0);

  List reportList = new ArrayList<>();
  for(Map.Entry en : report.entrySet()){
reportList.add(new StorageBlockReport(en.getKey(), en.getValue()));
  }

  //it will throw IOException if lease id is invalid
  cluster.getNameNode().getRpcServer().blockReport(
  dnReg, bpid, reportList.toArray(new StorageBlockReport[]{}),
  new BlockReportContext(1, 0, System.nanoTime(), leaseId, true));
}
{code}

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-07 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858809#comment-16858809
 ] 

Íñigo Goiri commented on HDFS-12914:


I would like somebody with a little more experience with this to give a good 
review but for now a couple minor comments:
* {{checkBlockReportLease()}} could check for {{context == null}} at the 
beginning and return true there right away; then the final return would be just 
the {{checkLease()}}.
* When {{NameNodeRpcServer}} catches the {{UnregisteredNodeException}} we 
probably want to log that.
* We could use a lambda for {{runBlockOp()}}.
* User {{assertNotNull()}} instead of {{assertTrue(datanodeCommand != null)}}; 
actually can we check for the actual command?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-07 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858574#comment-16858574
 ] 

He Xiaoqiao commented on HDFS-12914:


Test TestClientProtocolForPipelineRecovery, TestBootstrapAliasmap, 
TestReconstructStripedFile at local and all test pass. I don't think 
TestWebHdfsTimeouts is related with this patch.
Ping [~elgoiri],[~xkrogen],[~jojochuang],[~daryn],[~kihwal], Do you mind give 
another kindly reviews?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858465#comment-16858465
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
|   | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971152/HDFS-12914.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8ced204bed2e 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a91d24f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26921/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26921/testReport/ |
| Max. process+thread count | 4946 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-07 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858392#comment-16858392
 ] 

He Xiaoqiao commented on HDFS-12914:


[^HDFS-12914.006.patch] fix checkstyle only.
Test #TestDataNodeHotSwapVolumes and #TestDirectoryScanner at local and both 
passed. I believe #TestWebHdfsTimeouts is unrelated with this patch.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858039#comment-16858039
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 163 unchanged - 0 fixed = 168 total (was 163) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12971093/HDFS-12914.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 16c3f7d59f91 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 944adc6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26914/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26914/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-06-06 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857935#comment-16857935
 ] 

He Xiaoqiao commented on HDFS-12914:


hi watchers, [^HDFS-12914.005.patch] try to fix this issue by checking lease 
once for every blockreport request rather than check lease for every storage. 
And add test to cover changes. Please kindly take a review. Thanks.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-22 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845764#comment-16845764
 ] 

He Xiaoqiao commented on HDFS-12914:


[~smarella], some minor comments about  [^HDFS-12914-trunk.01.patch] ,
a. we need to check if #context is null when check lease;
b. maybe we should catch #UnregisteredNodeException and return 
{{RegisterCommand.REGISTER}} also;
c. {{datanodeManager.getDatanode(nodeId)}} is possible to return null, so we 
should check {{null}} before pass as one parameter of 
BlockReportLeaseManager#checkLease;
d. it is better to add some unit test as [~starphin] mentioned above.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845746#comment-16845746
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


Thanks for the patch. I think the fix makes sense to me.
Any idea how to test it? A simple unit test (like the testRefreshLeaseId in 
TestBPOfferService added by HDFS-14314)  should be able to verify NN rejects 
expired block report entirely.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845539#comment-16845539
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  7m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 9 new + 163 unchanged - 0 fixed = 172 total (was 163) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12969339/HDFS-12914-trunk.01.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f0155baba3e5 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 77c49f2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845458#comment-16845458
 ] 

Hadoop QA commented on HDFS-12914:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 6 new + 198 unchanged - 0 fixed = 204 total (was 198) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 52s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:da67579 |
| JIRA Issue | HDFS-12914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12969329/HDFS-12914-branch-2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845430#comment-16845430
 ] 

Santosh Marella commented on HDFS-12914:


[~starphin] - thanks a lot for the feedback.

1) Good catch. I had to modify the pom.xml to compile locally, but missed out 
to remove it before uploading the diff.  Removed it now and uploaded a new 
patch.
2) Tests are certainly great to have. I didn't find any existing tests related 
to FBR leases to add/modify further.
3) {{BlockReportLeaseManager}} has good amount of logging at the DEBUG level to 
track things currently - for e.g. when the lease is granted (or cannot be 
granted), when the lease is removed etc.  Would like to hear from others if 
this is proving to be sufficient or we need some more logging.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845414#comment-16845414
 ] 

star commented on HDFS-12914:
-

Thanks [~smarella]. A few questioins for reference only.

1. Guess it's a mistake updating version of protobuf.

 
{code:java}
2.5.0.t02
{code}
2. Do we need a test case? 

3. Should we do more logs about full block lease so that we can inspect issues 
easier? [~smarella], [~hexiaoqiao],[~jojochuang].

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, HDFS-12914-trunk.00.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845376#comment-16845376
 ] 

Santosh Marella commented on HDFS-12914:


Another interesting thing we found is that calls to {{getNumLiveDatanodes}} is 
probably the main cause for the slowness in FBR processing. In a way, 
HDFS-14171  might help us to avoid running into this issue in the first place. 
However, I feel it is still good to have a fix for this JIRA, as FBR processing 
can be slow due to some other reasons. A cleaner way to deal with lease expiry 
can save us from partial processing of FBRs.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Attachments: HDFS-12914-branch-2.001.patch, HDFS-12914-trunk.00.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845372#comment-16845372
 ] 

Santosh Marella commented on HDFS-12914:


Thanks [~jojochuang] and [~hexiaoqiao]. 

 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844559#comment-16844559
 ] 

He Xiaoqiao commented on HDFS-12914:


Thanks [~jojochuang] helps.
[~smarella], just quick review above changes, and I suggest submit another one 
and based on branch trunk, please retry to upload patch.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844554#comment-16844554
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


[~smarella] thanks for your contribution. I added you to the contributor list 
and assigned the jira to you.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-21 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844537#comment-16844537
 ] 

Santosh Marella commented on HDFS-12914:


Thanks [~hexiaoqiao]. I realize I may not be having the right permissions and 
hence I am unable to find the "Submit Patch" button. Created INFRA-18411 
requesting for permissions.

In the mean time, here is the diff of my changes:
{code:java}
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
index ccd5931..78125d5 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -2162,12 +2162,6 @@ public boolean processReport(final DatanodeID nodeID,
blockReportLeaseManager.removeLease(node);
return !node.hasStaleStorages();
}
- if (context != null) {
- if (!blockReportLeaseManager.checkLease(node, startTime,
- context.getLeaseId())) {
- return false;
- }
- }

if (storageInfo.getBlockReportCount() == 0) {
// The first block report can be processed a lot more efficiently than
@@ -2231,6 +2225,18 @@ public void removeBRLeaseIfNeeded(final DatanodeID 
nodeID,
}

/**
+ * Checks if the block report lease for {@param nodeId} has expired or not.
+ *
+ * @param nodeId data node id
+ * @param leaseId lease id
+ * @return true if the lease is still good. false if it has expired.
+ * @throws UnregisteredNodeException if the data node hasn't registered yet
+ */
+ public boolean checkLease(DatanodeID nodeId, long leaseId) throws 
UnregisteredNodeException {
+ return 
blockReportLeaseManager.checkLease(datanodeManager.getDatanode(nodeId), 
Time.monotonicNow(), leaseId);
+ }
+
+ /**
* Rescan the list of blocks which were previously postponed.
*/
void rescanPostponedMisreplicatedBlocks() {
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java
index 7db05c7..22f1728 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java
@@ -236,7 +236,7 @@ public synchronized long requestLease(DatanodeDescriptor 
dn) {
// The DataNode wants a new lease, even though it already has one.
// This can happen if the DataNode is restarted in between requesting
// a lease and using it.
- LOG.debug("Removing existing BR lease 0x{} for DN {} in order to " +
+ LOG.warn("Removing existing BR lease 0x{} for DN {} in order to " +
"issue a new one.", Long.toHexString(node.leaseId),
dn.getDatanodeUuid());
}
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
index 89571f4..0738d99 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
@@ -152,6 +152,7 @@
import org.apache.hadoop.hdfs.server.protocol.NamenodeRegistration;
import org.apache.hadoop.hdfs.server.protocol.NamespaceInfo;
import org.apache.hadoop.hdfs.server.protocol.NodeRegistration;
+import org.apache.hadoop.hdfs.server.protocol.RegisterCommand;
import org.apache.hadoop.hdfs.server.protocol.RemoteEditLogManifest;
import org.apache.hadoop.hdfs.server.protocol.SlowDiskReports;
import org.apache.hadoop.hdfs.server.protocol.SlowPeerReports;
@@ -1445,7 +1446,18 @@ public DatanodeCommand blockReport(final 
DatanodeRegistration nodeReg,
blockStateChangeLog.debug("*BLOCK* NameNode.blockReport: "
+ "from " + nodeReg + ", reports.length=" + reports.length);
}
- final BlockManager bm = namesystem.getBlockManager();
+ final BlockManager bm = namesystem.getBlockManager();
+ // Process the FBR iff the lease hasn't expired.
+ // If the lease has expired, we ask the DN to re-register and ask for
+ // a lease in subsequent heart beat.
+ if (context.getLeaseId() != 0 && !bm.checkLease(nodeReg, 
context.getLeaseId())) {
+ blockStateChangeLog.warn("*BLOCK* NameNode.blockReport: Rejecting full block 
report "
+ + "from " + nodeReg + ", reports.length=" + reports.length
+ + ", as the leaseId=" + Long.toHexString(context.getLeaseId()) + " has 
expired. "
+ + "Asking it to re-register to obtain a new lease id and then send a FBR.");
+ 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844497#comment-16844497
 ] 

He Xiaoqiao commented on HDFS-12914:


[~smarella], please click `Submit Patch` under the title then attach your 
patch. If need some other help please ping me anytime.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844328#comment-16844328
 ] 

Santosh Marella commented on HDFS-12914:


The attachment didn't go through (I clicked on the "Attachment" icon when 
adding a comment). Is there a different way to do this? 

The "Attachments" section is missing on my View of this Jira page. Is that 
perhaps due to some missing permissions for me?

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844326#comment-16844326
 ] 

Santosh Marella commented on HDFS-12914:


[~hexiaoqiao] attaching the patch for branch-2.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844184#comment-16844184
 ] 

Santosh Marella commented on HDFS-12914:


Thanks [~hexiaoqiao]. I did make a code change in similar lines yesterday and 
the tests yielded some positive results. I can post that patch here if you're 
ok and we can probably can converge on a solution sooner. 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844178#comment-16844178
 ] 

Santosh Marella commented on HDFS-12914:


{quote} Santosh Marella how many DNs do you have?  According to the limited 
logs, I think it is caused by following case. A high cpu load of SNN delayed 
the processing of full block report.{quote}

DNs are in the order of hundreds. You are right that a high cpu load on SNN has 
delayed processing a FBR from a DN that was issued a lease. The SNN started 
processing the reports, but the lease expired after it processed 3 out of 12 
reports.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844056#comment-16844056
 ] 

star commented on HDFS-12914:
-

[~hexiaoqiao] proposed a good solution to solve this issue.

Except for rpc call queue, there's also a queue for processing block report. 
Maybe we should consider this and check lease id before putting it in the block 
report queue.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843936#comment-16843936
 ] 

He Xiaoqiao commented on HDFS-12914:


[~smarella] Thanks for your report, I think you offer complete information and 
the reason is clear. 
As we all know, block report processing is very heavy request, and process time 
may be longer than other RPCs, especially blocks number is very large located 
one DataNode and not the first block report just after NameNode startup.
To your report issue,
a. t1 request full block report lease through heart beat from NameNode.
b. t2 lease return to DataNode.
c. t3 send FBR from DataNode.
d. t4 FBR enter NameNode call queue.
e. t5 NameNode begin to process FBR one by one #StorageBlockReport, and finish 
to process first 3 #StorageBlockReport successfully.
f. t6 NameNode process the fourth #StorageBlockReport and find lease has 
expired and log `the lease has expired` then remove this lease;
g. t7 finish to process the remain 8 #StorageBlockReport and lease also has 
expired and log `the DN is not in the pending set`;
which t5 - t1 < 5min and t6 - t1 > 5min. 

I think during that times, load of NameNode is very high, and CallQueue of 
service rpc port (if not config, it is rpc port) is continued full for long 
times (maybe it is long than 5min)
As mentioned above, the root cause is that we check lease for every 
#StorageBlockReport of one DataNode. So I think the solution is also clear, 
just check lease once for each DataNode rather than every  #StorageBlockReport 
of DataNode.
I would like to follow this issue and submit patch later.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843829#comment-16843829
 ] 

star commented on HDFS-12914:
-

[~smarella] how many DNs do you have?  According to the limited logs, I think 
it is caused by following case. A high load delayed the process of full block 
report.

 
||DN1...||DN2||
|register|register|
|request Lease| |
|process Request| |
|...|request Lease|
|process Request|{color:#707070}_more than 5 minutes_{color}|
|...|process Request|

 

There's no logs between 2019-05-16 15:15:35 and 2019-05-16 15:31:11. Logs 
unrelated to 10.54.63.120:50010 are filtered out, right [~smarella]?

In that time, I think the SNN is processing blockreports from other DN. Untill 
2019-05-16 15:31:11, SNN began to process block reports from that DN. It is 6 
minutes after when full block lease id is requested, beyond default expire 
value 5 minutes (DFS_NAMENODE_FULL_BLOCK_REPORT_LEASE_LENGTH_MS_DEFAULT). 

Don't known when a full block lease id is got from server, for there's no info 
log about it. I guess it's about 5 minutes before the first failed report, say 
15:26:29. 

 

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-20 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843750#comment-16843750
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


[~starphin] any thoughts? I think you're in the best position to offer some 
comments. It would be hard for me to understand it without the access to the 
full set of NameNode/DataNode logs.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-05-19 Thread Santosh Marella (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843647#comment-16843647
 ] 

Santosh Marella commented on HDFS-12914:


[~jojochuang] - We might have hit this issue recently and some initial 
investigation seems to lead to this issue described by [~daryn] in the 
description. I'm new to this area, but from what I have learnt so far, it seems 
that HDFS-14314 fixes a related, but slightly different scenario. Please see 
below.

 

We have a DN that has 12 disks. When we restarted a standby NN, the DN 
registers itself, gets a lease for FBR and sends a FBR containing 12 reports, 
one for each disk. However, only 3 of them got processed and the 9 aren't 
processed, as the lease had expired before processing the 4th report. This 
essentially meant the FBR was partially processed, and, in our case, this 
*might* be one of the reasons it's taking too long for the NN to come out of 
safe mode (as the safe block count is taking too long to reach the threshold 
due to partial FBR processing).

 

Some raw notes that I've collected investigating this issue. Sorry for being 
verbose, but hope it helps everyone. We are on 2.9.2 version of Hadoop.

Dug through the logs and observed this for a DN for which only 3 out of 12 
reports were processed. The DN registered itself with the NN, then sent a FBR 
that contained 12 reports (one for each disk). The NN processed 3 of them 
(indicated by *processing first storage report*  and *processing time* entries 
in the log statements). However, for the 9 remaining reports, it prints *lease 
xxx is not valid for DN* messages.  

 
{code:java}
2019-05-16 15:15:35,028 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
registerDatanode: from DatanodeRegistration(10.54.63.120:50010, 
datanodeUuid=7493442a-c552-43f4-b6bd-728be292f66d, infoPort=50075, 
infoSecurePort=0, ipcPort=50020, 
storageInfo=lv=-57;cid=CID-f4a0a2ae-9e3d-41d5-b98a-e0e77ed0249b;nsid=682930173;c=1406912757005)
 storage 7493442a-c552-43f4-b6bd-728be292f66d
2019-05-16 15:15:35,028 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockReportLeaseManager: 
Registered DN 7493442a-c552-43f4-b6bd-728be292f66d (10.54.63.120:50010).
2019-05-16 15:31:11,578 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: Processing first storage report for 
DS-3e8d8352-ecc9-45cb-a39b-86f10d8aa386 from datanode 
7493442a-c552-43f4-b6bd-728be292f66d
2019-05-16 15:31:11,941 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: from storage DS-3e8d8352-ecc9-45cb-a39b-86f10d8aa386 node 
DatanodeRegistration(10.54.63.120:50010, 
datanodeUuid=7493442a-c552-43f4-b6bd-728be292f66d, infoPort=50075, 
infoSecurePort=0, ipcPort=50020, 
storageInfo=lv=-57;cid=CID-f4a0a2ae-9e3d-41d5-b98a-e0e77ed0249b;nsid=682930173;c=1406912757005),
 blocks: 12690, hasStaleStorage: true, processing time: 363 msecs, 
invalidatedBlocks: 0
2019-05-16 15:31:17,496 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: Processing first storage report for 
DS-600c8ca1-3f99-41fc-a784-f663b928fe21 from datanode 
7493442a-c552-43f4-b6bd-728be292f66d
2019-05-16 15:31:17,851 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: from storage DS-600c8ca1-3f99-41fc-a784-f663b928fe21 node 
DatanodeRegistration(10.54.63.120:50010, 
datanodeUuid=7493442a-c552-43f4-b6bd-728be292f66d, infoPort=50075, 
infoSecurePort=0, ipcPort=50020, 
storageInfo=lv=-57;cid=CID-f4a0a2ae-9e3d-41d5-b98a-e0e77ed0249b;nsid=682930173;c=1406912757005),
 blocks: 12670, hasStaleStorage: true, processing time: 355 msecs, 
invalidatedBlocks: 0
2019-05-16 15:31:23,465 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: Processing first storage report for 
DS-3e7dc4c5-ab4e-40d1-8f32-64fe28081f94 from datanode 
7493442a-c552-43f4-b6bd-728be292f66d
2019-05-16 15:31:23,821 INFO BlockStateChange: BLOCK* processReport 
0xb4fb52822c9e3f03: from storage DS-3e7dc4c5-ab4e-40d1-8f32-64fe28081f94 node 
DatanodeRegistration(10.54.63.120:50010, 
datanodeUuid=7493442a-c552-43f4-b6bd-728be292f66d, infoPort=50075, 
infoSecurePort=0, ipcPort=50020, 
storageInfo=lv=-57;cid=CID-f4a0a2ae-9e3d-41d5-b98a-e0e77ed0249b;nsid=682930173;c=1406912757005),
 blocks: 12698, hasStaleStorage: true, processing time: 356 msecs, 
invalidatedBlocks: 0
2019-05-16 15:31:29,419 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockReportLeaseManager: Removing 
expired block report lease 0xfd013f0084d0ed2d for DN 
7493442a-c552-43f4-b6bd-728be292f66d.
2019-05-16 15:31:29,419 WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockReportLeaseManager: BR lease 
0xfd013f0084d0ed2d is not valid for DN 7493442a-c552-43f4-b6bd-728be292f66d, 
because the lease has expired.
2019-05-16 15:31:35,891 WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockReportLeaseManager: BR lease 
0xfd013f0084d0ed2d is not valid for DN 7493442a-c552-43f4-b6bd-728be292f66d, 
because the DN is 

[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-03-04 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783650#comment-16783650
 ] 

Wei-Chiu Chuang commented on HDFS-12914:


For everyone on the watch list, looks like HDFS-14314 fixes the issue. I gave 
my +1 and will cherrypick the fix into lower branches.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc.  Lease rejection does not throw an exception.  
> It returns false which bubbles up to  {{NameNodeRpcServer#blockReport}} and 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected from an invalid lease becomes 
> active with _no blocks_.  A replication storm ensues possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration.  The cluster will have many "missing blocks" until the DNs 
> next FBR is sent and/or forced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



  1   2   >