[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Attachment: HDFS-14983.006.patch
Status: Patch Available  (was: Open)

Removed a redundant white space.

{code:java}
diff -u b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
--- b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
@@ -413,7 +413,7 @@
           "Successfully updated superuser proxy groups on router " + address);
       return 0;
     }
-    return  -1;
+    return -1;
   }
 
   private void refresh(String address) throws IOException {
{code}
 



> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.005.patch, HDFS-14983.006.patch, 
> HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Status: Open  (was: Patch Available)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.005.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993255#comment-16993255
 ] 

Hadoop QA commented on HDFS-14983:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 16s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 3m 7s{color} | {color:red} hadoop-hdfs-project generated 3 new + 16 unchanged - 3 fixed = 19 total (was 19) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 7s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 32s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 186m 38s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14983 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988483/HDFS-14983.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux aad4dffe5736 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-15036:
--
Attachment: HDFS-15036.003.patch

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch, 
> HDFS-15036.003.patch
>
>
> Image transfer from Standby NameNode to Active silently fails on Active, 
> without any logging and without notifying the receiver side.






[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993224#comment-16993224
 ] 

Chen Liang commented on HDFS-15036:
---

Thanks for the review, [~shv]. I uploaded the v03 patch.

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch, 
> HDFS-15036.003.patch
>
>
> Image transfer from Standby NameNode to Active silently fails on Active, 
> without any logging and without notifying the receiver side.






[jira] [Commented] (HDFS-14758) Decrease lease hard limit

2019-12-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993191#comment-16993191
 ] 

hemanthboyina commented on HDFS-14758:
--

The test failures and the findbugs warning are not related. Please review the patch.

> Decrease lease hard limit
> -
>
> Key: HDFS-14758
> URL: https://issues.apache.org/jira/browse/HDFS-14758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14758.001.patch, HDFS-14758.002.patch, 
> HDFS-14758.003.patch, HDFS-14758.004.patch
>
>
> The hard limit is currently hard-coded to be 1 hour. This also determines the 
> NN automatic lease recovery interval. Something like 20 min will make more 
> sense.
> After the 5 min soft limit, other clients can recover the lease. If no one 
> else takes the lease away, the original client still can renew the lease 
> within the hard limit. So even after a NN full GC of 8 minutes, leases can be 
> still valid.
> However, there is one risk in reducing the hard limit. E.g. Reduced to 20 
> min. If the NN crashes and the manual failover takes more than 20 minutes, 
> clients will abort.
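
The timing argument in the description can be checked with a small sketch. It is only an illustration: the class and method names are hypothetical and not part of HDFS, the strict comparisons at the boundaries are an assumption, and the 25-minute failover is an invented example of "more than 20 minutes".

```java
import java.time.Duration;

/** Hypothetical illustration of the soft/hard lease limits; not HDFS code. */
class LeaseLimitSketch {
  // Soft limit quoted in the issue description.
  static final Duration SOFT_LIMIT = Duration.ofMinutes(5);

  /** True while the hard limit has not elapsed, so the original holder can still renew. */
  static boolean holderCanRenew(Duration sinceLastRenewal, Duration hardLimit) {
    return sinceLastRenewal.compareTo(hardLimit) < 0;
  }

  /** True once the soft limit has passed, after which other clients may recover the lease. */
  static boolean othersCanRecover(Duration sinceLastRenewal) {
    return sinceLastRenewal.compareTo(SOFT_LIMIT) > 0;
  }

  public static void main(String[] args) {
    Duration proposedHardLimit = Duration.ofMinutes(20);
    Duration fullGcPause = Duration.ofMinutes(8);    // figure from the description
    Duration slowFailover = Duration.ofMinutes(25);  // assumed example value

    System.out.println(holderCanRenew(fullGcPause, proposedHardLimit));   // true
    System.out.println(othersCanRecover(fullGcPause));                    // true
    System.out.println(holderCanRenew(slowFailover, proposedHardLimit));  // false
  }
}
```

With a 20-minute hard limit, an 8-minute pause still leaves the lease renewable, while an outage longer than the hard limit does not, which is the risk the description points out.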






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993174#comment-16993174
 ] 

Xieming Li commented on HDFS-14983:
---

[~tasanuma], 

Thank you for your review.

I have posted a new patch that alters 
RouterAdmin#refreshSuperUserGroupsConfiguration() so that it passes the 
return code to the caller.
{code:java}
diff -u b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
--- b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
@@ -357,7 +357,7 @@
       } else if ("-refreshRouterArgs".equals(cmd)) {
         exitCode = genericRefresh(argv, i);
       } else if ("-refreshSuperUserGroupsConfiguration".equals(cmd)) {
-        refreshSuperUserGroupsConfiguration();
+        exitCode = refreshSuperUserGroupsConfiguration();
       } else {
         throw new IllegalArgumentException("Unknown Command: " + cmd);
       }
@@ -402,7 +402,7 @@
    *
    * @throws IOException if the operation was not successful.
    */
-  private void refreshSuperUserGroupsConfiguration()
+  private int refreshSuperUserGroupsConfiguration()
       throws IOException{
     RouterGenericManager proxy = client.getRouterGenericManager();
     String address =  getConf().getTrimmed(
@@ -411,7 +411,9 @@
     if(proxy.refreshSuperUserGroupsConfiguration()){
       System.out.println(
           "Successfully updated superuser proxy groups on router " + address);
+      return 0;
     }
+    return  -1;
   }
 
   private void refresh(String address) throws IOException {
{code}
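
For readers who want to see the effect of the change in isolation, here is a minimal, self-contained sketch of the same pattern. It is not the RouterAdmin code: the class name and the BooleanSupplier parameter are stand-ins I assume for the RPC call to the router, so that the exit-code handling can be run without a cluster.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the command dispatch in the patch; not the Hadoop class. */
class ExitCodeSketch {
  /** Returns 0 when the refresh call reports success and -1 otherwise. */
  static int refreshSuperUserGroupsConfiguration(BooleanSupplier refreshCall) {
    if (refreshCall.getAsBoolean()) {
      return 0;
    }
    return -1;
  }

  /** Dispatches the command and hands the handler's result back as the exit code. */
  static int run(String cmd, BooleanSupplier refreshCall) {
    int exitCode;
    if ("-refreshSuperUserGroupsConfiguration".equals(cmd)) {
      // Before the patch the result was dropped here, so a failure could not reach the shell.
      exitCode = refreshSuperUserGroupsConfiguration(refreshCall);
    } else {
      throw new IllegalArgumentException("Unknown Command: " + cmd);
    }
    return exitCode;
  }

  public static void main(String[] args) {
    System.out.println(run("-refreshSuperUserGroupsConfiguration", () -> true));   // 0
    System.out.println(run("-refreshSuperUserGroupsConfiguration", () -> false));  // -1
  }
}
```

Returning the code instead of discarding it lets scripts that wrap the admin command detect a failed refresh.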

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.005.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Status: Open  (was: Patch Available)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.005.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Attachment: HDFS-14983.005.patch
Status: Patch Available  (was: Open)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.005.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993166#comment-16993166
 ] 

Hudson commented on HDFS-15045:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17750 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17750/])
HDFS-15045. DataStreamer#createBlockOutputStream() should log exception 
(surendralilhore: rev c2e9783d5f236015f2ad826fcbad061e2118e454)
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Updated] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15045:
--
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Commented] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993161#comment-16993161
 ] 

Surendra Singh Lilhore commented on HDFS-15045:
---

Committed to trunk.

Thanks [~Sushma_28] for the contribution.

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993150#comment-16993150
 ] 

Konstantin Shvachko commented on HDFS-15036:


Looks good. Minor things
# Typo in {{doCheckpoint()}}. Removed -is- in:
{code}// by the other node. This could happen if{code}
# Should use parameterized logging
{code}LOG.info("Image upload rejected by the other NameNode: {}", uploadResult);{code}
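
To illustrate why the second point matters, the sketch below uses a tiny stand-in logger rather than the real SLF4J API, so it runs without dependencies. MiniLogger, UploadResult and the "REJECTED" value are all hypothetical names chosen for this example. The idea it demonstrates is that with a "{}" template the argument is only converted to a string when the level is enabled, whereas string concatenation at the call site always pays that cost.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an SLF4J-style logger; not the real org.slf4j API. */
class ParamLogSketch {
  /** Counts toString() calls so we can observe when formatting actually happens. */
  static final class UploadResult {
    int toStringCalls = 0;

    @Override
    public String toString() {
      toStringCalls++;
      return "REJECTED"; // placeholder value, not a real HDFS result name
    }
  }

  static final class MiniLogger {
    private final boolean infoEnabled;
    final List<String> lines = new ArrayList<>();

    MiniLogger(boolean infoEnabled) {
      this.infoEnabled = infoEnabled;
    }

    /** Substitutes "{}" with arg, but only when INFO is enabled. */
    void info(String template, Object arg) {
      if (!infoEnabled) {
        return; // arg.toString() is never invoked on this path
      }
      lines.add(template.replace("{}", String.valueOf(arg)));
    }
  }

  public static void main(String[] args) {
    UploadResult result = new UploadResult();

    MiniLogger disabled = new MiniLogger(false);
    disabled.info("Image upload rejected by the other NameNode: {}", result);
    System.out.println(result.toStringCalls); // 0

    MiniLogger enabled = new MiniLogger(true);
    enabled.info("Image upload rejected by the other NameNode: {}", result);
    System.out.println(enabled.lines.get(0));
  }
}
```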


> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch
>
>
> Image transfer from Standby NameNode to Active silently fails on Active, 
> without any logging and without notifying the receiver side.






[jira] [Assigned] (HDFS-15047) Document the new decommission monitor (HDFS-14854)

2019-12-10 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki reassigned HDFS-15047:
---

Assignee: Masatake Iwasaki

> Document the new decommission monitor (HDFS-14854)
> --
>
> Key: HDFS-15047
> URL: https://issues.apache.org/jira/browse/HDFS-15047
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Wei-Chiu Chuang
>Assignee: Masatake Iwasaki
>Priority: Major
>
> We can document HDFS-14854, add it to 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html
>  and mark it as an experimental feature.






[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993133#comment-16993133
 ] 

Hadoop QA commented on HDFS-15036:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 34s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 88 unchanged - 0 fixed = 89 total (was 88) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 17s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 169m 24s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.TestFileAppend2 |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | hadoop.hdfs.TestWriteRead |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15036 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988468/HDFS-15036.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 21686e70fb56 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 875a3e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | 

[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993131#comment-16993131
 ] 

Konstantin Shvachko commented on HDFS-15032:


+1 v05 patch.
I see TestFsck failing with and without this patch. I assume it was broken 
recently.

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch, HDFS-15032.003.patch, HDFS-15032.004.patch, 
> HDFS-15032.005.patch, debugger_with_tostring.png, 
> debugger_without_tostring.png
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-12-10 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993123#comment-16993123
 ] 

Hudson commented on HDFS-14854:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17749 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17749/])
HDFS-14854. Create improved decommission monitor implementation. (weichiu: rev 
c93cb6790e0f1c64efd03d859f907a0522010894)
* (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStripedBackoffMonitor.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminBackoffMonitor.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatusWithBackoffMonitor.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithBackoffMonitor.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorInterface.java


> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated 
> blocks from a future node or disk failure may wait for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.
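
The enable/disable switch described above can be pictured as picking a monitor class by name from configuration. This is only a minimal sketch: the two monitor class names come from the commit file list, but the config key shown here is an assumption (check hdfs-default.xml in your release), and MonitorSketch and MonitorSelector are hypothetical stand-ins rather than the real Hadoop types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Stand-in for the monitor abstraction; not the real Hadoop interface. */
interface MonitorSketch {
  String name();
}

class DefaultMonitorSketch implements MonitorSketch {
  public String name() { return "DatanodeAdminDefaultMonitor"; }
}

class BackoffMonitorSketch implements MonitorSketch {
  public String name() { return "DatanodeAdminBackoffMonitor"; }
}

class MonitorSelector {
  // Assumed key name; verify against hdfs-default.xml before relying on it.
  static final String KEY = "dfs.namenode.decommission.monitor.class";

  private static final Map<String, Supplier<MonitorSketch>> KNOWN = new HashMap<>();
  static {
    KNOWN.put("DatanodeAdminDefaultMonitor", DefaultMonitorSketch::new);
    KNOWN.put("DatanodeAdminBackoffMonitor", BackoffMonitorSketch::new);
  }

  /** Returns the configured monitor, falling back to the existing one. */
  static MonitorSketch select(Map<String, String> conf) {
    String configured = conf.getOrDefault(KEY, "DatanodeAdminDefaultMonitor");
    // Accept either a simple or a fully qualified class name.
    String simple = configured.substring(configured.lastIndexOf('.') + 1);
    Supplier<MonitorSketch> factory = KNOWN.get(simple);
    if (factory == null) {
      throw new IllegalArgumentException("Unknown monitor class: " + configured);
    }
    return factory.get();
  }
}
```

Keeping the existing implementation as the fallback means a cluster that never sets the key sees no behaviour change, which matches the stated goal of offering either implementation.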



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14854) Create improved decommission monitor implementation

2019-12-10 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14854:
---
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk for 3.3.0 release.
Thanks for the numerous iterations of review from [~belugabehr] and [~elgoiri], and 
thanks [~sodonnell] for offering the patch.

I filed HDFS-15047 to document this feature so that we can get the community to use 
it and give feedback.

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated 
> blocks from a future node or disk failure may wait for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.






[jira] [Created] (HDFS-15047) Document the new decommission monitor (HDFS-14854)

2019-12-10 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-15047:
--

 Summary: Document the new decommission monitor (HDFS-14854)
 Key: HDFS-15047
 URL: https://issues.apache.org/jira/browse/HDFS-15047
 Project: Hadoop HDFS
  Issue Type: Task
  Components: documentation
Affects Versions: 3.3.0
Reporter: Wei-Chiu Chuang


We can document HDFS-14854, add it to 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html
 and mark it as an experimental feature.






[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-12-10 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993108#comment-16993108
 ] 

Wei-Chiu Chuang commented on HDFS-14854:


Committing 014 patch

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated 
> blocks from a future node or disk failure may wait for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.






[jira] [Created] (HDFS-15046) Backport HDFS-7060 to branch-2.10

2019-12-10 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-15046:
--

 Summary: Backport HDFS-7060 to branch-2.10
 Key: HDFS-15046
 URL: https://issues.apache.org/jira/browse/HDFS-15046
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Wei-Chiu Chuang


Not sure why it didn't get backported in 2.x before, but looks like a good 
improvement overall.






[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993033#comment-16993033
 ] 

Chen Liang commented on HDFS-15036:
---

Thanks for taking a look [~shv]! Posted the v002 patch. The failed tests all 
passed in my local run.

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch
>
>
> Image transfer from Standby NameNode to Active silently fails on Active, 
> without any logging and without notifying the receiver side.






[jira] [Updated] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-15036:
--
Attachment: HDFS-15036.002.patch

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch
>
>
> Image transfer from Standby NameNode to Active silently fails on Active, 
> without any logging and without notifying the receiver side.






[jira] [Commented] (HDFS-14997) BPServiceActor process command from NameNode asynchronously

2019-12-10 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993014#comment-16993014
 ] 

Wei-Chiu Chuang commented on HDFS-14997:


Sounds like an elegant solution. The commands, while processed 
asynchronously, are still processed in order.

Question: I can imagine DNA_INVALIDATE taking a long time to process. Do we 
already have metrics or log messages for when a block invalidation command takes 
too long?

> BPServiceActor process command from NameNode asynchronously
> ---
>
> Key: HDFS-14997
> URL: https://issues.apache.org/jira/browse/HDFS-14997
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, 
> HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch
>
>
> There are two core functions in the #BPServiceActor main process flow: report 
> (#sendHeartbeat, #blockReport, #cacheReport) and #processCommand. If 
> #processCommand takes a long time, it blocks the report flow. #processCommand 
> can take a long time (over 1000s in the worst case I have seen) when the IO 
> load of the DataNode is very high. Since some IO operations run under 
> #datasetLock, processing some commands (such as #DNA_INVALIDATE) has to wait a 
> long time to acquire #datasetLock. In such cases, #heartbeat is not sent to 
> the NameNode in time, which triggers other failures.
> I propose to make #processCommand asynchronous so that it does not block 
> #BPServiceActor from sending heartbeats back to the NameNode under high IO load.
> Notes:
> 1. Lifeline could be one effective solution; however, some old branches do 
> not support this feature.
> 2. IO operations under #datasetLock are another issue; I think we should solve 
> it in another JIRA.
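
The proposal above, together with the observation that commands are still handled in order, can be sketched with a single worker thread fed by a FIFO queue. This is an illustrative sketch only and not the code in the attached patches: AsyncCommandSketch, enqueue and drainAndStop are hypothetical names, and the sleep stands in for slow IO done while waiting on a lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AsyncCommandSketch {
  // One worker thread: commands run off the heartbeat thread, in FIFO order.
  private final ExecutorService commandThread = Executors.newSingleThreadExecutor();
  private final List<String> processed =
      Collections.synchronizedList(new ArrayList<>());

  /** Called from the heartbeat loop; returns without waiting for the command. */
  void enqueue(String command, long simulatedCostMs) {
    commandThread.execute(() -> {
      try {
        Thread.sleep(simulatedCostMs); // stands in for slow IO under a lock
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      processed.add(command);
    });
  }

  /** Waits for queued commands to finish and returns them in completion order. */
  List<String> drainAndStop() {
    commandThread.shutdown();
    try {
      if (!commandThread.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("commands did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return new ArrayList<>(processed);
  }
}
```

A single worker is the simplest way to keep ordering; a pool with more threads would let a slow command stop blocking later ones, at the cost of that ordering guarantee.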






[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992999#comment-16992999
 ] 

Erik Krogen commented on HDFS-15032:


I'm not seeing the test failures locally and it looks like the findbugs report 
is not related to this. v5 should be ready for review [~shv]

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch, HDFS-15032.003.patch, HDFS-15032.004.patch, 
> HDFS-15032.005.patch, debugger_with_tostring.png, 
> debugger_without_tostring.png
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Commented] (HDFS-14758) Decrease lease hard limit

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992981#comment-16992981
 ] 

Hadoop QA commented on HDFS-14758:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
5s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 658 
unchanged - 1 fixed = 658 total (was 659) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 4 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
54s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 21s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.TestDistributedFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14758 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12988454/HDFS-14758.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux d953e3f5024a 4.15.0-58-generic 

[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992979#comment-16992979
 ] 

Hadoop QA commented on HDFS-15032:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
27s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
54s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}221m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.TestLeaseRecovery |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15032 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12988444/HDFS-15032.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cad20d182507 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 875a3e9 |
| maven | version: Apache Maven 3.3.9 |
| 

[jira] [Updated] (HDFS-15008) Expose client API to carry out In-Place EC of a file.

2019-12-10 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-15008:
-
Description: 
Converting a file will be a 3 step task:
* Make a temporary, erasure-coded copy.
* swapBlockList($src, $tmp) (HDFS-14989)
* delete($tmp)

The API should allow the EC blocks to use the storage policy of the original 
file through a config parameter.

  was:
Converting a file will be a 3 step task:
* Make a temporary, erasure-coded copy.
* swapBlockList($src, $tmp) (HDFS-14989)
* delete($tmp)


> Expose client API to carry out In-Place EC of a file.
> -
>
> Key: HDFS-15008
> URL: https://issues.apache.org/jira/browse/HDFS-15008
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Converting a file will be a 3 step task:
> * Make a temporary, erasure-coded copy.
> * swapBlockList($src, $tmp) (HDFS-14989)
> * delete($tmp)
> The API should allow the EC blocks to use the storage policy of the original 
> file through a config parameter.
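
The three steps in the description can be sketched against a toy in-memory namespace. Everything here is a simplified assumption for illustration: InPlaceEcSketch, FileEntry, the temporary file suffix and the "ec-" block naming are hypothetical, and the real client API and swapBlockList (HDFS-14989) may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InPlaceEcSketch {
  /** Toy file record: a block list plus a flag for the block layout. */
  static class FileEntry {
    List<String> blocks;
    boolean erasureCoded;

    FileEntry(List<String> blocks, boolean erasureCoded) {
      this.blocks = blocks;
      this.erasureCoded = erasureCoded;
    }
  }

  final Map<String, FileEntry> namespace = new HashMap<>();

  /** Converts src to erasure coding using the copy, swap, delete sequence. */
  void convert(String src) {
    FileEntry original = namespace.get(src);
    if (original == null) {
      throw new IllegalArgumentException("No such file: " + src);
    }
    String tmp = src + "._ec_tmp_"; // hypothetical temporary name

    // Step 1: make a temporary, erasure-coded copy.
    List<String> ecBlocks = new ArrayList<>();
    for (String block : original.blocks) {
      ecBlocks.add("ec-" + block);
    }
    namespace.put(tmp, new FileEntry(ecBlocks, true));

    // Step 2: swapBlockList($src, $tmp)
    swapBlockList(src, tmp);

    // Step 3: delete($tmp), which now holds the old replicated blocks.
    namespace.remove(tmp);
  }

  /** Swaps block lists and layout between two entries; paths stay put. */
  void swapBlockList(String a, String b) {
    FileEntry first = namespace.get(a);
    FileEntry second = namespace.get(b);
    List<String> blocks = first.blocks;
    boolean layout = first.erasureCoded;
    first.blocks = second.blocks;
    first.erasureCoded = second.erasureCoded;
    second.blocks = blocks;
    second.erasureCoded = layout;
  }
}
```

Because the path of the source file never changes, readers keep using the same name before and after the conversion; only the blocks behind it are replaced.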






[jira] [Assigned] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-15041:
--

Assignee: zhuqi

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different needs for latency and queue health. 
> We should make the two parameters configurable.
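
As a rough picture of what "configurable" could mean here, the sketch below reads both values from a key/value configuration and falls back to defaults when they are unset. The key names and default values are placeholders invented for this example; they are not taken from the attached patches or from the current constants.

```java
import java.util.Map;

class QueueTuningSketch {
  // Hypothetical key names and placeholder defaults, for illustration only.
  static final String LOCK_HOLD_KEY = "example.namenode.max-lock-hold-ms";
  static final String QUEUE_SIZE_KEY = "example.namenode.queue-size";
  static final long DEFAULT_LOCK_HOLD_MS = 4L;
  static final int DEFAULT_QUEUE_SIZE = 1024;

  final long maxLockHoldMs;
  final int queueSize;

  QueueTuningSketch(Map<String, String> conf) {
    String lockHold =
        conf.getOrDefault(LOCK_HOLD_KEY, String.valueOf(DEFAULT_LOCK_HOLD_MS));
    String size =
        conf.getOrDefault(QUEUE_SIZE_KEY, String.valueOf(DEFAULT_QUEUE_SIZE));
    this.maxLockHoldMs = positive(Long.parseLong(lockHold), LOCK_HOLD_KEY);
    this.queueSize = (int) positive(Integer.parseInt(size), QUEUE_SIZE_KEY);
  }

  /** Rejects zero or negative values so a typo cannot disable the limits. */
  private static long positive(long value, String key) {
    if (value <= 0) {
      throw new IllegalArgumentException(key + " must be positive, got " + value);
    }
    return value;
  }
}
```

Validating at construction time keeps a bad setting from surfacing later as a stalled queue or a lock that is never released.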






[jira] [Work started] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2019-12-10 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-14989 started by Aravindan Vijayan.

> Add a 'swapBlockList' operation to Namenode.
> 
>
> Key: HDFS-14989
> URL: https://issues.apache.org/jira/browse/HDFS-14989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Borrowing from the design doc.
> bq. The swapBlockList takes two parameters, a source file and a destination 
> file. This operation swaps the blocks belonging to the source and the 
> destination atomically.
> bq. The namespace metadata of interest is the INodeFile class. A file 
> (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
> BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
> contains a list of blocks (BlockInfo[]). The operation will swap 
> BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
> touch other fields. To avoid complication, this operation will abort if 
> either file is open (isUnderConstruction() == true)
> bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST 
> to record the change persistently.
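
The semantics quoted from the design doc can be sketched on a simplified file record. This is an assumption-laden illustration, not the NameNode implementation: INodeFileSketch only mirrors the fields named above, the method-level lock stands in for whatever the NameNode uses to make the swap atomic, and the edit log opcode is not modelled.

```java
class INodeFileSketch {
  // Header fields named in the design doc, reduced to plain values.
  long preferredBlockSize;
  int blockLayoutAndRedundancy;
  byte storagePolicyId;
  String[] blocks;
  boolean underConstruction;

  INodeFileSketch(long preferredBlockSize, int blockLayoutAndRedundancy,
      byte storagePolicyId, String[] blocks) {
    this.preferredBlockSize = preferredBlockSize;
    this.blockLayoutAndRedundancy = blockLayoutAndRedundancy;
    this.storagePolicyId = storagePolicyId;
    this.blocks = blocks;
  }
}

class SwapBlockListSketch {
  /**
   * Swaps the layout/redundancy bits and the block lists of two files.
   * Preferred block size and storage policy are left untouched.
   */
  static synchronized void swapBlockList(INodeFileSketch src, INodeFileSketch dst) {
    // Abort if either file is open, as the design doc requires.
    if (src.underConstruction || dst.underConstruction) {
      throw new IllegalStateException("swapBlockList requires both files closed");
    }
    int layout = src.blockLayoutAndRedundancy;
    src.blockLayoutAndRedundancy = dst.blockLayoutAndRedundancy;
    dst.blockLayoutAndRedundancy = layout;

    String[] blocks = src.blocks;
    src.blocks = dst.blocks;
    dst.blocks = blocks;
  }
}
```

Checking the under-construction flag before touching either record means a rejected call leaves both files exactly as they were.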






[jira] [Commented] (HDFS-10756) Expose getTrashRoot to HTTPFS and WebHDFS

2019-12-10 Thread Daryn Sharp (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992923#comment-16992923
 ] 

Daryn Sharp commented on HDFS-10756:


Surprised nobody has discovered this will lead to an inevitable OOM in the NN.  
The NN should not be creating filesystems to itself, and must never create 
filesystems in a remote user's context or the cache will explode.

> Expose getTrashRoot to HTTPFS and WebHDFS
> -
>
> Key: HDFS-10756
> URL: https://issues.apache.org/jira/browse/HDFS-10756
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: encryption, httpfs, webhdfs
>Reporter: Xiao Chen
>Assignee: Yuanbo Liu
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10756.001.patch, HDFS-10756.002.patch, 
> HDFS-10756.003.patch, HDFS-10756.004.patch, HDFS-10756.005.patch, 
> HDFS-10756.006.patch, HDFS-10756.007.patch
>
>
> Currently, hadoop FileSystem API has 
> [getTrashRoot|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2708]
>  to determine trash directory at run time. Default trash dir is under 
> {{/user/$USER}}
> For an encrypted file, since moving files between, into, or out of EZs is not 
> allowed, when an EZ file is deleted via the CLI, it calls into the [DFS 
> implementation|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java#L2485]
>  to move the file to a trash directory under the same EZ.
> This works perfectly fine for CLI users or Java users who call the FileSystem 
> API. But for users of httpfs/webhdfs, there is currently no way to figure 
> out what the trash root would be. This jira proposes that we add such an 
> interface to httpfs and webhdfs.
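
The trash root rule described above can be sketched as a small path function. This is a simplified illustration and not the DistributedFileSystem code: TrashRootSketch is a hypothetical name, the encryption zone roots are passed in as plain strings, and the exact directory layout shown ("<zone>/.Trash/<user>" inside a zone, "/user/<user>/.Trash" otherwise) is an assumption that should be checked against the release in use.

```java
import java.util.List;

class TrashRootSketch {
  /**
   * Returns the trash root for a path: inside the enclosing encryption zone
   * if there is one, otherwise under the user's home directory.
   */
  static String trashRoot(String path, String user, List<String> zoneRoots) {
    for (String zone : zoneRoots) {
      String prefix = zone.endsWith("/") ? zone : zone + "/";
      if (path.equals(zone) || path.startsWith(prefix)) {
        // Files cannot be moved out of an EZ, so trash stays in the zone.
        return prefix + ".Trash/" + user;
      }
    }
    return "/user/" + user + "/.Trash";
  }
}
```

A REST client cannot run this logic itself because it does not know where the zone boundaries are, which is why the issue asks for the server to expose the answer.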






[jira] [Commented] (HDFS-13351) Revert HDFS-11156 from branch-2/branch-2.8

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992915#comment-16992915
 ] 

Hadoop QA commented on HDFS-13351:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 15s{color} 
| {color:red} HDFS-13351 does not apply to branch-2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13351 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918911/HDFS-13351-branch-2.003.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28496/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Revert HDFS-11156 from branch-2/branch-2.8
> --
>
> Key: HDFS-13351
> URL: https://issues.apache.org/jira/browse/HDFS-13351
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: HDFS-13351-branch-2.001.patch, 
> HDFS-13351-branch-2.002.patch, HDFS-13351-branch-2.003.patch
>
>
> Per discussion in HDFS-11156, let's revert the change from branch-2 and 
> branch-2.8. The new patch can be tracked in HDFS-12459.






[jira] [Commented] (HDFS-13351) Revert HDFS-11156 from branch-2/branch-2.8

2019-12-10 Thread Daryn Sharp (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992911#comment-16992911
 ] 

Daryn Sharp commented on HDFS-13351:


Prepping for 2.10.  This was never reverted.  It's completely broken; please 
finish the revert.

> Revert HDFS-11156 from branch-2/branch-2.8
> --
>
> Key: HDFS-13351
> URL: https://issues.apache.org/jira/browse/HDFS-13351
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: HDFS-13351-branch-2.001.patch, 
> HDFS-13351-branch-2.002.patch, HDFS-13351-branch-2.003.patch
>
>
> Per discussion in HDFS-11156, let's revert the change from branch-2 and 
> branch-2.8. A new patch can be tracked in HDFS-12459.






[jira] [Commented] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992868#comment-16992868
 ] 

Hadoop QA commented on HDFS-15045:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
3m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
3s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15045 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12988433/HDFS-15045.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cb6472647629 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 875a3e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28494/testReport/ |
| Max. process+thread count | 307 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28494/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> 

[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs

2019-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992867#comment-16992867
 ] 

Íñigo Goiri commented on HDFS-14901:


For single-destination mount points we are fine; for the others, I think this 
might be the same case as the delegation tokens.
[~crh], any suggestion? Should we do the federated token approach?

> RBF: Add Encryption Zone related ClientProtocol APIs
> 
>
> Key: HDFS-14901
> URL: https://issues.apache.org/jira/browse/HDFS-14901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch, 
> HDFS-14901.003.patch
>
>
> Currently the listEncryptionZones, reencryptEncryptionZone, and 
> listReencryptionStatus APIs are not implemented in the Router.
> This JIRA is intended to implement the above-mentioned APIs.






[jira] [Commented] (HDFS-14758) Decrease lease hard limit

2019-12-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992846#comment-16992846
 ] 

hemanthboyina commented on HDFS-14758:
--

Updated the patch with the checkstyle issue fixed; please review.

> Decrease lease hard limit
> -
>
> Key: HDFS-14758
> URL: https://issues.apache.org/jira/browse/HDFS-14758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14758.001.patch, HDFS-14758.002.patch, 
> HDFS-14758.003.patch, HDFS-14758.004.patch
>
>
> The hard limit is currently hard-coded to be 1 hour. This also determines the 
> NN automatic lease recovery interval. Something like 20 min will make more 
> sense.
> After the 5 min soft limit, other clients can recover the lease. If no one 
> else takes the lease away, the original client still can renew the lease 
> within the hard limit. So even after a NN full GC of 8 minutes, leases can 
> still be valid.
> However, there is one risk in reducing the hard limit. For example, if it is 
> reduced to 20 min and the NN crashes and the manual failover takes more than 
> 20 minutes, clients will abort.
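
The soft/hard limit behaviour described above can be sketched roughly as follows. This is only an illustrative model, not HDFS code: the class name LeaseLimitSketch, the State enum, and the classify method are hypothetical, and treating a lease as expired strictly after the hard limit is an assumption. Only the 5 min soft limit, the 1 hour and 20 min hard limits, and the 8 min GC pause come from the description.

```java
import java.time.Duration;

/** Hypothetical model of the lease limits described above; not HDFS code. */
final class LeaseLimitSketch {
  /** Soft limit from the issue description. */
  static final Duration SOFT_LIMIT = Duration.ofMinutes(5);

  enum State { ACTIVE, RECOVERABLE_BY_OTHERS, EXPIRED }

  /** Classifies a lease by the time elapsed since its last renewal. */
  static State classify(Duration sinceLastRenewal, Duration hardLimit) {
    if (sinceLastRenewal.compareTo(hardLimit) > 0) {
      return State.EXPIRED;               // past the hard limit
    }
    if (sinceLastRenewal.compareTo(SOFT_LIMIT) > 0) {
      return State.RECOVERABLE_BY_OTHERS; // other clients may take the lease
    }
    return State.ACTIVE;                  // original holder can simply renew
  }

  public static void main(String[] args) {
    Duration gcPause = Duration.ofMinutes(8);
    Duration failover = Duration.ofMinutes(25);
    Duration[] hardLimits = {Duration.ofHours(1), Duration.ofMinutes(20)};
    for (Duration hardLimit : hardLimits) {
      System.out.println("hard limit " + hardLimit.toMinutes() + " min:"
          + " 8 min GC -> " + classify(gcPause, hardLimit)
          + ", 25 min failover -> " + classify(failover, hardLimit));
    }
  }
}
```

In this model an 8 minute pause leaves the lease renewable under either hard limit, while a 25 minute failover only crosses the limit once it is reduced to 20 min, which is the risk the description points out.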






[jira] [Updated] (HDFS-14758) Decrease lease hard limit

2019-12-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14758:
-
Attachment: HDFS-14758.004.patch

> Decrease lease hard limit
> -
>
> Key: HDFS-14758
> URL: https://issues.apache.org/jira/browse/HDFS-14758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14758.001.patch, HDFS-14758.002.patch, 
> HDFS-14758.003.patch, HDFS-14758.004.patch
>
>
> The hard limit is currently hard-coded to be 1 hour. This also determines the 
> NN automatic lease recovery interval. Something like 20 min will make more 
> sense.
> After the 5 min soft limit, other clients can recover the lease. If no one 
> else takes the lease away, the original client still can renew the lease 
> within the hard limit. So even after a NN full GC of 8 minutes, leases can 
> still be valid.
> However, there is one risk in reducing the hard limit. For example, if it is 
> reduced to 20 min and the NN crashes and the manual failover takes more than 
> 20 minutes, clients will abort.






[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs

2019-12-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992840#comment-16992840
 ] 

hemanthboyina commented on HDFS-14901:
--

{quote}I have a doubt about GetDataEncryption: if the client wants a key from 
one namespace and we return one from another, how will this be handled? How 
can we ensure the client gets it from the correct NS?
{quote}
Any suggestions for this, [~elgoiri]?

> RBF: Add Encryption Zone related ClientProtocol APIs
> 
>
> Key: HDFS-14901
> URL: https://issues.apache.org/jira/browse/HDFS-14901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch, 
> HDFS-14901.003.patch
>
>
> Currently the listEncryptionZones, reencryptEncryptionZone, and 
> listReencryptionStatus APIs are not implemented in the Router.
> This JIRA is intended to implement the above-mentioned APIs.






[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2019-12-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992836#comment-16992836
 ] 

hemanthboyina commented on HDFS-6874:
-

I have tried to implement get_block_locations in HttpFS, but I am facing an 
issue.

When WebHDFS calls getblocklocations, the NN sends LocatedBlocks and WebHDFS 
parses them.

+HttpFS with WebHDFS+:

When HttpFSFileSystem calls the client for getblocklocations, HttpFS receives 
BlockLocations[], because the client converts the LocatedBlocks (received from 
the NN) to BlockLocations[].

So HttpFS sends BlockLocations[] to WebHDFS, which expects LocatedBlocks and 
fails to parse the response.

Any suggestions, [~elgoiri]?

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.06.patch, HDFS-6874.07.patch, 
> HDFS-6874.08.patch, HDFS-6874.09.patch, HDFS-6874.10.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>   case GETFILEBLOCKLOCATIONS: {
>     response = Response.status(Response.Status.BAD_REQUEST).build();
>     break;
>   }
>  






[jira] [Commented] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992819#comment-16992819
 ] 

Surendra Singh Lilhore commented on HDFS-15045:
---

+1, will wait for build.

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
>   if (!errorState.isRestartingNode()) {
>     LOG.info("Exception in createBlockOutputStream " + this, ie);
>   }
> {code}






[jira] [Updated] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15045:
--
Status: Patch Available  (was: Open)

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
>   if (!errorState.isRestartingNode()) {
>     LOG.info("Exception in createBlockOutputStream " + this, ie);
>   }
> {code}






[jira] [Commented] (HDFS-14997) BPServiceActor process command from NameNode asynchronously

2019-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992773#comment-16992773
 ] 

Íñigo Goiri commented on HDFS-14997:


+1 on [^HDFS-14997.005.patch].
This is pretty core, so I'll give it a couple days before committing in case 
somebody else has comments.

> BPServiceActor process command from NameNode asynchronously
> ---
>
> Key: HDFS-14997
> URL: https://issues.apache.org/jira/browse/HDFS-14997
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, 
> HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch
>
>
> There are two core functions in the #BPServiceActor main process flow: 
> report (#sendHeartbeat, #blockReport, #cacheReport) and #processCommand. If 
> #processCommand takes a long time, it blocks the report flow. #processCommand 
> can take a long time (over 1000s in the worst case I have seen) when the IO 
> load of the DataNode is very high. Since some IO operations run under 
> #datasetLock, processing some commands (such as #DNA_INVALIDATE) has to wait 
> a long time to acquire #datasetLock. In such cases, #heartbeat is not sent to 
> the NameNode in time, which triggers other disasters.
> I propose to make #processCommand asynchronous so that it does not block 
> #BPServiceActor from sending heartbeats back to the NameNode under high IO 
> load.
> Notes:
> 1. Lifeline could be one effective solution; however, some old branches do 
> not support this feature.
> 2. IO operations under #datasetLock are another issue; I think we should 
> solve it in another JIRA.
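
A minimal sketch of the idea in the description, assuming a simple producer/consumer split: the heartbeat path only enqueues commands, and a separate thread executes them. All names here (AsyncCommandSketch, heartbeat, heartbeatsDuringSlowCommand) are hypothetical and are not the BPServiceActor API or the attached patch; a Runnable stands in for a NameNode command, and a latch stands in for slow IO under a lock.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; names are illustrative, not the BPServiceActor API. */
final class AsyncCommandSketch {
  private final BlockingQueue<Runnable> commands = new LinkedBlockingQueue<>();
  private final AtomicInteger heartbeatsSent = new AtomicInteger();
  private final Thread worker;

  AsyncCommandSketch() {
    worker = new Thread(() -> {
      try {
        while (true) {
          commands.take().run(); // a slow command blocks only this thread
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // exit on shutdown
      }
    }, "command-processor");
    worker.setDaemon(true);
    worker.start();
  }

  /** Heartbeat path: record the heartbeat, enqueue any command, return at once. */
  void heartbeat(Runnable commandFromNameNode) {
    heartbeatsSent.incrementAndGet();
    if (commandFromNameNode != null) {
      commands.add(commandFromNameNode);
    }
  }

  int heartbeatsSent() {
    return heartbeatsSent.get();
  }

  void shutdown() {
    worker.interrupt();
  }

  /** Sends three heartbeats while the first command is still blocked. */
  static int heartbeatsDuringSlowCommand() {
    AsyncCommandSketch actor = new AsyncCommandSketch();
    CountDownLatch release = new CountDownLatch(1);
    actor.heartbeat(() -> {
      try {
        release.await(); // stands in for slow IO under a lock
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    actor.heartbeat(null);
    actor.heartbeat(null);
    int sent = actor.heartbeatsSent(); // the command has not been released yet
    release.countDown();
    actor.shutdown();
    return sent;
  }

  public static void main(String[] args) {
    System.out.println("heartbeats sent while command was blocked: "
        + heartbeatsDuringSlowCommand());
  }
}
```

A single worker thread keeps commands in arrival order, which seems the safest default for a sketch; whether the real change needs ordering guarantees or a bounded queue is not stated in the description.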






[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992766#comment-16992766
 ] 

Íñigo Goiri commented on HDFS-14854:


[~weichiu], marking it as experimental sounds good.

+1 on  [^HDFS-14854.014.patch].

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue and under-replicated 
> blocks from a future node or disk failure may wait for a long time before 
> they are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.






[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992765#comment-16992765
 ] 

Íñigo Goiri commented on HDFS-14908:


The improvement in performance is not too crazy but I guess fixing 
functionality is enough.
Let's fix the checkstyle though.

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, 
> HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, 
> HDFS-14908.009.patch, HDFS-14908.TestV4.patch, Test.java, TestV2.java, 
> TestV3.java
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is the prefix of the open files. We should check whether the filter path 
> is the parent/ancestor of the open files.
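
The difference between the two checks in the description can be illustrated with a small sketch. This is not the LeaseManager code or the attached patch: PathFilterSketch and both method names are hypothetical, and the helper assumes absolute, normalized paths with "/" as the separator.

```java
/** Hypothetical helper contrasting the two checks; not the LeaseManager code. */
final class PathFilterSketch {
  /** Plain string-prefix test, the behaviour the description reports. */
  static boolean isPrefix(String filter, String openFile) {
    return openFile.startsWith(filter);
  }

  /** True only if filter equals openFile or is one of its ancestor directories. */
  static boolean isAncestorOrSelf(String filter, String openFile) {
    if (filter.equals("/") || filter.equals(openFile)) {
      return true;
    }
    // Require a separator right after the filter so "/a/b" does not match "/a/bc".
    String dir = filter.endsWith("/") ? filter : filter + "/";
    return openFile.startsWith(dir);
  }

  public static void main(String[] args) {
    String filter = "/a/b";
    for (String file : new String[] {"/a/b/c.txt", "/a/bc/d.txt"}) {
      System.out.println(file + ": prefix=" + isPrefix(filter, file)
          + ", ancestor=" + isAncestorOrSelf(filter, file));
    }
  }
}
```

With the filter /a/b, the prefix test also accepts /a/bc/d.txt, which is a sibling directory rather than a descendant; the ancestor test rejects it.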






[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-12-10 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992733#comment-16992733
 ] 

Wei-Chiu Chuang commented on HDFS-14854:


I think this patch is ready to go as long as we mark it as an "experimental" 
feature and users are expected to hit snags.
I plan to commit it to trunk for the 3.3.0 release.

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue and under-replicated 
> blocks from a future node or disk failure may wait for a long time before 
> they are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.






[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992735#comment-16992735
 ] 

Hadoop QA commented on HDFS-14908:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
13s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 275 unchanged - 0 fixed = 276 total (was 275) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14908 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12988423/HDFS-14908.009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 91ddbf4a8a2d 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d4bde13 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28492/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28492/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 

[jira] [Commented] (HDFS-14861) Reset LowRedundancyBlocks Iterator periodically

2019-12-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992724#comment-16992724
 ] 

Stephen O'Donnell commented on HDFS-14861:
--

I discussed this issue with Wei-Chiu offline and we had a couple of concerns:

1. If we are resetting the iterator periodically, and there are a lot of 
missing blocks, how will that affect things?

2. Is there any better way we can detect if the iterator needs a reset?

For (1) - missing / corrupt blocks go into the lowest priority queue which is 
not accessed by the iterators in question here. The iterators are used in 
chooseLowRedundancyBlocks:

{code}
  synchronized List<List<BlockInfo>> chooseLowRedundancyBlocks(
      int blocksToProcess) {
    final List<List<BlockInfo>> blocksToReconstruct = new ArrayList<>(LEVEL);

    int count = 0;
    int priority = 0;
    for (; count < blocksToProcess && priority < LEVEL; priority++) {
      if (priority == QUEUE_WITH_CORRUPT_BLOCKS) {
        // do not choose corrupted blocks.
        continue;
      }

      // Go through all blocks that need reconstructions with current priority.
      // Set the iterator to the first unprocessed block at this priority level
      final Iterator<BlockInfo> i = priorityQueues.get(priority).getBookmark();
      ...
{code}

The corrupt / missing blocks will all be in QUEUE_WITH_CORRUPT_BLOCKS and hence 
are not processed by this method. Therefore we don't need to worry about them 
with this change.

For (2) - it is difficult to come up with something other than a time-based 
metric. The reason is that each queue is effectively a doubly linked list and 
the iterator bookmark just points to the next element. Given that element, we 
have no knowledge as to how many blocks are behind that point, or ahead of it. 
Ideally we want to reset the iterator if there is some threshold of blocks 
behind the pointer, as those are the blocks which got skipped for some reason. 
The only way to see how many blocks are behind is to read the list from the 
start until you encounter the same element as the iterator returns, which would 
not be very efficient. The easy solution is to simply reset the iterator to the 
start after some amount of time, but it's hard to know what the best period of 
time would be.
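
To make the time-based reset concrete, here is a rough sketch of a single queue with a bookmark that is rewound once a configured interval has elapsed. It is only a model under stated assumptions, not the LowRedundancyBlocks implementation or the attached patch: BookmarkedQueueSketch, choose, and resetIntervalMillis are hypothetical names, an ArrayList with an index stands in for the linked list and iterator, the clock is passed in as a parameter, and items are never removed, so every item behaves like a block that was skipped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one priority queue with a bookmark; not the HDFS class. */
final class BookmarkedQueueSketch<T> {
  private final List<T> items = new ArrayList<>();
  private final long resetIntervalMillis;
  private int bookmark = 0;     // index of the next unprocessed element
  private long lastResetMillis; // time the bookmark was last rewound

  BookmarkedQueueSketch(long resetIntervalMillis, long nowMillis) {
    this.resetIntervalMillis = resetIntervalMillis;
    this.lastResetMillis = nowMillis;
  }

  void add(T item) {
    items.add(item);
  }

  /** Returns up to n items from the bookmark, rewinding first if needed. */
  List<T> choose(int n, long nowMillis) {
    boolean intervalElapsed = nowMillis - lastResetMillis >= resetIntervalMillis;
    boolean endReached = bookmark >= items.size();
    if (intervalElapsed || endReached) {
      bookmark = 0; // skipped items at the front get another chance
      lastResetMillis = nowMillis;
    }
    List<T> chosen = new ArrayList<>();
    while (chosen.size() < n && bookmark < items.size()) {
      chosen.add(items.get(bookmark++));
    }
    return chosen;
  }

  public static void main(String[] args) {
    BookmarkedQueueSketch<String> queue = new BookmarkedQueueSketch<>(1000L, 0L);
    for (int i = 1; i <= 5; i++) {
      queue.add("b" + i);
    }
    System.out.println(queue.choose(2, 0L));    // from the start of the queue
    System.out.println(queue.choose(2, 100L));  // continues from the bookmark
    System.out.println(queue.choose(2, 1000L)); // interval elapsed: rewinds first
  }
}
```

The sketch only needs a timestamp per queue, which matches the point above that counting the blocks behind the bookmark would require walking the list from the start.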

It may be useful to create a command to dump the contents of 
LowRedundancyBlocks in a separate Jira, which could give some further insights 
into the queues, especially if decommission is stuck for seemingly no reason, 
and also let us see how often this problem occurs on a real cluster.

> Reset LowRedundancyBlocks Iterator periodically
> ---
>
> Key: HDFS-14861
> URL: https://issues.apache.org/jira/browse/HDFS-14861
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: decommission
> Attachments: HDFS-14861.001.patch, HDFS-14861.002.patch
>
>
> When the namenode needs to schedule blocks for reconstruction, the blocks are 
> placed into the neededReconstruction object in the BlockManager. This is an 
> instance of LowRedundancyBlocks, which maintains a list of priority queues 
> where the blocks are held until they are scheduled for reconstruction / 
> replication.
> Every 3 seconds, by default, a number of blocks are retrieved from 
> LowRedundancyBlocks. The method 
> LowRedundancyBlocks.chooseLowRedundancyBlocks() is used to retrieve the next 
> set of blocks using a bookmarked iterator. Each call to this method moves the 
> iterator forward. The number of blocks retrieved is governed by the formula:
> number_of_live_nodes * dfs.namenode.replication.work.multiplier.per.iteration 
> (default 2)
> Then the namenode attempts to schedule those blocks on datanodes, but each 
> datanode has a limit of how many blocks can be queued against it (controlled 
> by dfs.namenode.replication.max-streams) so all of the retrieved blocks may 
> not be scheduled. There may be other block availability reasons the blocks 
> are not scheduled too.
> As the iterator in chooseLowRedundancyBlocks() always moves forward, the 
> blocks which were not scheduled are not retried until the end of the queue is 
> reached and the iterator is reset.
> If the replication queue is very large (e.g. several nodes are being 
> decommissioned) or if blocks are being continuously added to the replication 
> queue (e.g. nodes decommission using the proposal in HDFS-14854) it may take a 
> very long time for the iterator to be reset to the start.
> The result of this could be a few blocks for a node that is decommissioning 
> or entering maintenance mode getting left behind, taking many hours or even 
> days to be retried, and this could stop decommission completing.
> With this Jira, I would 

[jira] [Comment Edited] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992719#comment-16992719
 ] 

Erik Krogen edited comment on HDFS-15032 at 12/10/19 4:35 PM:
--

Thanks for the info [~shv], good to know. I've removed the {{toString()}} stuff 
in v5.

After seeing the Jenkins failure, I experimented and found that the new test 
timed out half of the time (5 of 10 runs) when run on my machine, but it 
succeeded every time when I increased the timeout to 2 minutes. I think it just 
needs longer since there is more overhead involved with the failure handling.

To avoid spurious failures, I increased the timeout for the failure test to 3 
minutes, and for the non-failure observer test to 2 minutes.


was (Author: xkrogen):
Thanks for the info [~shv], good to know. I've removed the {{toString()}} stuff 
in v5.

After seeing the Jenkins failure, I experimented and found that the new test 
timing out half of the time (5 of 10 runs) when run on my machine, but it 
succeeded every time when I increased the timeout to 2 minutes. I think it just 
needs longer since there is more overhead involved with the failure handling.

To avoid spurious failures, I increased the timeout for the failure test to 3 
minutes, and for the non-failure observer test to 2 minutes.

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer  mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch, HDFS-15032.003.patch, HDFS-15032.004.patch, 
> HDFS-15032.005.patch, debugger_with_tostring.png, 
> debugger_without_tostring.png
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992719#comment-16992719
 ] 

Erik Krogen commented on HDFS-15032:


Thanks for the info [~shv], good to know. I've removed the {{toString()}} stuff 
in v5.

After seeing the Jenkins failure, I experimented and found that the new test 
timed out half of the time (5 of 10 runs) when run on my machine, but it 
succeeded every time when I increased the timeout to 2 minutes. I think it just 
needs longer since there is more overhead involved with the failure handling.

To avoid spurious failures, I increased the timeout for the failure test to 3 
minutes, and for the non-failure observer test to 2 minutes.

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer  mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch, HDFS-15032.003.patch, HDFS-15032.004.patch, 
> HDFS-15032.005.patch, debugger_with_tostring.png, 
> debugger_without_tostring.png
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Updated] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-10 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-15032:
---
Attachment: HDFS-15032.005.patch

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer  mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch, HDFS-15032.003.patch, HDFS-15032.004.patch, 
> HDFS-15032.005.patch, debugger_with_tostring.png, 
> debugger_without_tostring.png
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Updated] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15045:
--
Attachment: HDFS-15045.001.patch

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Updated] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-10 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-14908:
---
Attachment: HDFS-14908.009.patch

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, 
> HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, 
> HDFS-14908.009.patch, HDFS-14908.TestV4.patch, Test.java, TestV2.java, 
> TestV3.java
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is the prefix of the open files. We should check whether the filter path 
> is the parent/ancestor of the open files.






[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-10 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992586#comment-16992586
 ] 

Jinglun commented on HDFS-14908:


Thanks [~elgoiri] and [~hemanthboyina] for your comments! Sorry for my late 
response. I did a test and the result shows DFSUtil.isParentEntry() and 
String.startsWith() perform nearly the same.
||Time||1,000,000,000||
|DFSUtil.isParentEntry()|6077ms|
|String.startsWith()|6590ms|

 

Followed [~elgoiri]'s suggestions and uploaded v09.
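
To show why a plain prefix check is not enough, here is a small sketch of the 
two kinds of match. It is illustrative only and is not the code of 
DFSUtil.isParentEntry() or of the patch; OpenFileFilterSketch and both method 
names are made up for this example.

```java
/** Illustrative only; not the HDFS implementation. */
final class OpenFileFilterSketch {
  private OpenFileFilterSketch() {}

  /** Plain prefix match: also matches sibling paths such as /a/bc. */
  static boolean prefixMatch(String filter, String path) {
    return path.startsWith(filter);
  }

  /** True if filter equals path or is a parent/ancestor directory of it. */
  static boolean ancestorMatch(String filter, String path) {
    if (!path.startsWith(filter)) {
      return false;
    }
    if (path.length() == filter.length() || filter.endsWith("/")) {
      return true;
    }
    // The character right after the prefix must be a path separator.
    return path.charAt(filter.length()) == '/';
  }
}
```

With the filter /a/b, the open file /a/bc passes prefixMatch but is rejected 
by ancestorMatch, while /a/b/c is accepted by both.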

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, 
> HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, 
> HDFS-14908.TestV4.patch, Test.java, TestV2.java, TestV3.java
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is the prefix of the open files. We should check whether the filter path 
> is the parent/ancestor of the open files.






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992496#comment-16992496
 ] 

Takanobu Asanuma commented on HDFS-14983:
-

Thanks for working on this, [~risyomei]. The 004 patch looks almost good. I 
have one comment.

 * It would be better if the result of the dfsrouteradmin command used the 
return value of {{RouterAdminServer#refreshSuperUserGroupsConfiguration()}}.
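
As a rough sketch of the idea, the command could map the boolean returned by 
the server to its exit code. RefreshClientSketch and RefreshCommandSketch are 
hypothetical names for illustration and are not the real RouterAdmin or 
client API.

```java
/** Hypothetical stand-in for the router admin RPC; not the real client API. */
interface RefreshClientSketch {
  boolean refreshSuperUserGroupsConfiguration();
}

final class RefreshCommandSketch {
  private RefreshCommandSketch() {}

  /** Maps the server's boolean result to a shell exit code. */
  static int run(RefreshClientSketch client, String address) {
    if (client.refreshSuperUserGroupsConfiguration()) {
      System.out.println(
          "Successfully updated superuser proxy groups on router " + address);
      return 0;
    }
    // The server reported failure, so the command fails as well.
    return -1;
  }
}
```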

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.004.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-14852) Remove of LowRedundancyBlocks do NOT remove the block from all queues

2019-12-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992474#comment-16992474
 ] 

Stephen O'Donnell commented on HDFS-14852:
--

I think that we should try to delete from all queues when the block is not 
found at the passed in LEVEL.

For deletes, as it passes level as "LowRedundancyBlocks.LEVEL", it would 
therefore always attempt to delete from all queues as it does not check a 
specific level at all.

We should see if [~kihwal] agrees with this approach too, as he previously 
suggested a different approach.

> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> -
>
> Key: HDFS-14852
> URL: https://issues.apache.org/jira/browse/HDFS-14852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, 
> HDFS-14852.002.patch, HDFS-14852.003.patch, HDFS-14852.004.patch, 
> HDFS-14852.005.patch, screenshot-1.png
>
>
> LowRedundancyBlocks.java
> {code:java}
> // Some comments here
> if(priLevel >= 0 && priLevel < LEVEL
> && priorityQueues.get(priLevel).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>   + " from priority queue {}",
>   block, priLevel);
>   decrementBlockStat(block, priLevel, oldExpectedReplicas);
>   return true;
> } else {
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
>   for (int i = 0; i < LEVEL; i++) {
> if (i != priLevel && priorityQueues.get(i).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>   " {} from priority queue {}", block, i);
>   decrementBlockStat(block, i, oldExpectedReplicas);
>   return true;
> }
>   }
> }
> return false;
>   }
> {code}
> The source code is above; the comment reads as follows:
> {quote}
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
> {quote}
> The function "remove" does NOT remove the block from all queues.
> The add function from LowRedundancyBlocks.java is used in some places, so 
> one block may be in two or more queues.
> We found that corrupt blocks mismatch corrupt files on the NN web UI. Maybe 
> it is related to this.
> Uploaded an initial patch.






[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992466#comment-16992466
 ] 

Hadoop QA commented on HDFS-15041:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m  8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  3s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 33s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 19s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988400/HDFS-15041.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 1de984346370 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c473337 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/28491/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | 

[jira] [Commented] (HDFS-14852) Remove of LowRedundancyBlocks do NOT remove the block from all queues

2019-12-10 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992433#comment-16992433
 ] 

Fei Hui commented on HDFS-14852:


[~sodonnell] Thanks.
Do you mean that we should remove it from all queues for deletes? 
If there are no other comments, we will move forward like this.

> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> -
>
> Key: HDFS-14852
> URL: https://issues.apache.org/jira/browse/HDFS-14852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, 
> HDFS-14852.002.patch, HDFS-14852.003.patch, HDFS-14852.004.patch, 
> HDFS-14852.005.patch, screenshot-1.png
>
>
> LowRedundancyBlocks.java
> {code:java}
> // Some comments here
> if(priLevel >= 0 && priLevel < LEVEL
> && priorityQueues.get(priLevel).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>   + " from priority queue {}",
>   block, priLevel);
>   decrementBlockStat(block, priLevel, oldExpectedReplicas);
>   return true;
> } else {
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
>   for (int i = 0; i < LEVEL; i++) {
> if (i != priLevel && priorityQueues.get(i).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>   " {} from priority queue {}", block, i);
>   decrementBlockStat(block, i, oldExpectedReplicas);
>   return true;
> }
>   }
> }
> return false;
>   }
> {code}
> The source code is above; the comment reads as follows:
> {quote}
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
> {quote}
> The function "remove" does NOT remove the block from all queues.
> The add function from LowRedundancyBlocks.java is used in some places, so 
> one block may be in two or more queues.
> We found that corrupt blocks mismatch corrupt files on the NN web UI. Maybe 
> it is related to this.
> Uploaded an initial patch.






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992412#comment-16992412
 ] 

Hadoop QA commented on HDFS-14983:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  3m  9s{color} | {color:red} hadoop-hdfs-project generated 3 new + 16 unchanged - 3 fixed = 19 total (was 19) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 26s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 30s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m  9s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}198m 37s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14983 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988391/HDFS-14983.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| 

[jira] [Commented] (HDFS-14852) Remove of LowRedundancyBlocks do NOT remove the block from all queues

2019-12-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992362#comment-16992362
 ] 

Stephen O'Donnell commented on HDFS-14852:
--

Thinking on this issue a little more, I believe it makes sense to delete the 
block from all queues if a match is not found in the queue for the passed in 
'LEVEL'. It simplifies the code a little and makes the intent of the method 
easier to understand.

My reasoning is:

1) For deletes, the majority of the time, there will be nothing in the 
lowRedundancyQueue, so right now, it will iterate over all queues. In the rare 
case, it will stop after finding an entry. Optimising this by first deleting 
from CORRUPT (1 search) and then maybe finding something in the other 4 queues 
(average 2 searches) will likely result in searching all queues most of the 
time anyway.

2) For non-deletes, the calls to remove pass the LEVEL which is usually 
correct, except in rare circumstances. Therefore it will get an exact match on 
the suggested queue and not iterate any queues, but in the rare case when LEVEL 
is not correct, we save little by stopping the search early.

Therefore I think something like this would work well with basically the same 
performance as the existing code:

{code}
  boolean remove(BlockInfo block, int priLevel, int oldExpectedReplicas) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      NameNode.blockStateChangeLog.debug(
          "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
              + " from priority queue {}",
          block, priLevel);
      decrementBlockStat(block, priLevel, oldExpectedReplicas);
      return true;
    } else {
      // Try to remove the block from all queues if the block was
      // not found in the queue for the given priority level.
      boolean found = false;
      for (int i = 0; i < LEVEL; i++) {
        if (i != priLevel && priorityQueues.get(i).remove(block)) {
          NameNode.blockStateChangeLog.debug(
              "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
                  " {} from priority queue {}", block, i);
          decrementBlockStat(block, i, oldExpectedReplicas);
          found = true;
        }
      }
      return found;
    }
  }
{code}
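
To make the behaviour easy to try outside the NameNode, the same control flow 
can be modelled in isolation. This is a simplified sketch under stated 
assumptions: RemoveAllQueuesSketch is a made-up class, each queue is a plain 
Set of block ID strings, LEVEL is taken as 5 (the corrupt queue plus the other 
4 mentioned above), and logging and block stats are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified model: each queue is a Set of block IDs (an assumption). */
final class RemoveAllQueuesSketch {
  static final int LEVEL = 5;
  final List<Set<String>> priorityQueues = new ArrayList<>();

  RemoveAllQueuesSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  boolean remove(String block, int priLevel) {
    // Fast path: the caller's suggested level is usually correct.
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    // Otherwise keep scanning every queue instead of stopping at the
    // first hit, so a block present in several queues is fully removed.
    boolean found = false;
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        found = true;
      }
    }
    return found;
  }
}
```

Passing LEVEL as priLevel, as the delete path does, skips the fast path and 
always scans every queue.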

Adding the other change to blockManager makes sense too:

{code}
-  if (bi == null) {
+  if (bi == null || bi.isDeleted()) {
{code}

> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> -
>
> Key: HDFS-14852
> URL: https://issues.apache.org/jira/browse/HDFS-14852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, 
> HDFS-14852.002.patch, HDFS-14852.003.patch, HDFS-14852.004.patch, 
> HDFS-14852.005.patch, screenshot-1.png
>
>
> LowRedundancyBlocks.java
> {code:java}
> // Some comments here
> if(priLevel >= 0 && priLevel < LEVEL
> && priorityQueues.get(priLevel).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>   + " from priority queue {}",
>   block, priLevel);
>   decrementBlockStat(block, priLevel, oldExpectedReplicas);
>   return true;
> } else {
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
>   for (int i = 0; i < LEVEL; i++) {
> if (i != priLevel && priorityQueues.get(i).remove(block)) {
>   NameNode.blockStateChangeLog.debug(
>   "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>   " {} from priority queue {}", block, i);
>   decrementBlockStat(block, i, oldExpectedReplicas);
>   return true;
> }
>   }
> }
> return false;
>   }
> {code}
> The source code is above; the comment in it reads as follows:
> {quote}
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
> {quote}
> The function "remove" does NOT remove the block from all queues.
> The function "add" in LowRedundancyBlocks.java is called in several places, 
> so one block may end up in two or more queues.
> We found that the corrupt block count does not match the corrupt files shown 
> on the NN web UI. This may be related.
> Uploaded an initial patch.




[jira] [Commented] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992341#comment-16992341
 ] 

Surendra Singh Lilhore commented on HDFS-15045:
---

Sometimes only WARN-level logging is enabled on the client side, so the user 
cannot see the reason for the pipeline failure.
{code:java}
2019-12-09 21:33:50,214 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1088983977_15246062
java.net.BindException: Cannot assign requested address
 at sun.nio.ch.Net.connect0(Native Method)
 at sun.nio.ch.Net.connect(Net.java:454)
 at sun.nio.ch.Net.connect(Net.java:446)
 at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
 at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:255)
 at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1789)
 at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1743)
 at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718){code}
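
The effect can be reproduced outside Hadoop with a small sketch. It uses `java.util.logging` as a stand-in for the client's real logging setup, which is an assumption made only to keep the example self-contained; the class and method names are hypothetical. With the logger threshold at WARNING, the same exception is dropped when logged at INFO and kept when logged at WARNING.

```java
import java.net.BindException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Illustrative only: java.util.logging stands in for the client logging stack.
public class WarnLogSketch {

  // Returns true if the exception message reaches a handler when the
  // logger only lets WARNING and above through.
  static boolean visibleAtWarn(boolean logAsWarn) {
    Logger log = Logger.getLogger("WarnLogSketch." + logAsWarn);
    log.setUseParentHandlers(false);
    log.setLevel(Level.WARNING); // "only warn log is enabled"

    final List<LogRecord> seen = new ArrayList<>();
    Handler capture = new Handler() {
      @Override public void publish(LogRecord record) { seen.add(record); }
      @Override public void flush() { }
      @Override public void close() { }
    };
    log.addHandler(capture);

    BindException ie = new BindException("Cannot assign requested address");
    Level level = logAsWarn ? Level.WARNING : Level.INFO;
    log.log(level, "Exception in createBlockOutputStream", ie);

    log.removeHandler(capture);
    return !seen.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println("logged at INFO, visible: " + visibleAtWarn(false));
    System.out.println("logged at WARN, visible: " + visibleAtWarn(true));
  }
}
```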

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Assigned] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15045:
-

Assignee: Ravuri Sushma sree

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> {code:java}
> } catch (IOException ie) {
> if (!errorState.isRestartingNode()) {
>   LOG.info("Exception in createBlockOutputStream " + this, ie);
> } {code}






[jira] [Created] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Surendra Singh Lilhore (Jira)
Surendra Singh Lilhore created HDFS-15045:
-

 Summary: DataStreamer#createBlockOutputStream() should log exception in warn.
 Key: HDFS-15045
 URL: https://issues.apache.org/jira/browse/HDFS-15045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: dfsclient
Affects Versions: 3.1.1
Reporter: Surendra Singh Lilhore


{code:java}
} catch (IOException ie) {
if (!errorState.isRestartingNode()) {
  LOG.info("Exception in createBlockOutputStream " + this, ie);
} {code}






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
  Attachment: HDFS-15041.002.patch
Release Note: fix checkstyle
  Status: Patch Available  (was: Open)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are hard-coded. But 
> different clusters have different requirements for latency and for the 
> queue health standard. It would be better to make these two parameters 
> configurable.
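
As a rough illustration of the request, the sketch below reads two limits from a configuration object and falls back to built-in defaults when no value is set. `java.util.Properties` stands in for Hadoop's configuration class, and the key names and default values are hypothetical; none of them are taken from the attached patches.

```java
import java.util.Properties;

// Illustrative only: key names and defaults below are made up for this sketch.
public class ConfigurableLimitsSketch {
  // Hypothetical configuration keys.
  static final String MAX_LOCK_HOLD_MS_KEY = "example.max-lock-hold-ms";
  static final String QUEUE_FULL_SIZE_KEY = "example.queue-full-size";

  // Hypothetical defaults, used when the key is absent.
  static final long MAX_LOCK_HOLD_MS_DEFAULT = 4L;
  static final int QUEUE_FULL_SIZE_DEFAULT = 1024;

  // Reads a long value, falling back to the default when the key is unset.
  static long getLong(Properties conf, String key, long defaultValue) {
    String value = conf.getProperty(key);
    return value == null ? defaultValue : Long.parseLong(value.trim());
  }

  // Reads an int value, falling back to the default when the key is unset.
  static int getInt(Properties conf, String key, int defaultValue) {
    String value = conf.getProperty(key);
    return value == null ? defaultValue : Integer.parseInt(value.trim());
  }

  // Convenience for the demo: pass null to simulate "not configured".
  static long lockHoldFor(String override) {
    Properties conf = new Properties();
    if (override != null) {
      conf.setProperty(MAX_LOCK_HOLD_MS_KEY, override);
    }
    return getLong(conf, MAX_LOCK_HOLD_MS_KEY, MAX_LOCK_HOLD_MS_DEFAULT);
  }

  public static void main(String[] args) {
    System.out.println("default lock hold ms: " + lockHoldFor(null));
    System.out.println("tuned lock hold ms: " + lockHoldFor("10"));
    System.out.println("default queue size: "
        + getInt(new Properties(), QUEUE_FULL_SIZE_KEY, QUEUE_FULL_SIZE_DEFAULT));
  }
}
```

Keeping the old constants as defaults means clusters that set nothing would behave as before, while latency-sensitive clusters could tune the values.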






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Status: Open  (was: Patch Available)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are hard-coded. But 
> different clusters have different requirements for latency and for the 
> queue health standard. It would be better to make these two parameters 
> configurable.






[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992302#comment-16992302
 ] 

Hadoop QA commented on HDFS-15041:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 13s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 559 unchanged - 0 fixed = 560 total (was 559) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 27s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988388/HDFS-15041.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 87ae6c24d58a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4dffd81 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs |