[jira] [Commented] (HDFS-15169) RBF: Router FSCK should consider the mount table

2020-03-31 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072403#comment-17072403
 ] 

Xiaoqiao He commented on HDFS-15169:


Thanks [~ayushtkn] for your helpful review comments.
{quote}Presently in the output, The path name is coming as of destination:
{quote}
Exactly, it is unexpected. This output is printed by the NameNode, while the 
path has already been resolved at the Router. Do we need to replace it at the 
Router when the result comes back from the NN? I am not sure that would be 
graceful. After a rough scan, it seems that DFSck without RBF has the same 
issue? Any suggestions?
The original idea of redirecting the request to all active downstream NNs was 
to cover paths inside a mount point. It does not seem very accurate, especially 
for FNF (file not found) paths.
As for failover, would it be proper to fix it with a retry?
The other comments are good catches. I will fix them later. Thanks again. 

> RBF: Router FSCK should consider the mount table
> 
>
> Key: HDFS-15169
> URL: https://issues.apache.org/jira/browse/HDFS-15169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15169.001.patch, HDFS-15169.002.patch, 
> HDFS-15169.003.patch, HDFS-15169.004.patch, HDFS-15169.005.patch
>
>
> HDFS-13989 implemented FSCK to DFSRouter, however, it just redirects the 
> requests to all the active downstream NameNodes for now. The DFSRouter should 
> consider the mount table when redirecting the requests.
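
The idea in the description can be illustrated with a minimal sketch, assuming a mount table that maps source path prefixes to nameservice IDs. The class `MountTableSketch`, its entries, and the nameservice names `ns0`/`ns1` are hypothetical and are not the Router's actual resolver API. The sketch only shows that an fsck request for a path could be sent to the nameservice that owns the longest matching mount point, instead of to every active NameNode.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical illustration only; not the RBF mount table resolver. */
final class MountTableSketch {
  // Mount point (source path) -> nameservice ID. All entries are made up.
  private final TreeMap<String, String> mounts = new TreeMap<>();

  void addMount(String sourcePath, String nameservice) {
    mounts.put(sourcePath, nameservice);
  }

  /** Returns the nameservice of the longest mount point containing path. */
  Optional<String> resolve(String path) {
    String best = null;
    for (Map.Entry<String, String> e : mounts.entrySet()) {
      String mount = e.getKey();
      // Match on whole path components so "/data" does not match "/database".
      boolean matches = mount.equals("/")
          || path.equals(mount)
          || path.startsWith(mount + "/");
      if (matches && (best == null || mount.length() > best.length())) {
        best = mount;
      }
    }
    return Optional.ofNullable(best).map(mounts::get);
  }

  public static void main(String[] args) {
    MountTableSketch table = new MountTableSketch();
    table.addMount("/data", "ns0");
    table.addMount("/data/logs", "ns1");
    System.out.println(table.resolve("/data/logs/app.log")); // Optional[ns1]
    System.out.println(table.resolve("/data/file"));         // Optional[ns0]
    System.out.println(table.resolve("/tmp/x"));             // Optional.empty
  }
}
```

A path that matches no mount point resolves to an empty result here; how the Router should treat such paths is exactly the open question in the comment above.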



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-31 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072393#comment-17072393
 ] 

Xiaoqiao He commented on HDFS-15051:


Thanks [~ayushtkn] and [~elgoiri] for your comments. v010 adds javadoc for 
#addMountTableEntry, removes the return value of #checkMountTablePermission, 
and renames it to #checkMountTableEntryPermission. PTAL. Thanks.

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch, 
> HDFS-15051.009.patch, HDFS-15051.010.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict. 
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission check.
> The following code segment tries to check permission when operating on a 
> MountTableEntry. However, the mountTable object comes from the 
> Client/RouterAdmin ({{MountTable mountTable = request.getEntry();}}), so a 
> user could pass any mode and thereby bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>       throws AccessControlException {
>     if (isSuperUser()) {
>       return;
>     }
>     FsPermission mode = mountTable.getMode();
>     if (getUser().equals(mountTable.getOwnerName())
>         && mode.getUserAction().implies(access)) {
>       return;
>     }
>     if (isMemberOfGroup(mountTable.getGroupName())
>         && mode.getGroupAction().implies(access)) {
>       return;
>     }
>     if (!getUser().equals(mountTable.getOwnerName())
>         && !isMemberOfGroup(mountTable.getGroupName())
>         && mode.getOtherAction().implies(access)) {
>       return;
>     }
>     throw new AccessControlException(
>         "Permission denied while accessing mount table "
>             + mountTable.getSourcePath()
>             + ": user " + getUser() + " does not have " + access.toString()
>             + " permissions.");
>   }
> {code}
> I propose restricting the WRITE privilege on MountTableEntry to the super 
> user only.
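
The weakness described above can be shown with a small self-contained sketch. The types below (`PermissionSketch`, `Entry`) are hypothetical stand-ins, not the Router's classes, and the boolean `ownerCanWrite` is a simplification of the real mode bits. The sketch only illustrates that a check driven by the owner and mode carried in the request can be satisfied by the caller, whereas a check that allows writes for the super user only cannot.

```java
/** Hypothetical stand-ins for illustration; not the RBF classes. */
final class PermissionSketch {

  /** A mount table entry as sent by a client: owner and mode are caller-chosen. */
  static final class Entry {
    final String owner;
    final boolean ownerCanWrite;

    Entry(String owner, boolean ownerCanWrite) {
      this.owner = owner;
      this.ownerCanWrite = ownerCanWrite;
    }
  }

  /** Mirrors the flawed pattern: trusts the owner and mode in the request. */
  static boolean writeAllowedTrustingRequest(
      String user, boolean superUser, Entry requested) {
    if (superUser) {
      return true;
    }
    return user.equals(requested.owner) && requested.ownerCanWrite;
  }

  /** The proposal in this issue: only the super user may write. */
  static boolean writeAllowedSuperUserOnly(boolean superUser) {
    return superUser;
  }

  public static void main(String[] args) {
    // A non-super user names themselves as owner and grants themselves write.
    Entry forged = new Entry("mallory", true);
    System.out.println(writeAllowedTrustingRequest("mallory", false, forged)); // true
    System.out.println(writeAllowedSuperUserOnly(false));                      // false
  }
}
```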






[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-31 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15051:
---
Attachment: HDFS-15051.010.patch

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch, 
> HDFS-15051.009.patch, HDFS-15051.010.patch
>
>






[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-31 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15051:
---
Attachment: (was: HDFS-15051.010.patch)

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch, 
> HDFS-15051.009.patch
>
>






[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-31 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15051:
---
Attachment: HDFS-15051.010.patch

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch, 
> HDFS-15051.009.patch, HDFS-15051.010.patch
>
>






[jira] [Updated] (HDFS-15248) Make the maximum number of ACLs entries configurable

2020-03-31 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15248:

Resolution: Not A Problem
Status: Resolved  (was: Patch Available)

> Make the maximum number of ACLs entries configurable
> 
>
> Key: HDFS-15248
> URL: https://issues.apache.org/jira/browse/HDFS-15248
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15248.001.patch, HDFS-15248.002.patch, 
> HDFS-15248.patch
>
>
> For big clusters, the hardcoded maximum of 32 ACL entries is not enough; 
> make it configurable.






[jira] [Commented] (HDFS-15248) Make the maximum number of ACLs entries configurable

2020-03-31 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072287#comment-17072287
 ] 

Yang Yun commented on HDFS-15248:
-

Thanks [~sodonnell] and  [~weichiu]  for the clarifications.
Thanks [~elgoiri] for the review.
I'll close this jira.

> Make the maximum number of ACLs entries configurable
> 
>
> Key: HDFS-15248
> URL: https://issues.apache.org/jira/browse/HDFS-15248
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15248.001.patch, HDFS-15248.002.patch, 
> HDFS-15248.patch
>
>
> For big clusters, the hardcoded maximum of 32 ACL entries is not enough; 
> make it configurable.






[jira] [Commented] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-03-31 Thread Dinesh Chitlangia (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072267#comment-17072267
 ] 

Dinesh Chitlangia commented on HDFS-15253:
--

[~kpalanisamy] - Thanks for filing this jira. Yes, restricting the bandwidth to 
50 MB/s makes sense. When I work with customers who are running HDFS at scale, 
that is the first thing I recommend to them.

Regarding dfs.image.compress, I have fairly little experience and have not seen 
much benefit from it other than reduced file size.

dfs.namenode.checkpoint.txns can vary based on the cluster usage. So no matter 
what value is set as default, there will always be a large set of users who 
would still have to tune it based on their cluster usage. So I would recommend 
we leave it at the default of 1M.
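
As a rough sanity check on the 50 MB/s figure, the arithmetic can be sketched as follows. This assumes binary units (1 MB = 1024 * 1024 bytes), a 25 GB image as mentioned in the issue description, and no protocol overhead; the class and method names are hypothetical and only serve the calculation.

```java
/** Back-of-the-envelope estimate; names and assumptions are illustrative. */
final class ImageTransferEstimate {

  /** Seconds needed to move imageBytes at bytesPerSec, rounded up. */
  static long transferSeconds(long imageBytes, long bytesPerSec) {
    if (bytesPerSec <= 0) {
      // This helper only models a throttled transfer.
      throw new IllegalArgumentException("bytesPerSec must be positive");
    }
    return (imageBytes + bytesPerSec - 1) / bytesPerSec;
  }

  public static void main(String[] args) {
    long throttle = 50L * 1024 * 1024;        // 50 MB/s in bytes per second
    long image = 25L * 1024 * 1024 * 1024;    // assumed 25 GB fsimage
    System.out.println(throttle);                         // 52428800
    System.out.println(transferSeconds(image, throttle)); // 512
  }
}
```

Under these assumptions a 25 GB image takes roughly eight and a half minutes per transfer at the proposed throttle, which is the trade-off against saturating the network.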

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>
> The default value of dfs.image.transfer.bandwidthPerSec is 0, so fsimage 
> transfers during checkpoint can use the maximum available bandwidth. I 
> think we should throttle this. Many users have experienced NameNode failover 
> when transferring a large image (e.g. >25Gb) along with fsimage replication 
> on dfs.namenode.name.dir.
> Thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (Default is 1M; good to avoid frequent 
> checkpoints. However, by default a checkpoint runs once every 6 hours)
>  






[jira] [Commented] (HDFS-15252) HttpFS : setWorkingDirectory should not accept invalid paths

2020-03-31 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072263#comment-17072263
 ] 

Íñigo Goiri commented on HDFS-15252:


Thanks [~hemanthboyina] for the patch.
A couple comments:
* I don't see a need for {{makeAbsolute()}}, I would just do that directly. If 
you want that, I would make it static.
* In the test, we should use {{LambdaTestUtils#intercept()}}.
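
The second point can be sketched with a local stand-in. The `intercept` helper below is written here for illustration and only mimics the general shape of such a utility (run a callable, return the expected exception, fail otherwise); the real {{LambdaTestUtils#intercept()}} signature should be checked in the Hadoop test sources. `checkPath` is a hypothetical validation, not the HttpFS code under review.

```java
import java.util.concurrent.Callable;

/** Illustrative stand-in; not Hadoop's LambdaTestUtils. */
final class InterceptSketch {

  /** Runs eval and returns the expected exception; fails in all other cases. */
  static <E extends Throwable> E intercept(Class<E> expected, Callable<?> eval) {
    try {
      eval.call();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      throw new AssertionError("Unexpected exception: " + t, t);
    }
    throw new AssertionError(
        "Expected " + expected.getSimpleName() + " but the call succeeded");
  }

  /** Hypothetical path check standing in for the method under test. */
  static String checkPath(String path) {
    if (path == null || !path.startsWith("/")) {
      throw new IllegalArgumentException("Invalid path: " + path);
    }
    return path;
  }

  public static void main(String[] args) {
    IllegalArgumentException e = intercept(
        IllegalArgumentException.class, () -> checkPath("relative/dir"));
    System.out.println(e.getMessage()); // Invalid path: relative/dir
  }
}
```

Compared with a try/fail/catch block, this style keeps the expected exception type and the call under test in one expression.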

> HttpFS : setWorkingDirectory should not accept invalid paths
> 
>
> Key: HDFS-15252
> URL: https://issues.apache.org/jira/browse/HDFS-15252
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15252.001.patch
>
>







[jira] [Commented] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-03-31 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072244#comment-17072244
 ] 

Wei-Chiu Chuang commented on HDFS-15253:


Not sure about dfs.image.compress. I can count the number of CDH users who have 
this turned on on one hand, and I have little experience with it.

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>






[jira] [Updated] (HDFS-15201) SnapshotCounter hits MaxSnapshotID limit

2020-03-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15201:
---
Fix Version/s: 3.3.0

> SnapshotCounter hits MaxSnapshotID limit
> 
>
> Key: HDFS-15201
> URL: https://issues.apache.org/jira/browse/HDFS-15201
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
> Fix For: 3.3.0
>
>
> Users reported that they are unable to take HDFS snapshots and that their 
> snapshotCounter hits the MaxSnapshotID limit. The MaxSnapshotID limit is 
> 16777215.
> {code:java}
> // SnapshotManager.java
> private static final int SNAPSHOT_ID_BIT_WIDTH = 24;
> /**
>  * Returns the maximum allowable snapshot ID based on the bit width of the
>  * snapshot ID.
>  *
>  * @return maximum allowable snapshot ID.
>  */
> public int getMaxSnapshotID() {
>   return ((1 << SNAPSHOT_ID_BIT_WIDTH) - 1);
> }
> {code}
>  
> I think SNAPSHOT_ID_BIT_WIDTH is too low. Maybe it is a good idea to increase 
> SNAPSHOT_ID_BIT_WIDTH to 31, to align with our CURRENT_STATE_ID limit 
> (Integer.MAX_VALUE - 1)?
>  
> {code:java}
> /**
>  * This id is used to indicate the current state (vs. snapshots)
>  */
> public static final int CURRENT_STATE_ID = Integer.MAX_VALUE - 1;
> {code}
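
The numbers in the description can be checked with a few lines of arithmetic. The helper `maxId` below is a hypothetical name that repeats the `(1 << width) - 1` expression from the snippet. One observation from this sketch: with a width of 31 the int expression wraps around to Integer.MAX_VALUE, which is one larger than CURRENT_STATE_ID, so the two values would need care. The sketch does not claim which width the actual fix chose.

```java
/** Arithmetic check only; maxId is an illustrative helper name. */
final class SnapshotIdWidth {

  /** Same expression as getMaxSnapshotID(), parameterized by bit width. */
  static int maxId(int bitWidth) {
    return (1 << bitWidth) - 1;
  }

  public static void main(String[] args) {
    System.out.println(maxId(24));             // 16777215
    // 1 << 31 is Integer.MIN_VALUE; subtracting 1 wraps to Integer.MAX_VALUE.
    System.out.println(maxId(31));             // 2147483647
    System.out.println(Integer.MAX_VALUE - 1); // 2147483646 (CURRENT_STATE_ID)
  }
}
```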






[jira] [Commented] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-03-31 Thread Karthik Palanisamy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072134#comment-17072134
 ] 

Karthik Palanisamy commented on HDFS-15253:
---

What about dfs.image.compress? Shall we enable fsimage compression by default? 
Thoughts, please.

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>
> The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can 
> use maximum available bandwidth for fsimage transfers during checkpoint. I 
> think we should throttle this. Many users were experienced namenode failover 
> when transferring large image size along with fsimage replication on 
> dfs.namenode.name.dir. eg. >25Gb.  
> Thought to increase,
> dfs.image.transfer.bandwidthPerSec=52428800. (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (Default is 1M, good to avoid frequent 
> checkpoint. However, the default checkpoint runs every 6 hours once)
>  






[jira] [Updated] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-03-31 Thread Karthik Palanisamy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-15253:
--
Description: 
The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can use 
maximum available bandwidth for fsimage transfers during checkpoint. I think we 
should throttle this. Many users were experienced namenode failover when 
transferring large image size along with fsimage replication on 
dfs.namenode.name.dir. eg. >25Gb.  

Thought to set,

dfs.image.transfer.bandwidthPerSec=52428800. (50 MB/s)

dfs.namenode.checkpoint.txns=200 (Default is 1M, good to avoid frequent 
checkpoint. However, the default checkpoint runs every 6 hours once)

 

  was:
The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can use 
maximum available bandwidth for fsimage transfers during checkpoint. I think we 
should throttle this. Many users were experienced namenode failover when 
transferring large image size along with fsimage replication on 
dfs.namenode.name.dir. eg. >25Gb.  

Thought to increase,

dfs.image.transfer.bandwidthPerSec=52428800. (50 MB/s)

dfs.namenode.checkpoint.txns=200 (Default is 1M, good to avoid frequent 
checkpoint. However, the default checkpoint runs every 6 hours once)

 


> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>






[jira] [Created] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-03-31 Thread Karthik Palanisamy (Jira)
Karthik Palanisamy created HDFS-15253:
-

 Summary: Set default throttle value on 
dfs.image.transfer.bandwidthPerSec
 Key: HDFS-15253
 URL: https://issues.apache.org/jira/browse/HDFS-15253
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Karthik Palanisamy
Assignee: Karthik Palanisamy


The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can use 
maximum available bandwidth for fsimage transfers during checkpoint. I think we 
should throttle this. Many users were experienced namenode failover when 
transferring large image size along with fsimage replication on 
dfs.namenode.name.dir. eg. >25Gb.  

Thought to increase,

dfs.image.transfer.bandwidthPerSec=52428800. (50 MB/s)

dfs.namenode.checkpoint.txns=200 (Default is 1M, good to avoid frequent 
checkpoint. However, the default checkpoint runs every 6 hours once)

 






[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-31 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072102#comment-17072102
 ] 

Wei-Chiu Chuang commented on HDFS-15235:


I've seen a variant of this issue where the NNs kept bouncing back and forth. 
However, I don't think it was caused by network partitioning. Rather, the SbNN 
didn't respond because of a JVM GC pause (or something else that prevented the 
NN from sending the response back).

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually fail over 
> to NN2 by running the command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 millisecondsname space=3storage space=24690storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> 

[jira] [Commented] (HDFS-15233) Add -S option in "Count" command to show only Snapshot Counts

2020-03-31 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072096#comment-17072096
 ] 

Hadoop QA commented on HDFS-15233:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 50s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 5 new + 36 unchanged - 0 fixed = 41 total (was 36) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 54s{color} | {color:red} hadoop-common-project_hadoop-common generated 1 new + 101 unchanged - 0 fixed = 102 total (was 101) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 17s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}102m 28s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:4454c6d14b7 |
| JIRA Issue | HDFS-15233 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12998359/HDFS-15233.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux bed3c1f32d78 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c734d24 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/29065/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/29065/artifact/out/diff-javadoc-javadoc-hadoop-common-project_hadoop-common.txt

[jira] [Commented] (HDFS-15252) HttpFS : setWorkingDirectory should not accept invalid paths

2020-03-31 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072095#comment-17072095
 ] 

Hadoop QA commented on HDFS-15252:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m  1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 25s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:4454c6d14b7 |
| JIRA Issue | HDFS-15252 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12998366/HDFS-15252.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f649f1e6922c 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c734d24 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29066/testReport/ |
| Max. process+thread count | 621 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29066/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> HttpFS : setWorkingDirectory should not accept invalid paths
> 
>
> Key: HDFS-15252
>  

[jira] [Updated] (HDFS-15252) HttpFS : setWorkingDirectory should not accept invalid paths

2020-03-31 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15252:
-
Attachment: HDFS-15252.001.patch
Status: Patch Available  (was: Open)

> HttpFS : setWorkingDirectory should not accept invalid paths
> 
>
> Key: HDFS-15252
> URL: https://issues.apache.org/jira/browse/HDFS-15252
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15252.001.patch
>
>







[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-03-31 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072030#comment-17072030
 ] 

Wei-Chiu Chuang commented on HDFS-15240:


[~marvelrock] Could you double-check the patch again? There are various 
warnings in the precommit run. Thanks.

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Attachments: HDFS-15240.001.patch
>
>
> When reading some LZO files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) 
> directly from the DNs and chose 6 blocks (b0-b5) to decode the other 3 blocks 
> (b6', b7', b8'). I then computed the longest common sequence (LCS) between 
> b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After iterating through all combinations of 6 blocks from the block group, I 
> found one case where the LCS length is the block length minus 64 KB. 64 KB is 
> exactly the length of the ByteBuffer used by StripedBlockReader, so the 
> corrupt reconstructed block was produced by a dirty buffer.
> The following log snippet (showing only 2 of the 28 cases) is the output of 
> my check program. In my case, I knew that block 3 was corrupt, so I needed 5 
> other blocks to decode another 3 blocks, and then found that the LCS 
> substring of block 1 is the block length minus 64 KB.
> This means that blocks (0,1,2,4,5,6) were used to reconstruct block 3, and 
> the dirty buffer was used before block 1 was read.
> Note that StripedBlockReader read from offset 0 of block 1 after using the 
> dirty buffer.
> {code:java}
> decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8]
> Check Block(1) first 131072 bytes longest common substring length 4
> Check Block(6) first 131072 bytes longest common substring length 4
> Check Block(8) first 131072 bytes longest common substring length 4
> decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8]
> Check Block(1) first 131072 bytes longest common substring length 65536
> CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length 27197440  # this one
> Check Block(7) first 131072 bytes longest common substring length 4
> Check Block(8) first 131072 bytes longest common substring length 4{code}
> Now I know that the dirty buffer causes the block reconstruction error, but 
> how does the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834) {code}
> A read from a DN, held by a future (F), may time out and emit the INFO log 
> above, but by then the futures map that contained F has already been cleared, 
> {code:java}
> return new StripingChunkReadResult(futures.remove(future),
> StripingChunkReadResult.CANCELLED); {code}
> futures.remove(future) causes the NPE, so the EC reconstruction fails. In the 
> finally phase, the following code in *getStripedReader().close()* 
> {code:java}
> reconstructor.freeBuffer(reader.getReadBuffer());
> reader.freeReadBuffer();
> reader.closeBlockReader(); {code}
> frees the buffer first, but the StripedBlockReader still holds the buffer and 
> writes to it.
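
The ordering hazard described above can be illustrated with a small, self-contained sketch. It is not HDFS code: Pool, Reader, lateWrite and corrupted are hypothetical names, and the single-threaded "late write" only stands in for a delayed striped read that completes after its buffer has been returned. Under those assumptions, it shows why a buffer handed back to a pool while a reader still references it can end up dirty for the next user:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

final class DirtyBufferSketch {

    /** Toy buffer pool (hypothetical); stands in for whatever hands out read buffers. */
    static final class Pool {
        private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

        ByteBuffer get(int size) {
            ByteBuffer b = free.poll();
            return b != null ? b : ByteBuffer.allocate(size);
        }

        void put(ByteBuffer b) {
            free.push(b);
        }
    }

    /** Toy reader (hypothetical) that may still write into its buffer after its task failed. */
    static final class Reader {
        private ByteBuffer buf;

        Reader(ByteBuffer buf) {
            this.buf = buf;
        }

        /** Simulates a delayed read that completes late and fills the buffer. */
        void lateWrite(byte value) {
            if (buf == null) {
                return; // reference already dropped: nothing can be overwritten
            }
            for (int i = 0; i < buf.capacity(); i++) {
                buf.put(i, value);
            }
        }

        void close() {
            buf = null;
        }
    }

    /** Returns true if the next user of the pooled buffer sees bytes it did not write. */
    static boolean corrupted(boolean closeReaderBeforeFree) {
        final int size = 8;
        Pool pool = new Pool();
        ByteBuffer shared = pool.get(size);
        Reader staleReader = new Reader(shared);

        if (closeReaderBeforeFree) {
            staleReader.close();      // reader gives up the buffer first
        }
        pool.put(shared);             // buffer goes back to the pool

        ByteBuffer next = pool.get(size); // the next task receives the same buffer
        for (int i = 0; i < size; i++) {
            next.put(i, (byte) 1);    // the next task writes its own data
        }

        staleReader.lateWrite((byte) 0x7f); // the delayed read finishes now

        for (int i = 0; i < size; i++) {
            if (next.get(i) != 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("free first, close later: corrupted=" + corrupted(false));
        System.out.println("close first, free later: corrupted=" + corrupted(true));
    }
}
```

Here corrupted(false) models the free-then-close order quoted in the description, and corrupted(true) drops the reader's reference before the buffer is returned. How the attached patch actually resolves the problem may differ from this sketch.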





[jira] [Created] (HDFS-15252) HttpFS : setWorkingDirectory should not accept invalid paths

2020-03-31 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15252:


 Summary: HttpFS : setWorkingDirectory should not accept invalid 
paths
 Key: HDFS-15252
 URL: https://issues.apache.org/jira/browse/HDFS-15252
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: hemanthboyina
Assignee: hemanthboyina









[jira] [Updated] (HDFS-15233) Add -S option in "Count" command to show only Snapshot Counts

2020-03-31 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15233:
-
Attachment: HDFS-15233.001.patch
Status: Patch Available  (was: Open)

> Add -S option in "Count" command to show only Snapshot Counts
> -
>
> Key: HDFS-15233
> URL: https://issues.apache.org/jira/browse/HDFS-15233
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15233.001.patch
>
>







[jira] [Commented] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071912#comment-17071912
 ] 

Jianfei Jiang commented on HDFS-15251:
--

I found the new ZK state when we updated the ZK dependency of our own HBase to 
3.5: some commands throw or log the unknown state. Meanwhile, the ZK dependency 
of community HBase has not been updated yet. So I checked the HDFS code and 
found this issue. I am not sure whether the Closed state should act like 
Disconnected or Expired, so I added two patches. I am also puzzled about when 
ZKFC actively closes the connection. If the connection is closed only at the 
moment of graceful shutdown (I will build the 3.3.0-SNAPSHOT version to check 
whether it actually closes at shutdown), it may be unnecessary to rejoin an 
election, as the goal is just to shut down.

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new state named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper 
> watch event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}
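
As a rough illustration of the change proposed in the description, the sketch below adds an explicit Closed case to a switch over the watch event state. It is only a sketch under stated assumptions: State is a local stand-in that mirrors the constants quoted above (it is not the ZooKeeper KeeperState class), Action and onStateEvent are hypothetical names, and the actions chosen for the other states are assumptions. Treating Closed as "log only" is just one of the options still being discussed in this thread.

```java
final class WatchStateSketch {

    /** Local stand-in mirroring the constants quoted above; not the ZooKeeper class. */
    enum State {
        Unknown, Disconnected, NoSyncConnected, SyncConnected, AuthFailed,
        ConnectedReadOnly, SaslAuthenticated, Expired, Closed
    }

    /** Hypothetical reactions of an elector to a connection state event. */
    enum Action { LOG_ONLY, ENTER_NEUTRAL_MODE, REJOIN_ELECTION, FATAL_ERROR }

    static Action onStateEvent(State state) {
        switch (state) {
            case SyncConnected:
            case SaslAuthenticated:
                return Action.LOG_ONLY;
            case Disconnected:
                // Assumption: the server side dropped the connection.
                return Action.ENTER_NEUTRAL_MODE;
            case Expired:
                // Assumption: the session is gone, so the election must be rejoined.
                return Action.REJOIN_ELECTION;
            case Closed:
                // New in ZooKeeper 3.5.x: the client closed the connection itself.
                // Assumption for this sketch: log it and take no further action.
                return Action.LOG_ONLY;
            default:
                // Anything else is still treated as an unexpected watch event state.
                return Action.FATAL_ERROR;
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + onStateEvent(s));
        }
    }
}
```

Without the Closed case, the event would fall through to the default branch and be reported as unexpected, which is the behaviour the issue wants to avoid.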






[jira] [Comment Edited] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071778#comment-17071778
 ] 

Mate Szalay-Beko edited comment on HDFS-15251 at 3/31/20, 1:24 PM:
---

[~jiangjianfei], [~weichiu]
I guess this logic is needed for triggering a new election to decide who 
should be the active / stand-by name node. I am not very familiar with the HDFS 
code, so I cannot review that part, but I can give you some background info 
about the CLOSED state.

It was introduced by ZOOKEEPER-2368. The ZooKeeper watcher gets notified when 
the connection is broken by the ZooKeeper server, and the connection state is 
DISCONNECTED in this case. The new behaviour in 3.5.5+ is that a watcher 
event also gets triggered when the ZooKeeper client is the one closing the 
connection, in which case the connection state will be CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and the state 
will be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is if HDFS is closing the ZooKeeper connection. When do we 
call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS can do 
this e.g. during some graceful shutdown in the failover controller process. Are 
we sure we want to go to neutral mode and rejoin the election during shutdown? 
I really don't know the background, so I will let you decide.


was (Author: symat):
[~jiangjianfei], [~weichiu]
I guess this logic is needed for triggering a new election to decide who should 
be the active / stand-by name node. I am not very familiar with the HDFS code, 
so I cannot review that part, but I can give you some background info about the 
CLOSED state.

It was introduced by 
[ZOOKEEPER-2368|https://issues.apache.org/jira/browse/ZOOKEEPER-2368]. The 
ZooKeeper watcher gets notified when the connection is broken by the ZooKeeper 
server, and the connection state is DISCONNECTED in this case. The new 
behaviour in 3.5.5+ is that a watcher event also gets triggered when the 
ZooKeeper client is the one closing the connection, in which case the 
connection state will be CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and the state 
will be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is when HDFS is closing the ZooKeeper connection. When do 
we call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS 
might do this during some graceful shutdown in the failover controller process. 
Are we sure we want to go to neutral mode and rejoin the election during 
shutdown? I really don't know the background, so I will let you decide.

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new state named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper 
> watch event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Commented] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071778#comment-17071778
 ] 

Mate Szalay-Beko commented on HDFS-15251:
-

[~jiangjianfei], [~weichiu]
I guess this logic is needed for triggering a new election to decide who should 
be the active / stand-by name node. I am not very familiar with the HDFS code, 
so I cannot review that part, but I can give you some background info about the 
CLOSED state.

It was introduced by 
[ZOOKEEPER-2368|https://issues.apache.org/jira/browse/ZOOKEEPER-2368]. The 
ZooKeeper watcher gets notified when the connection is broken by the ZooKeeper 
server, and the connection state is DISCONNECTED in this case. The new 
behaviour in 3.5.5+ is that a watcher event also gets triggered when the 
ZooKeeper client is the one closing the connection, in which case the 
connection state will be CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and the state 
will be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is when HDFS is closing the ZooKeeper connection. When do 
we call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS 
might do this during some graceful shutdown in the failover controller process. 
Are we sure we want to go to neutral mode and rejoin the election during 
shutdown? I really don't know the background, so I will let you decide.

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new state named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper 
> watch event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-31 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071727#comment-17071727
 ] 

YCozy commented on HDFS-15235:
--

Hello [~weichiu], would you please help review the patch? Thanks!

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually fail over to 
> NN2 by running the command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 millisecondsname space=3storage space=24690storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of a temporary network 
> partition.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Commented] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071680#comment-17071680
 ] 

Hadoop QA commented on HDFS-15251:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  1s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m  8s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 56s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:4454c6d14b7 |
| JIRA Issue | HDFS-15251 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12998311/HDFS-15251.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 98700e324774 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 80b877a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29064/testReport/ |
| Max. process+thread count | 1486 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29064/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
>  

[jira] [Commented] (HDFS-15248) Make the maximum number of ACLs entries configurable

2020-03-31 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071650#comment-17071650
 ] 

Stephen O'Donnell commented on HDFS-15248:
--

This has come up before in HDFS-7447, and we opted not to make this configurable.

While your cluster may use this feature sparingly, my fear is that if we open 
it up to be configurable, some people will abuse the feature and create problems 
for themselves and for those of us who support the clusters.

Reading the comment in HDFS-7447, it seems that ext-based file systems 
have a 32 ACL limit too, so in that respect we are consistent with them.

The usual answer to this sort of problem is either:

1) Use a Sentry or Ranger type plugin

2) Rather than adding lots of users to an ACL, create a group for access to the 
directory and assign that group to the users at the unix group / Active 
Directory level, which is probably more flexible and less error prone.

> Make the maximum number of ACLs entries configurable
> 
>
> Key: HDFS-15248
> URL: https://issues.apache.org/jira/browse/HDFS-15248
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15248.001.patch, HDFS-15248.002.patch, 
> HDFS-15248.patch
>
>
> For big clusters, the hardcoded maximum of 32 ACL entries is not enough; make 
> it configurable.






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Status: In Progress  (was: Patch Available)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}
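
As an illustration of the change described above, here is a minimal sketch of a watch-event switch with the extra case. It does not use the real ZooKeeper classes: the nested `State` enum is a hypothetical stand-in for `KeeperState`, limited to the non-deprecated values listed in the description, and the class name, method name, and returned strings are assumptions made for this example, not taken from the patch.

```java
// Sketch only: State is a hypothetical stand-in for ZooKeeper's KeeperState,
// reduced to the non-deprecated values quoted in the issue description.
final class KeeperStateSketch {

  enum State {
    Disconnected, SyncConnected, AuthFailed, ConnectedReadOnly,
    SaslAuthenticated, Expired, Closed
  }

  // Maps a watch event state to a short label (labels are illustrative).
  static String handle(State state) {
    switch (state) {
      case SyncConnected:
      case ConnectedReadOnly:
      case SaslAuthenticated:
        return "connected";
      case Disconnected:
        return "disconnected";
      case Expired:
        return "expired";
      case AuthFailed:
        return "auth-failed";
      case Closed:
        // Value added in ZooKeeper 3.5.x; handled explicitly so it does not
        // fall through to the "unexpected state" branch below.
        return "closed";
      default:
        throw new IllegalStateException(
            "Unexpected ZooKeeper watch event state: " + state);
    }
  }

  public static void main(String[] args) {
    // Every listed state, including Closed, is handled without an exception.
    for (State s : State.values()) {
      System.out.println(s + " -> " + handle(s));
    }
  }
}
```

Without the `Closed` case, a switch written against the 3.4.x values would send the new state to the default branch and report it as unexpected.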






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Status: Patch Available  (was: In Progress)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Attachment: (was: HDFS-15251_002.patch)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Attachment: HDFS-15251.002.patch

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Status: Patch Available  (was: In Progress)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251_002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Status: In Progress  (was: Patch Available)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251_002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Attachment: HDFS-15251_002.patch

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251_002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Issue Comment Deleted] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-15251:
-
Comment: was deleted

(was: Oops, this may be an issue that belongs under the HADOOP prefix rather than HDFS.)

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Commented] (HDFS-15247) RBF: Provide Non DFS Used per DataNode in DataNode UI

2020-03-31 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071500#comment-17071500
 ] 

Hadoop QA commented on HDFS-15247:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 33m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m  8s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:4454c6d14b7 |
| JIRA Issue | HDFS-15247 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12998287/HDFS-15247.001.patch |
| Optional Tests | dupname asflicense shadedclient |
| uname | Linux e054f950746c 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 80b877a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 316 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29063/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Provide Non DFS Used per DataNode in DataNode UI
> -
>
> Key: HDFS-15247
> URL: https://issues.apache.org/jira/browse/HDFS-15247
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15247.001.patch
>
>







[jira] [Commented] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Jianfei Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071486#comment-17071486
 ] 

Jianfei Jiang commented on HDFS-15251:
--

Oops, this may be an issue that belongs under the HADOOP prefix rather than HDFS.

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-15251.001.patch
>
>
> In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper watch 
> event state.
> {code:java}
> /** @deprecated */
>  @Deprecated
>  Unknown(-1),
>  Disconnected(0),
>  /** @deprecated */
>  @Deprecated
>  NoSyncConnected(1),
>  SyncConnected(3),
>  AuthFailed(4),
>  ConnectedReadOnly(5),
>  SaslAuthenticated(6),
>  Expired(-112),
>  Closed(7);{code}






[jira] [Updated] (HDFS-15247) RBF: Provide Non DFS Used per DataNode in DataNode UI

2020-03-31 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15247:
---
Attachment: HDFS-15247.001.patch
Status: Patch Available  (was: Open)

> RBF: Provide Non DFS Used per DataNode in DataNode UI
> -
>
> Key: HDFS-15247
> URL: https://issues.apache.org/jira/browse/HDFS-15247
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15247.001.patch
>
>



