[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-14 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-12130:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1, 2.9.0
       Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Chen!

> Optimizing permission check for getContentSummary
> -
>
> Key: HDFS-12130
> URL: https://issues.apache.org/jira/browse/HDFS-12130
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Chen Liang
> Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch, 
> HDFS-12130.003.patch
>
>
> Currently, {{getContentSummary}} completes in two phases:
> - Phase 1: check the permissions of the entire subtree. If any subdirectory 
> does not have {{READ_EXECUTE}}, an access control exception is thrown and 
> {{getContentSummary}} terminates (unless the caller is the super user).
> - Phase 2: if phase 1 passes, traverse the entire tree recursively to 
> compute the actual content summary.
> The issue is that both phases hold the fs lock.
> Phase 2 is already written to yield the fs lock periodically, so that it 
> does not block other operations for too long. Phase 1, however, does not 
> yield, so the permission check can still block other operations for a long 
> time.
> One fix is to add lock yielding to phase 1. A simpler fix is to merge phase 
> 1 into phase 2: instead of doing a full traversal for the permission check 
> first, we start with phase 2 directly and, for each directory, check its 
> permission before obtaining its summary. This way we take advantage of the 
> existing lock yielding in the phase 2 code and can still check permissions 
> and terminate on an access control exception.
> Thanks [~szetszwo] for the offline discussions!
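
For readers of the archive, the merged traversal described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class MergedSummarySketch, the Dir type, AccessDeniedException, the yieldEvery threshold, and the sample tree are all hypothetical stand-ins, and a plain ReentrantReadWriteLock stands in for the fs lock. The summary is reduced to a file count to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: permission check merged into the summary traversal.
class MergedSummarySketch {

  /** Stand-in for a directory in the namespace; not an HDFS class. */
  static final class Dir {
    final String name;
    final boolean readExecute; // does the caller have READ_EXECUTE here?
    final long fileCount;      // files directly inside this directory
    final List<Dir> children = new ArrayList<>();

    Dir(String name, boolean readExecute, long fileCount) {
      this.name = name;
      this.readExecute = readExecute;
      this.fileCount = fileCount;
    }

    Dir add(Dir child) {
      children.add(child);
      return this;
    }
  }

  /** Stand-in for an access control exception; the message is the directory name. */
  static final class AccessDeniedException extends RuntimeException {
    AccessDeniedException(String dirName) {
      super(dirName);
    }
  }

  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final int yieldEvery;     // assumed threshold: yield after this many directories
  private int visitedSinceYield = 0;
  int yields = 0;                   // how many times the lock was yielded

  MergedSummarySketch(int yieldEvery) {
    this.yieldEvery = yieldEvery;
  }

  /** Single pass: each directory is checked right before it is summarized. */
  long countFiles(Dir root, boolean superUser) {
    fsLock.readLock().lock();
    try {
      return visit(root, superUser);
    } finally {
      fsLock.readLock().unlock();
    }
  }

  private long visit(Dir dir, boolean superUser) {
    // Former phase 1, now done per directory instead of in a separate full pass.
    if (!superUser && !dir.readExecute) {
      throw new AccessDeniedException(dir.name);
    }
    maybeYield();
    long total = dir.fileCount;
    for (Dir child : dir.children) {
      total += visit(child, superUser);
    }
    return total;
  }

  /** Briefly release the lock so waiting operations can run, then re-acquire it. */
  private void maybeYield() {
    if (++visitedSinceYield >= yieldEvery) {
      visitedSinceYield = 0;
      yields++;
      fsLock.readLock().unlock();
      // A real implementation must cope with the tree changing at this point;
      // this single-threaded sketch ignores that.
      fsLock.readLock().lock();
    }
  }

  /** Small example tree: root -> {a, b -> {c}}, with b lacking READ_EXECUTE. */
  static Dir sampleTree() {
    Dir b = new Dir("b", false, 4).add(new Dir("c", true, 8));
    return new Dir("root", true, 1).add(new Dir("a", true, 2)).add(b);
  }

  public static void main(String[] args) {
    Dir root = sampleTree();
    System.out.println("super user total: " + new MergedSummarySketch(2).countFiles(root, true));
    try {
      new MergedSummarySketch(2).countFiles(root, false);
    } catch (AccessDeniedException e) {
      System.out.println("denied at: " + e.getMessage());
    }
  }
}
```

One consequence of this design, visible in the sketch, is that a caller without permission on a deep subdirectory is rejected only when the traversal reaches that directory, rather than before any summary work starts.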



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-12130:
---
Hadoop Flags: Reviewed

+1. The 003 patch looks good.



[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-13 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12130:
--
Attachment: HDFS-12130.003.patch

Posted the v003 patch. Thanks [~szetszwo] for the review and the offline discussions!



[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-13 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12130:
--
Attachment: HDFS-12130.002.patch

Posted the v002 patch to fix the checkstyle and ASF license warnings. The 
failed tests are unrelated except for {{TestDFSShell}}, which is also fixed in 
the v002 patch.



[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12130:
--
Attachment: HDFS-12130.001.patch

Posted the initial patch.



[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12130:
--
Status: Patch Available  (was: Open)
