[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883032#comment-15883032
 ] 

Hudson commented on HADOOP-13817:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11301 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11301/])
HADOOP-13817. Add a finite shell command timeout to (harsh: rev 
e8694deb6ad180449f8ce6c1c8b4f84873c0587a)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
* (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883003#comment-15883003
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on the issue:

https://github.com/apache/hadoop/pull/161
  
Done via e8694deb6ad180449f8ce6c1c8b4f84873c0587a.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883004#comment-15883004
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac closed the pull request at:

https://github.com/apache/hadoop/pull/161


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882938#comment-15882938
 ] 

Wei-Chiu Chuang commented on HADOOP-13817:
--

Test failures are not reproducible in my local tree.

+1 based on my review and [~xyao]'s review on GitHub.

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882822#comment-15882822
 ] 

Hadoop QA commented on HADOOP-13817:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 11s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13817 |
| GITHUB PR | https://github.com/apache/hadoop/pull/161 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 93f53f4231a1 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e60c654 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue 

[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882705#comment-15882705
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r102943834
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +177,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
-return new LinkedList<>();
+LOG.warn("unable to return groups for user {}", user, pge);
+return EMPTY_GROUPS;
+  }
+} catch (IOException ioe) {
+  // If its a shell executor timeout, indicate so in the message
+  // but treat the result as empty instead of throwing it up,
+  // similar to how partial resolution failures are handled above
+  if (executor.isTimedOut()) {
+LOG.warn(
+"Unable to return groups for user '{}' as shell group lookup " 
+
+"command '{}' ran longer than the configured timeout limit of 
" +
+"{} seconds.",
+user,
+Arrays.asList(executor.getExecString()),
--- End diff --

Thank you again @jojochuang. I've added a new change here addressing your 
comments. I've also uploaded the full patch directly to JIRA to trigger a build 
test.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880143#comment-15880143
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user jojochuang commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r102665699
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +177,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
-return new LinkedList<>();
+LOG.warn("unable to return groups for user {}", user, pge);
+return EMPTY_GROUPS;
+  }
+} catch (IOException ioe) {
+  // If its a shell executor timeout, indicate so in the message
+  // but treat the result as empty instead of throwing it up,
+  // similar to how partial resolution failures are handled above
+  if (executor.isTimedOut()) {
+LOG.warn(
+"Unable to return groups for user '{}' as shell group lookup " 
+
+"command '{}' ran longer than the configured timeout limit of 
" +
+"{} seconds.",
+user,
+Arrays.asList(executor.getExecString()),
--- End diff --

I am +1 pending this and Jenkins precommit build. Somehow Jenkins is never 
run for your patches.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880135#comment-15880135
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user jojochuang commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r102665241
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +177,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
-return new LinkedList<>();
+LOG.warn("unable to return groups for user {}", user, pge);
+return EMPTY_GROUPS;
+  }
+} catch (IOException ioe) {
+  // If its a shell executor timeout, indicate so in the message
+  // but treat the result as empty instead of throwing it up,
+  // similar to how partial resolution failures are handled above
+  if (executor.isTimedOut()) {
+LOG.warn(
+"Unable to return groups for user '{}' as shell group lookup " 
+
+"command '{}' ran longer than the configured timeout limit of 
" +
+"{} seconds.",
+user,
+Arrays.asList(executor.getExecString()),
--- End diff --

Reviewed the patch again, and I think it's almost ready.
Here's one nit regarding the warning message here: it would print something 
like "Unable to return groups for user 'foobarnonexistinguser' as shell group 
lookup command '[sleep, 2]'". But this can be confusing as the command is not 
run with brackets and comas.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871062#comment-15871062
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r101673820
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -225,8 +287,16 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 throw new PartialGroupNameException("failed to get group id list 
for " +
 "user '" + userName + "'", ece);
   } catch (IOException ioe) {
-throw new PartialGroupNameException("can't execute the shell 
command to"
-+ " get the list of group id for user '" + userName + "'", ioe);
+String message =
+"Can't execute the shell command to " +
+"get the list of group id for user '" + userName + "'";
+if (exec2.isTimedOut()) {
+  message +=
+  " because of the command taking longer than " +
+  "the configured timeout: " + timeout + " seconds";
+  throw new PartialGroupNameException(message);
--- End diff --

Thanks, addressed in new commit. I just felt the timeout exception may look 
weird, but I've dropped the line so we can be consistent in exposing the 
exception at all times.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871061#comment-15871061
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r101673744
  
--- Diff: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java
 ---
@@ -22,19 +22,32 @@
 
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.CommonConfigurationKeys;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.apache.hadoop.util.Shell;
 import org.apache.hadoop.util.Shell.ExitCodeException;
 import org.apache.hadoop.util.Shell.ShellCommandExecutor;
 import org.junit.Test;
+
 import static org.junit.Assert.*;
 import static org.mockito.Mockito.doNothing;
 import static org.mockito.Mockito.doThrow;
 import static org.mockito.Mockito.mock;
 import static org.mockito.Mockito.when;
 
 public class TestShellBasedUnixGroupsMapping {
-  private static final Log LOG =
+  private static final Log TESTLOG =
   LogFactory.getLog(TestShellBasedUnixGroupsMapping.class);
 
+  private final GenericTestUtils.LogCapturer shellMappingLog =
+  GenericTestUtils.LogCapturer.captureLogs(
+  ShellBasedUnixGroupsMapping.LOG);
+
+  private static final boolean WINDOWS =
+  (Shell.osType == Shell.OSType.OS_TYPE_WIN);
--- End diff --

Thank you, addressed in new commit.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870471#comment-15870471
 ] 

Wei-Chiu Chuang commented on HADOOP-13817:
--

Okay I did a quick review and there were only two nits after Xiaoyu's numerous 
reviews.
By the way, Yetus does not support precommit check if the patch is a github 
pull request, right? I am not sure how to make that happen... Maybe attach a 
patch into this jira would trigger that.

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870468#comment-15870468
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user jojochuang commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r101592507
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -225,8 +287,16 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 throw new PartialGroupNameException("failed to get group id list 
for " +
 "user '" + userName + "'", ece);
   } catch (IOException ioe) {
-throw new PartialGroupNameException("can't execute the shell 
command to"
-+ " get the list of group id for user '" + userName + "'", ioe);
+String message =
+"Can't execute the shell command to " +
+"get the list of group id for user '" + userName + "'";
+if (exec2.isTimedOut()) {
+  message +=
+  " because of the command taking longer than " +
+  "the configured timeout: " + timeout + " seconds";
+  throw new PartialGroupNameException(message);
--- End diff --

Maybe this line is not needed if it will throw the exception anyway? The 
only difference is that it will not carry the original exception


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870464#comment-15870464
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user jojochuang commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r101592163
  
--- Diff: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java
 ---
@@ -22,19 +22,32 @@
 
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.CommonConfigurationKeys;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.apache.hadoop.util.Shell;
 import org.apache.hadoop.util.Shell.ExitCodeException;
 import org.apache.hadoop.util.Shell.ShellCommandExecutor;
 import org.junit.Test;
+
 import static org.junit.Assert.*;
 import static org.mockito.Mockito.doNothing;
 import static org.mockito.Mockito.doThrow;
 import static org.mockito.Mockito.mock;
 import static org.mockito.Mockito.when;
 
 public class TestShellBasedUnixGroupsMapping {
-  private static final Log LOG =
+  private static final Log TESTLOG =
   LogFactory.getLog(TestShellBasedUnixGroupsMapping.class);
 
+  private final GenericTestUtils.LogCapturer shellMappingLog =
+  GenericTestUtils.LogCapturer.captureLogs(
+  ShellBasedUnixGroupsMapping.LOG);
+
+  private static final boolean WINDOWS =
+  (Shell.osType == Shell.OSType.OS_TYPE_WIN);
--- End diff --

I think you can just use Shell.WINDOWS instead.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870233#comment-15870233
 ] 

Wei-Chiu Chuang commented on HADOOP-13817:
--

Hey [~qwertymaniac] this is a big patch, but sure I'll review it today.

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870058#comment-15870058
 ] 

Harsh J commented on HADOOP-13817:
--

[~jojochuang] - Could you help in reviewing this? Current diff is at 
https://github.com/apache/hadoop/pull/161

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694982#comment-15694982
 ] 

Harsh J commented on HADOOP-13817:
--

[~xyao] - Thanks for the reviews so far. Could you take a look at the current 
patch-set?

[~yzhangal] - Could you help take a look at this one?

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672263#comment-15672263
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88366053
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -51,14 +52,16 @@
   LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
 
   private long timeout = 0L;
+  private final List emptyGroupsList = new LinkedList<>();
--- End diff --

Done in the latest commit.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672262#comment-15672262
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88366026
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
 ---
@@ -517,15 +517,15 @@
* 
* core-default.xml
*/
-  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT =
-  "hadoop.security.groups.shell.groups.command.timeout";
+  public static final String 
HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT_SECS =
+  "hadoop.security.groups.shell.command.timeout.secs";
--- End diff --

I'd just felt it would be clearer to have '.secs' in it indicating the 
minimum unit type despite the parser. I've removed that in the latest commit 
and added some examples into its doc-text in the xml.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667914#comment-15667914
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user xiaoyuyao commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88084092
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -51,14 +52,16 @@
   LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
 
   private long timeout = 0L;
+  private final List emptyGroupsList = new LinkedList<>();
--- End diff --

This can be static.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667901#comment-15667901
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user xiaoyuyao commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88082752
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
 ---
@@ -517,15 +517,15 @@
* 
* core-default.xml
*/
-  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT =
-  "hadoop.security.groups.shell.groups.command.timeout";
+  public static final String 
HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT_SECS =
+  "hadoop.security.groups.shell.command.timeout.secs";
--- End diff --

I think we should keep 
"hadoop.security.groups.shell.groups.command.timeout". Sorry I was not clear 
about this when suggesting Configuration#getTimeDuration. It allows admin to 
specify the time values with suffix like 10s, 1m, 2h. So the ."secs" won't be 
needed here.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667072#comment-15667072
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88010241
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +171,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
+LOG.warn("unable to return groups for user {}", user, pge);
+return new LinkedList<>();
--- End diff --

Done in added commit.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667071#comment-15667071
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user QwertyManiac commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r88010210
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -37,11 +43,23 @@
  */
 @InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
 @InterfaceStability.Evolving
-public class ShellBasedUnixGroupsMapping
+public class ShellBasedUnixGroupsMapping extends Configured
   implements GroupMappingServiceProvider {
-  
-  private static final Log LOG =
-LogFactory.getLog(ShellBasedUnixGroupsMapping.class);
+
+  @VisibleForTesting
+  protected static final Logger LOG =
+  LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
+
+  private long timeout = 0L;
+
+  @Override
+  public void setConf(Configuration conf) {
+super.setConf(conf);
+timeout = conf.getLong(
--- End diff --

Done in added commit.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664544#comment-15664544
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user xiaoyuyao commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r87855020
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +171,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
+LOG.warn("unable to return groups for user {}", user, pge);
+return new LinkedList<>();
+  }
+} catch (IOException ioe) {
+  // If its a shell executor timeout, indicate so in the message
+  // but treat the result as empty instead of throwing it up,
+  // similar to how partial resolution failures are handled above
+  if (executor.isTimedOut()) {
+LOG.warn(
+"Unable to return groups for user '{}' as shell group lookup " 
+
+"command '{}' ran longer than the configured timeout limit of 
" +
+"{} seconds.",
+user,
+Arrays.asList(executor.getExecString()),
+timeout
+);
 return new LinkedList<>();
--- End diff --

flyweight pattern applies here as well.


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664541#comment-15664541
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

Github user xiaoyuyao commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/161#discussion_r87854835
  
--- Diff: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java
 ---
@@ -133,8 +171,26 @@ protected ShellCommandExecutor 
createGroupIDExecutor(String userName) {
 groups = resolvePartialGroupNames(user, e.getMessage(),
 executor.getOutput());
   } catch (PartialGroupNameException pge) {
-LOG.warn("unable to return groups for user " + user, pge);
+LOG.warn("unable to return groups for user {}", user, pge);
+return new LinkedList<>();
--- End diff --

Can we use flyweight pattern to minimize memory usage for the empty 
LinkList?


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663747#comment-15663747
 ] 

ASF GitHub Bot commented on HADOOP-13817:
-

GitHub user QwertyManiac opened a pull request:

https://github.com/apache/hadoop/pull/161

HADOOP-13817. Add a finite shell command timeout to 
ShellBasedUnixGroupsMapping.

- Tests required log capture so the LOG variable was elevated in 
visibility, which required changes in a few test methods unrelated to just this 
change.
- Timeout is applied to both, the regular groups and the partial group 
listing check commands.
- Default is left to 0 to not break current behaviour (i.e. to wait forever 
until groups are resolvable).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QwertyManiac/hadoop HADOOP-13817

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #161


commit 6707438a05d46ac57b230d4e9ade59a1a935c9c3
Author: Harsh J 
Date:   2016-11-14T10:29:58Z

HADOOP-13817. Add a finite shell command timeout to 
ShellBasedUnixGroupsMapping.




> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping run various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (its set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end where its forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org