[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883032#comment-15883032 ]

Hudson commented on HADOOP-13817:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11301 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11301/])
HADOOP-13817. Add a finite shell command timeout to (harsh: rev e8694deb6ad180449f8ce6c1c8b4f84873c0587a)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
* (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13817
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13817
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 2.6.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Minor
>             Fix For: 2.9.0, 3.0.0-beta1
>
>         Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the
> ShellCommandExecutor modules without a timeout set (it's set to 0, which
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive
> groups backend or other reasons, it also blocks the handlers that use it on
> the NameNode (or other services that use this class). That inadvertently
> causes odd timeout troubles on the client end where it's forced to retry (only
> to likely run into such hangs again with every attempt until at least one
> command returns).
> It would be helpful to have a finite command timeout after which we may give
> up on the command and return the result equivalent of no groups found.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
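The behaviour requested in the issue description (give up on a hung lookup command after a finite timeout and treat the result as "no groups found") can be sketched roughly as follows. This is only an illustration built on the JDK's ProcessBuilder and Process.waitFor(long, TimeUnit); the class name TimedGroupLookup and the method runWithTimeout are hypothetical and are not the names or the mechanism used by Hadoop's Shell or ShellBasedUnixGroupsMapping.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not Hadoop's Shell or
// ShellBasedUnixGroupsMapping code.
final class TimedGroupLookup {

  // Runs the command and returns its whitespace-separated output tokens.
  // Returns an empty list if the command does not exit within
  // timeoutSeconds or exits with a non-zero code, mirroring the
  // "no groups found" result proposed in the issue.
  // Output is read only after the process exits, so this sketch suits
  // commands with small output such as `id`.
  static List<String> runWithTimeout(List<String> command, long timeoutSeconds) {
    Process process = null;
    try {
      process = new ProcessBuilder(command).start();
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        return Collections.emptyList(); // timed out: give up on the command
      }
      if (process.exitValue() != 0) {
        return Collections.emptyList(); // command failed: treat as no groups
      }
      List<String> tokens = new ArrayList<>();
      try (BufferedReader reader = new BufferedReader(
          new InputStreamReader(process.getInputStream()))) {
        String line;
        while ((line = reader.readLine()) != null) {
          for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
              tokens.add(token);
            }
          }
        }
      }
      return tokens;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return Collections.emptyList();
    } finally {
      if (process != null) {
        process.destroyForcibly(); // harmless if the process already exited
      }
    }
  }

  public static void main(String[] args) {
    // A command that outlives the 1-second limit yields an empty list.
    System.out.println(runWithTimeout(Arrays.asList("sleep", "5"), 1));
    // A quick lookup returns the group names of the current user.
    System.out.println(runWithTimeout(Arrays.asList("id", "-Gn"), 5));
  }
}
```

The point of the sketch is that the calling thread is blocked for at most the timeout, so a stuck groups backend no longer ties up a handler indefinitely. It assumes a Unix-like host where `sleep` and `id` are on the PATH.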
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883003#comment-15883003 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on the issue:

    https://github.com/apache/hadoop/pull/161

    Done via e8694deb6ad180449f8ce6c1c8b4f84873c0587a.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883004#comment-15883004 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac closed the pull request at:

    https://github.com/apache/hadoop/pull/161
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882938#comment-15882938 ]

Wei-Chiu Chuang commented on HADOOP-13817:
------------------------------------------

Test failures are not reproducible in my local tree.
+1 based on my review and [~xyao]'s review on GitHub.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882822#comment-15882822 ]

Hadoop QA commented on HADOOP-13817:
------------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 12m 37s | trunk passed |
| +1 | compile | 12m 7s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 1m 2s | trunk passed |
| +1 | mvneclipse | 0m 18s | trunk passed |
| +1 | findbugs | 1m 25s | trunk passed |
| +1 | javadoc | 0m 48s | trunk passed |
| +1 | mvninstall | 0m 37s | the patch passed |
| +1 | compile | 11m 13s | the patch passed |
| +1 | javac | 11m 13s | the patch passed |
| +1 | checkstyle | 0m 34s | the patch passed |
| +1 | mvnsite | 0m 58s | the patch passed |
| +1 | mvneclipse | 0m 18s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 31s | the patch passed |
| +1 | javadoc | 0m 47s | the patch passed |
| -1 | unit | 8m 11s | hadoop-common in the patch failed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 55m 48s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13817 |
| GITHUB PR | https://github.com/apache/hadoop/pull/161 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 93f53f4231a1 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e60c654 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/11709/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882705#comment-15882705 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r102943834

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +177,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    -          return new LinkedList<>();
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return EMPTY_GROUPS;
    +        }
    +    } catch (IOException ioe) {
    +      // If its a shell executor timeout, indicate so in the message
    +      // but treat the result as empty instead of throwing it up,
    +      // similar to how partial resolution failures are handled above
    +      if (executor.isTimedOut()) {
    +        LOG.warn(
    +            "Unable to return groups for user '{}' as shell group lookup " +
    +            "command '{}' ran longer than the configured timeout limit of " +
    +            "{} seconds.",
    +            user,
    +            Arrays.asList(executor.getExecString()),
    --- End diff --

    Thank you again @jojochuang. I've added a new change here addressing your comments. I've also uploaded the full patch directly to JIRA to trigger a build test.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880143#comment-15880143 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user jojochuang commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r102665699

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +177,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    -          return new LinkedList<>();
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return EMPTY_GROUPS;
    +        }
    +    } catch (IOException ioe) {
    +      // If its a shell executor timeout, indicate so in the message
    +      // but treat the result as empty instead of throwing it up,
    +      // similar to how partial resolution failures are handled above
    +      if (executor.isTimedOut()) {
    +        LOG.warn(
    +            "Unable to return groups for user '{}' as shell group lookup " +
    +            "command '{}' ran longer than the configured timeout limit of " +
    +            "{} seconds.",
    +            user,
    +            Arrays.asList(executor.getExecString()),
    --- End diff --

    I am +1 pending this and the Jenkins precommit build. Somehow Jenkins has never run for your patches.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880135#comment-15880135 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user jojochuang commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r102665241

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +177,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    -          return new LinkedList<>();
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return EMPTY_GROUPS;
    +        }
    +    } catch (IOException ioe) {
    +      // If its a shell executor timeout, indicate so in the message
    +      // but treat the result as empty instead of throwing it up,
    +      // similar to how partial resolution failures are handled above
    +      if (executor.isTimedOut()) {
    +        LOG.warn(
    +            "Unable to return groups for user '{}' as shell group lookup " +
    +            "command '{}' ran longer than the configured timeout limit of " +
    +            "{} seconds.",
    +            user,
    +            Arrays.asList(executor.getExecString()),
    --- End diff --

    Reviewed the patch again, and I think it's almost ready.
    One nit regarding the warning message here: it would print something like
    "Unable to return groups for user 'foobarnonexistinguser' as shell group lookup command '[sleep, 2]'".
    This can be confusing, as the command is not run with brackets and commas.
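The nit in the review comment above comes down to how a List renders via toString(). A small sketch of the difference, using only JDK calls; the class and method names are hypothetical, and joining with a space is just one possible way to address the comment, not necessarily what the committed patch does:

```java
import java.util.Arrays;

// Hypothetical illustration of the log-formatting nit; not Hadoop code.
final class CommandLogFormat {

  // What the reviewed diff passes to the logger: the List's toString(),
  // which adds brackets and commas.
  static String asListString(String[] execString) {
    return Arrays.asList(execString).toString();
  }

  // One alternative: join the arguments with spaces so the logged text
  // resembles the command as it would be typed.
  static String asJoined(String[] execString) {
    return String.join(" ", execString);
  }

  public static void main(String[] args) {
    String[] command = {"sleep", "2"};
    System.out.println("command '" + asListString(command) + "'"); // command '[sleep, 2]'
    System.out.println("command '" + asJoined(command) + "'");     // command 'sleep 2'
  }
}
```

Note that a plain space join does not quote arguments that themselves contain spaces, so the logged string is for reading, not for copy-pasting into a shell.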
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871062#comment-15871062 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r101673820

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -225,8 +287,16 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               throw new PartialGroupNameException("failed to get group id list for " +
                   "user '" + userName + "'", ece);
             } catch (IOException ioe) {
    -          throw new PartialGroupNameException("can't execute the shell command to"
    -              + " get the list of group id for user '" + userName + "'", ioe);
    +          String message =
    +              "Can't execute the shell command to " +
    +              "get the list of group id for user '" + userName + "'";
    +          if (exec2.isTimedOut()) {
    +            message +=
    +                " because of the command taking longer than " +
    +                "the configured timeout: " + timeout + " seconds";
    +            throw new PartialGroupNameException(message);
    --- End diff --

    Thanks, addressed in new commit. I just felt the timeout exception may look weird, but I've dropped the line so we can be consistent in exposing the exception at all times.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871061#comment-15871061 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r101673744

    --- Diff: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java ---
    @@ -22,19 +22,32 @@
     import org.apache.commons.logging.Log;
     import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.CommonConfigurationKeys;
    +import org.apache.hadoop.test.GenericTestUtils;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.hadoop.util.Shell;
     import org.apache.hadoop.util.Shell.ExitCodeException;
     import org.apache.hadoop.util.Shell.ShellCommandExecutor;
     import org.junit.Test;
    +
     import static org.junit.Assert.*;
     import static org.mockito.Mockito.doNothing;
     import static org.mockito.Mockito.doThrow;
     import static org.mockito.Mockito.mock;
     import static org.mockito.Mockito.when;

     public class TestShellBasedUnixGroupsMapping {
    -  private static final Log LOG =
    +  private static final Log TESTLOG =
           LogFactory.getLog(TestShellBasedUnixGroupsMapping.class);

    +  private final GenericTestUtils.LogCapturer shellMappingLog =
    +      GenericTestUtils.LogCapturer.captureLogs(
    +          ShellBasedUnixGroupsMapping.LOG);
    +
    +  private static final boolean WINDOWS =
    +      (Shell.osType == Shell.OSType.OS_TYPE_WIN);
    --- End diff --

    Thank you, addressed in new commit.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870471#comment-15870471 ]

Wei-Chiu Chuang commented on HADOOP-13817:
------------------------------------------

Okay, I did a quick review and found only two nits after Xiaoyu's numerous reviews.
By the way, Yetus does not support precommit checks if the patch is a GitHub pull request, right? I am not sure how to make that happen... Maybe attaching a patch to this JIRA would trigger that.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870468#comment-15870468 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user jojochuang commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r101592507

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -225,8 +287,16 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               throw new PartialGroupNameException("failed to get group id list for " +
                   "user '" + userName + "'", ece);
             } catch (IOException ioe) {
    -          throw new PartialGroupNameException("can't execute the shell command to"
    -              + " get the list of group id for user '" + userName + "'", ioe);
    +          String message =
    +              "Can't execute the shell command to " +
    +              "get the list of group id for user '" + userName + "'";
    +          if (exec2.isTimedOut()) {
    +            message +=
    +                " because of the command taking longer than " +
    +                "the configured timeout: " + timeout + " seconds";
    +            throw new PartialGroupNameException(message);
    --- End diff --

    Maybe this line is not needed if it will throw the exception anyway? The only difference is that it will not carry the original exception.
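The review point above is that throwing inside the timeout branch with only a message drops the original IOException. A rough sketch of building the timeout-specific message while always attaching the cause follows. The nested PartialGroupNameException here is a local stand-in for the type named in the diff, and the wrap helper and its parameters are hypothetical, introduced only for illustration:

```java
import java.io.IOException;

// Hypothetical illustration of keeping the cause; not the Hadoop source.
final class CauseDemo {

  // Local stand-in for the exception type that appears in the diff.
  static class PartialGroupNameException extends IOException {
    PartialGroupNameException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Builds one message, extends it when the command timed out, and always
  // attaches the original exception so its stack trace is not lost.
  static PartialGroupNameException wrap(
      String userName, boolean timedOut, long timeoutSeconds, IOException ioe) {
    String message = "Can't execute the shell command to "
        + "get the list of group id for user '" + userName + "'";
    if (timedOut) {
      message += " because of the command taking longer than "
          + "the configured timeout: " + timeoutSeconds + " seconds";
    }
    return new PartialGroupNameException(message, ioe);
  }

  public static void main(String[] args) {
    IOException original = new IOException("process killed");
    PartialGroupNameException wrapped = wrap("alice", true, 3, original);
    System.out.println(wrapped.getMessage());
    System.out.println("cause kept: " + (wrapped.getCause() == original));
  }
}
```

With a single throw site, the timeout case and the generic failure case expose the underlying exception in the same way, which is the consistency the follow-up reply in this thread mentions.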
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870464#comment-15870464 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user jojochuang commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r101592163

    --- Diff: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedUnixGroupsMapping.java ---
    @@ -22,19 +22,32 @@
     import org.apache.commons.logging.Log;
     import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.CommonConfigurationKeys;
    +import org.apache.hadoop.test.GenericTestUtils;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.hadoop.util.Shell;
     import org.apache.hadoop.util.Shell.ExitCodeException;
     import org.apache.hadoop.util.Shell.ShellCommandExecutor;
     import org.junit.Test;
    +
     import static org.junit.Assert.*;
     import static org.mockito.Mockito.doNothing;
     import static org.mockito.Mockito.doThrow;
     import static org.mockito.Mockito.mock;
     import static org.mockito.Mockito.when;

     public class TestShellBasedUnixGroupsMapping {
    -  private static final Log LOG =
    +  private static final Log TESTLOG =
           LogFactory.getLog(TestShellBasedUnixGroupsMapping.class);

    +  private final GenericTestUtils.LogCapturer shellMappingLog =
    +      GenericTestUtils.LogCapturer.captureLogs(
    +          ShellBasedUnixGroupsMapping.LOG);
    +
    +  private static final boolean WINDOWS =
    +      (Shell.osType == Shell.OSType.OS_TYPE_WIN);
    --- End diff --

    I think you can just use Shell.WINDOWS instead.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870233#comment-15870233 ]

Wei-Chiu Chuang commented on HADOOP-13817:
------------------------------------------

Hey [~qwertymaniac] this is a big patch, but sure I'll review it today.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870058#comment-15870058 ]

Harsh J commented on HADOOP-13817:
----------------------------------

[~jojochuang] - Could you help in reviewing this? Current diff is at https://github.com/apache/hadoop/pull/161
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694982#comment-15694982 ]

Harsh J commented on HADOOP-13817:
----------------------------------

[~xyao] - Thanks for the reviews so far. Could you take a look at the current patch-set?

[~yzhangal] - Could you help take a look at this one?
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672263#comment-15672263 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88366053

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -51,14 +52,16 @@
           LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
       private long timeout = 0L;
    +  private final List<String> emptyGroupsList = new LinkedList<>();
    --- End diff --

    Done in the latest commit.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672262#comment-15672262 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88366026

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java ---
    @@ -517,15 +517,15 @@
        *
        * core-default.xml
        */
    -  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT =
    -      "hadoop.security.groups.shell.groups.command.timeout";
    +  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT_SECS =
    +      "hadoop.security.groups.shell.command.timeout.secs";
    --- End diff --

    I'd just felt it would be clearer to have '.secs' in it indicating the minimum unit type despite the parser. I've removed that in the latest commit and added some examples into its doc-text in the xml.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667914#comment-15667914 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user xiaoyuyao commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88084092

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -51,14 +52,16 @@
           LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
       private long timeout = 0L;
    +  private final List<String> emptyGroupsList = new LinkedList<>();
    --- End diff --

    This can be static.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667901#comment-15667901 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user xiaoyuyao commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88082752

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java ---
    @@ -517,15 +517,15 @@
        *
        * core-default.xml
        */
    -  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT =
    -      "hadoop.security.groups.shell.groups.command.timeout";
    +  public static final String HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT_SECS =
    +      "hadoop.security.groups.shell.command.timeout.secs";
    --- End diff --

    I think we should keep "hadoop.security.groups.shell.groups.command.timeout". Sorry I was not clear about this when suggesting Configuration#getTimeDuration. It allows admins to specify time values with a suffix such as 10s, 1m, 2h, so the ".secs" won't be needed here.
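The review comment above says that Configuration#getTimeDuration accepts values with a unit suffix such as 10s, 1m or 2h, which is why the key name does not need a ".secs" ending. The following is only a standalone sketch of that idea and is not Hadoop's implementation: the class `DurationSuffix` and the method `toSeconds` are hypothetical names, only the three suffixes quoted in the comment are handled, and treating a bare number as seconds is an assumption made for this example.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class DurationSuffix {
  private DurationSuffix() {}

  /**
   * Converts "10s", "1m" or "2h" into seconds.
   * Assumption: a value without a suffix is already in seconds.
   */
  static long toSeconds(String value) {
    String v = value.trim().toLowerCase(Locale.ROOT);
    if (v.isEmpty()) {
      throw new IllegalArgumentException("empty duration");
    }
    TimeUnit unit;
    switch (v.charAt(v.length() - 1)) {
      case 's': unit = TimeUnit.SECONDS; break;
      case 'm': unit = TimeUnit.MINUTES; break;
      case 'h': unit = TimeUnit.HOURS; break;
      default:  unit = null;  // no recognised suffix
    }
    // Strip the suffix character only when one was recognised.
    String digits = (unit == null) ? v : v.substring(0, v.length() - 1).trim();
    long amount = Long.parseLong(digits);
    return (unit == null ? TimeUnit.SECONDS : unit).toSeconds(amount);
  }

  public static void main(String[] args) {
    for (String s : new String[] {"10s", "1m", "2h", "0"}) {
      System.out.println(s + " -> " + toSeconds(s) + " seconds");
    }
  }
}
```

With a parser of this kind the unit travels with the value, so an admin can write whichever unit is most readable without the key name committing to one.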
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667072#comment-15667072 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88010241

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +171,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return new LinkedList<>();
    --- End diff --

    Done in added commit.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667071#comment-15667071 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user QwertyManiac commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r88010210

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -37,11 +43,23 @@
      */
     @InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
     @InterfaceStability.Evolving
    -public class ShellBasedUnixGroupsMapping
    +public class ShellBasedUnixGroupsMapping extends Configured
       implements GroupMappingServiceProvider {
    -
    -  private static final Log LOG =
    -    LogFactory.getLog(ShellBasedUnixGroupsMapping.class);
    +
    +  @VisibleForTesting
    +  protected static final Logger LOG =
    +      LoggerFactory.getLogger(ShellBasedUnixGroupsMapping.class);
    +
    +  private long timeout = 0L;
    +
    +  @Override
    +  public void setConf(Configuration conf) {
    +    super.setConf(conf);
    +    timeout = conf.getLong(
    --- End diff --

    Done in added commit.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664544#comment-15664544 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user xiaoyuyao commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r87855020

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +171,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return new LinkedList<>();
    +        }
    +      } catch (IOException ioe) {
    +        // If its a shell executor timeout, indicate so in the message
    +        // but treat the result as empty instead of throwing it up,
    +        // similar to how partial resolution failures are handled above
    +        if (executor.isTimedOut()) {
    +          LOG.warn(
    +              "Unable to return groups for user '{}' as shell group lookup " +
    +              "command '{}' ran longer than the configured timeout limit of " +
    +              "{} seconds.",
    +              user,
    +              Arrays.asList(executor.getExecString()),
    +              timeout
    +          );
    +          return new LinkedList<>();
    --- End diff --

    flyweight pattern applies here as well.
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664541#comment-15664541 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user xiaoyuyao commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r87854835

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +171,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(),
                   executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return new LinkedList<>();
    --- End diff --

    Can we use the flyweight pattern to minimize memory usage for the empty LinkedList?
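The flyweight suggestion in the review amounts to handing out one shared empty list instead of allocating a new LinkedList on every failed lookup. A minimal sketch of that idea follows, using only the JDK. The class `EmptyGroups` and the method `orEmpty` are hypothetical names for illustration, and `Collections.emptyList()` is a choice made for this sketch, not necessarily what the patch ended up using (the thread only shows an `emptyGroupsList` field backed by a LinkedList).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of sharing one empty result; not Hadoop code. */
final class EmptyGroups {
  private EmptyGroups() {}

  /** Single immutable instance handed out for every "no groups" result. */
  static final List<String> EMPTY_GROUPS = Collections.emptyList();

  /** Returns the shared empty list when nothing was resolved. */
  static List<String> orEmpty(List<String> resolved) {
    return (resolved == null || resolved.isEmpty()) ? EMPTY_GROUPS : resolved;
  }

  public static void main(String[] args) {
    List<String> a = orEmpty(null);
    List<String> b = orEmpty(Collections.<String>emptyList());
    // Both calls hand back the same shared instance, so nothing is allocated per call.
    System.out.println(a == b);
    System.out.println(orEmpty(Arrays.asList("staff", "wheel")));
  }
}
```

One design point to weigh: a shared instance should be immutable, since a caller that adds to a shared mutable LinkedList would change the "empty" result for every later lookup. With `Collections.emptyList()` such a caller gets an UnsupportedOperationException instead.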
[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663747#comment-15663747 ]

ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

GitHub user QwertyManiac opened a pull request:

    https://github.com/apache/hadoop/pull/161

    HADOOP-13817. Add a finite shell command timeout to ShellBasedUnixGroupsMapping.

    - Tests required log capture, so the LOG variable was elevated in visibility, which required changes in a few test methods otherwise unrelated to this change.
    - The timeout is applied to both the regular groups command and the partial group listing check command.
    - The default is left at 0 so as not to break current behaviour (i.e. to wait forever until groups are resolvable).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QwertyManiac/hadoop HADOOP-13817

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #161

commit 6707438a05d46ac57b230d4e9ade59a1a935c9c3
Author: Harsh J
Date:   2016-11-14T10:29:58Z

    HADOOP-13817. Add a finite shell command timeout to ShellBasedUnixGroupsMapping.
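The behaviour the pull request describes (a timeout of 0 waits forever; a finite timeout gives up and returns the equivalent of no groups found) can be sketched with the JDK's own Process API. This is not Hadoop's ShellCommandExecutor and does not reproduce the patch: the class `TimedGroupLookup` and the method `lookup` are hypothetical, and returning an empty list on a non-zero exit code is a simplification assumed here (the real class falls back to partial group resolution in that case).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a group lookup with a timeout; not Hadoop code. */
final class TimedGroupLookup {
  private TimedGroupLookup() {}

  /**
   * Runs the command and returns its whitespace-separated output tokens.
   * A timeout of 0 or less waits indefinitely. On timeout, or on a non-zero
   * exit code (a simplification assumed here), the result is an empty list.
   */
  static List<String> lookup(List<String> command, long timeoutSeconds) {
    Path out = null;
    try {
      out = Files.createTempFile("groups", ".out");
      // Send output to a file so a full pipe buffer can never stall the child.
      Process process = new ProcessBuilder(command)
          .redirectErrorStream(true)
          .redirectOutput(out.toFile())
          .start();
      if (timeoutSeconds <= 0) {
        process.waitFor();                       // old behaviour: wait forever
      } else if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();               // give up on the hung command
        return Collections.emptyList();
      }
      if (process.exitValue() != 0) {
        return Collections.emptyList();
      }
      String text =
          new String(Files.readAllBytes(out), StandardCharsets.UTF_8).trim();
      return text.isEmpty()
          ? Collections.<String>emptyList()
          : Arrays.asList(text.split("\\s+"));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();        // preserve the interrupt flag
      return Collections.emptyList();
    } finally {
      if (out != null) {
        try {
          Files.deleteIfExists(out);
        } catch (IOException ignored) {
          // Best-effort cleanup of the temporary output file.
        }
      }
    }
  }
}
```

On a Unix-like system, a call such as `TimedGroupLookup.lookup(Arrays.asList("id", "-Gn", "someuser"), 10)` would bound the lookup to ten seconds; the user name and the exact `id` flags are placeholders, not values taken from the patch.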