[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197274#comment-15197274 ]

sanjay kenganahalli vamanna commented on HDFS-9956:
---

Allen,

Thanks for replying. This JIRA is about the timeout that causes the namenode failure. I know you are talking about getting faster LDAP responses through a caching daemon. That may speed up the LDAP response, but it is not related to the timeout (10 sec).

Thanks,
Sanjay

> LDAP PERFORMANCE ISSUE AND FAIL OVER
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: sanjay kenganahalli vamanna
>
> LDAP group name resolution works well under typical scenarios.
> However, we have seen cases where a user is mapped to many groups (in an
> extreme case, a user is mapped to more than 100 groups). With the current
> implementation, resolving groups from ActiveDirectory is very slow in this
> case, and it causes the namenode to fail over.
> Instead of failing over, we can use the
> parameter (ha.zookeeper.session-timeout.ms) in the getgroups method to
> time out and send a failed response back to the user, so that we can prevent
> the namenode failover.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
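For illustration, the proposal in the issue description (bound the group lookup with a timeout and return a failure to the caller instead of blocking) could be sketched roughly as below. This is a minimal sketch and not Hadoop code: the class `BoundedGroupLookup`, its `getGroups` helper, and the choice of `IOException` for the failed response are assumptions made for the example. Where the deadline value would come from is left to the caller.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of Hadoop: bounds a group lookup with a deadline. */
class BoundedGroupLookup {
    // Daemon worker threads, so a stuck lookup cannot keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "group-lookup");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the lookup on a worker thread and waits at most timeoutMs.
     * On timeout the caller gets an IOException instead of blocking further.
     */
    List<String> getGroups(Callable<List<String>> lookup, long timeoutMs) throws IOException {
        Future<List<String>> future = pool.submit(lookup);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker; a blocking LDAP call may or may not react to this.
            future.cancel(true);
            throw new IOException("group lookup timed out after " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            throw new IOException("group lookup failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("group lookup interrupted", e);
        }
    }
}
```

A caller would pass the real lookup as the `Callable`, for example `new BoundedGroupLookup().getGroups(() -> resolveFromLdap(user), 5000)`, where `resolveFromLdap` is again a hypothetical name. The point of the design is that the calling thread is released at the deadline even if the directory server has not answered.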
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196443#comment-15196443 ]

Allen Wittenauer commented on HDFS-9956:
---

Is a naming services caching daemon being used, or is this just a raw LDAP connection?
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194756#comment-15194756 ]

Wei-Chiu Chuang commented on HDFS-9956:
---

It sounds to me like a really bad thing that LDAP group mapping could fail over a namenode. We should investigate why this is happening. I've worked on a new LDAP group mapping implementation, but I've not finished it yet. I'll prioritize that too.
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193640#comment-15193640 ]

sanjay kenganahalli vamanna commented on HDFS-9956:
---

The default for ha.zookeeper.session-timeout.ms is 5 secs; it has to be greater than hadoop.security.group.mapping.ldap.directory.search.timeout (default 10 secs). We increased "ha.zookeeper.session-timeout.ms" to 20 secs but still have the issue.
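The relationship between the two settings can be written down as a small check. This is only a sketch under stated assumptions: `TimeoutBudget` and its methods are hypothetical names, the defaults are the 5 s and 10 s values quoted in this comment, and `searchesPerLookup` models the guess that one group lookup may issue more than one LDAP search (or retry), each allowed to run up to the search timeout. Nothing in this thread confirms how many searches a lookup actually performs.

```java
import java.util.Map;

/** Hypothetical sanity check, not part of Hadoop. */
class TimeoutBudget {
    static final String ZK_TIMEOUT = "ha.zookeeper.session-timeout.ms";
    static final String LDAP_TIMEOUT =
        "hadoop.security.group.mapping.ldap.directory.search.timeout";

    /** Worst-case time one lookup may block, assuming every search uses its full limit. */
    static long worstCaseLookupMs(Map<String, String> conf, int searchesPerLookup) {
        // Default of 10000 ms is the value quoted in the thread.
        long ldapMs = Long.parseLong(conf.getOrDefault(LDAP_TIMEOUT, "10000"));
        return ldapMs * searchesPerLookup;
    }

    /** True if the ZooKeeper session timeout is longer than the worst-case lookup. */
    static boolean sessionOutlastsLookup(Map<String, String> conf, int searchesPerLookup) {
        // Default of 5000 ms is the value quoted in the thread.
        long zkMs = Long.parseLong(conf.getOrDefault(ZK_TIMEOUT, "5000"));
        return zkMs > worstCaseLookupMs(conf, searchesPerLookup);
    }
}
```

With the defaults the check fails (5 s is not greater than 10 s). With the session timeout raised to 20 s it passes for a single search, but it would fail again if a lookup needed three full-length searches (30 s), which would be one possible reading of why 20 s did not help.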
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193620#comment-15193620 ]

sanjay kenganahalli vamanna commented on HDFS-9956:
---

The default of 10 secs is not working, and we have been facing the same problem for many days. We don't want to keep the users in static bindings, and we don't want to use the Unix shell mapping either.
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193587#comment-15193587 ]

Wei-Chiu Chuang commented on HDFS-9956:
---

That is implemented in HADOOP-9322, and you should be able to use it since Hadoop 2.1.0-beta or, if you're using CDH, since CDH 4.3.0.
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193579#comment-15193579 ]

sanjay kenganahalli vamanna commented on HDFS-9956:
---

Thanks for replying. Which version of Hadoop has this parameter?
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193536#comment-15193536 ]

Wei-Chiu Chuang commented on HDFS-9956:
---

Hi [~sanjayvamanna], thanks for reporting the issue and offering workarounds. The parameter {{hadoop.security.group.mapping.ldap.directory.search.timeout}} is supposed to stop queries that run over time. Would this parameter work in your scenario?
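As background on how a search timeout of this kind can be enforced on the client side, the standard JNDI API lets a caller attach a time limit to an LDAP search through `SearchControls`. The sketch below assumes the Hadoop setting is applied in a comparable way, which this thread does not confirm; the class `LdapSearchLimits` and the method `withTimeout` are hypothetical names used only for the example.

```java
import javax.naming.directory.SearchControls;

/** Hypothetical helper showing a per-search time limit with plain JNDI. */
class LdapSearchLimits {
    /** Builds search controls whose time limit is timeoutMs milliseconds. */
    static SearchControls withTimeout(int timeoutMs) {
        SearchControls controls = new SearchControls();
        // Search the whole subtree below the base DN.
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        // Time limit in milliseconds; a value of 0 means wait indefinitely.
        controls.setTimeLimit(timeoutMs);
        return controls;
    }
}
```

The controls object would then be passed to a `DirContext.search(...)` call. Note that the limit applies to each search separately, so a lookup made of several searches can still take longer in total than one timeout.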