[jira] [Commented] (HADOOP-8088) User-group mapping cache incorrectly does negative caching on transient failures

Kihwal Lee (Commented) (JIRA) Fri, 16 Mar 2012 10:08:05 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231394#comment-13231394
 ]


Kihwal Lee commented on HADOOP-8088:
------------------------------------

We use a refresh cycle of 4 hrs. This becomes a serious issue when headless 
users associated with tight-SLA jobs get negatively cached, because their 
group lookups stay empty until the next refresh. This is not hypothetical; 
we've seen it in production. 

I do agree with you on the problem of having too many configs, but we often end 
up going this route out of (sometimes justifiable) fear of breaking 
compatibility, even when the bug is clearly there and in need of a fix. I would 
appreciate your further input on how we can address the issue with minimal 
negative impact. 
                
> User-group mapping cache incorrectly does negative caching on transient 
> failures
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-8088
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8088
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.20.205.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>            Reporter: Kihwal Lee
>             Fix For: 0.24.0, 1.1.0, 0.23.2
>
>         Attachments: hadoop-8088-branch-1.patch, hadoop-8088-trunk.patch, 
> hadoop-8088-trunk.patch
>
>
> We've seen a case where some getGroups() calls fail when the LDAP server or 
> the network is having transient failures. Looking at the code, the 
> shell-based and the JNI-based implementations swallow exceptions and return 
> an empty or partial list. The caller, Groups#getGroups(), adds this likely 
> empty list to the mapping cache for the user. This functions as negative 
> caching until the cache expires. I don't think we want negative caching 
> here, but even if we do, it should be intelligent enough to distinguish 
> transient failures from ENOENT. The log message in the JNI-based 
> implementation also needs improvement: it should print the exception it 
> encountered instead of just saying that one happened.
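
To illustrate the behavior the description asks for, here is a minimal 
sketch, not the attached patch. It assumes a hypothetical GroupLookup 
backend that throws IOException on a transient failure instead of 
swallowing it, and it caches only successful, non-empty results. The names 
GroupCacheSketch, GroupLookup, and isCached are invented for this example, 
and cache expiry is left out for brevity.

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: this is not Hadoop's Groups class.
class GroupCacheSketch {

    /** Hypothetical backend that reports failures instead of hiding them. */
    interface GroupLookup {
        List<String> lookup(String user) throws IOException;
    }

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final GroupLookup backend;

    GroupCacheSketch(GroupLookup backend) {
        this.backend = backend;
    }

    List<String> getGroups(String user) throws IOException {
        List<String> cached = cache.get(user);
        if (cached != null) {
            return cached;
        }
        // A transient failure surfaces here as an IOException, so control
        // never reaches the cache.put() below and nothing is cached.
        List<String> groups = backend.lookup(user);
        // An empty answer is returned but not cached (no negative caching).
        if (!groups.isEmpty()) {
            cache.put(user, groups);
        }
        return groups;
    }

    /** Test helper: reports whether a mapping is cached for the user. */
    boolean isCached(String user) {
        return cache.containsKey(user);
    }
}
```

With this shape, a lookup that fails while the directory server is 
unreachable is retried on the next call rather than being answered from a 
cached empty list until the refresh cycle ends.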

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
