[ 
https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460948#comment-13460948
 ] 

Gary Helmling commented on HBASE-6851:
--------------------------------------

bq. Cache is unbounded. Any fear of it getting large? (I suppose the old code 
worked this way so this patch doesn't introduce that issue).

Making the cache bounded/not all memory resident would be a pretty significant 
change and have potentially big (negative) impacts on performance.  There may 
be some cases of many tables + many users where this may start to become a 
problem.  But for the expected usage (maybe hundreds of tables, hundreds of 
users), I wouldn't expect this to be much of an issue.

bq. Are all accesses on GLOBAL_CACHE synchronized? Do they need to be? Could we 
get a NPE again when we swap it in and are accessing it elsewhere concurrently?

No, accesses are not synchronized (and I don't think we want them to be for 
performance reasons).  Since we're rebuilding and re-assigning the variable 
reference, I don't think we're exposed to an NPE here.

bq. Does updateTableCache need synchronize?

The table cache is a ConcurrentSkipListMap so we should be able to reassign 
variables atomically.  But maybe we do want to serialize calls to this as well? 
 Seems like both updateGlobalCache() and updateTableCache() need to be 
synchronized, or neither does.  Since these are only called from the associated 
ZKPermissionWatcher/ZooKeeperWatcher, is there any need to even synchronize 
these to force serialization of the calls?  Does the ZK ordering of the watch 
events give us this already?  If I understand correctly, maybe we don't need 
the synchronization at all.

Should we make the GLOBAL_CACHE reference volatile, though, to ensure all 
threads see it when updated?
                
> Race condition in TableAuthManager.updateGlobalCache()
> ------------------------------------------------------
>
>                 Key: HBASE-6851
>                 URL: https://issues.apache.org/jira/browse/HBASE-6851
>             Project: HBase
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.94.1, 0.96.0
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>            Priority: Critical
>         Attachments: HBASE-6851.patch
>
>
> When new global permissions are assigned, there is a race condition, during 
> which further authorization checks relying on global permissions may fail.
> In TableAuthManager.updateGlobalCache(), we have:
> {code:java}
>     USER_CACHE.clear();
>     GROUP_CACHE.clear();
>     try {
>       initGlobal(conf);
>     } catch (IOException e) {
>       // Never happens
>       LOG.error("Error occured while updating the user cache", e);
>     }
>     for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
>       if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
>         GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
>                         new Permission(entry.getValue().getActions()));
>       } else {
>         USER_CACHE.put(entry.getKey(), new 
> Permission(entry.getValue().getActions()));
>       }
>     }
> {code}
> If authorization checks come in following the .clear() but before 
> repopulating, they will fail.
> We should have some synchronization here to serialize multiple updates and 
> use a COW type rebuild and reassign of the new maps.
> This particular issue crept in with the fix in HBASE-6157, so I'm flagging 
> for 0.94 and 0.96.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to