[
https://issues.apache.org/jira/browse/HADOOP-17079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272430#comment-17272430
]
Xiaoyu Yao commented on HADOOP-17079:
-------------------------------------
Thanks [~daryn] for the comments. Here are my thoughts on adding a new method
for GroupCacheLoader#getGroupsSet.
Many GroupMappingServiceProvider implementations already use a Set
internally (e.g., LdapGroupsMapping#lookupGroup) or take an additional step to
dedup the list (e.g., ShellBasedUnixGroupsMapping). It is expensive to convert
back and forth between Set and List with the existing list-based
getGroups() method in the GroupMappingServiceProvider interface.
Can you elaborate on the proposal to change GroupCacheLoader#load? Can we avoid
the two conversions:
Set -> List (GroupMappingServiceProvider impl)
and List -> Set (GroupCacheLoader)?
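
To make the two conversions concrete, here is a minimal sketch. It is not the
Hadoop code: ListProvider, SetProvider, DedupingProvider, loadViaList and
loadViaSet are hypothetical names used only for illustration, and only the
getGroups/getGroupsSet method names mirror the ones discussed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GroupsSetSketch {

  /** Hypothetical stand-in for the existing list-based provider method. */
  interface ListProvider {
    List<String> getGroups(String user);
  }

  /** Hypothetical set-based variant, named after the proposed method. */
  interface SetProvider {
    Set<String> getGroupsSet(String user);
  }

  /** A provider that already dedups with a Set internally. */
  static final class DedupingProvider implements ListProvider, SetProvider {
    private final List<String> raw;

    DedupingProvider(String... raw) {
      this.raw = Arrays.asList(raw);
    }

    @Override
    public Set<String> getGroupsSet(String user) {
      // One copy into a Set; insertion order is preserved.
      return new LinkedHashSet<>(raw);
    }

    @Override
    public List<String> getGroups(String user) {
      // Conversion 1: Set -> List, only to satisfy the list-based interface.
      return new ArrayList<>(getGroupsSet(user));
    }
  }

  /** List-based path: the caller has to rebuild a Set from the List. */
  static Set<String> loadViaList(ListProvider provider, String user) {
    // Conversion 2: List -> Set on the cache-loader side.
    return new LinkedHashSet<>(provider.getGroups(user));
  }

  /** Set-based path: the provider's Set is used as-is, no extra copies. */
  static Set<String> loadViaSet(SetProvider provider, String user) {
    return provider.getGroupsSet(user);
  }

  public static void main(String[] args) {
    DedupingProvider provider = new DedupingProvider("staff", "hadoop", "staff");
    System.out.println(loadViaList(provider, "alice")); // prints [staff, hadoop]
    System.out.println(loadViaSet(provider, "alice"));  // prints [staff, hadoop]
  }
}
```

Both paths return the same groups; the list-based path just pays for two extra
copies per lookup, which adds up when a user has thousands of groups.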
> Optimize UGI#getGroups by adding UGI#getGroupsSet
> -------------------------------------------------
>
> Key: HADOOP-17079
> URL: https://issues.apache.org/jira/browse/HADOOP-17079
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Xiaoyu Yao
> Assignee: Xiaoyu Yao
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: HADOOP-17079.002.patch, HADOOP-17079.003.patch,
> HADOOP-17079.004.patch, HADOOP-17079.005.patch, HADOOP-17079.006.patch,
> HADOOP-17079.007.patch
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> UGI#getGroups has been optimized with HADOOP-13442 by avoiding the
> List->Set->List conversion. However, the returned list is not optimized for
> contains() lookups, especially when the user's group membership list is huge
> (thousands+). This ticket is opened to add a UGI#getGroupsSet and use
> Set#contains() instead of List#contains() to speed up large group lookups
> while minimizing List->Set conversions in the Groups#getGroups() call.