[
https://issues.apache.org/jira/browse/HADOOP-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daryn Sharp updated HADOOP-13442:
---------------------------------
Attachment: HADOOP-13442.patch
# Changed group provider to cache the de-duped list instead of the raw list.
# Added new {{UGI#getGroups}} that returns the aforementioned de-duped list.
# Changed {{UGI#getPrimaryGroup}} to call {{UGI#getGroups}} to avoid an array
copy
# Removed unnecessary synchronization of {{UGI#getGroups}} method. This
required a minor tweak to make {{Groups#getGroups}} thread-safe. It was already
called elsewhere without synchronization, so the tweak just makes that usage
safe. The change reduces contention on cached token->ugi instances.
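
For readers without the patch at hand, the idea behind the list above can be sketched roughly as follows. This is an illustrative sketch only, not the code in HADOOP-13442.patch: the class {{DedupedGroupCache}}, its constructor, and the {{Function}}-based provider are hypothetical names chosen for the example. It assumes the de-duped list is built once per user, stored as an immutable list, and read without a {{synchronized}} method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical sketch; names and structure are assumptions, not Hadoop's API.
final class DedupedGroupCache {
    // user -> immutable, de-duplicated group list (insertion order preserved)
    private final ConcurrentMap<String, List<String>> cache = new ConcurrentHashMap<>();
    // Stand-in for the real group provider; may return duplicates.
    private final Function<String, List<String>> provider;

    DedupedGroupCache(Function<String, List<String>> provider) {
        this.provider = provider;
    }

    // Not synchronized: ConcurrentHashMap coordinates concurrent callers,
    // and the cached value is unmodifiable, so it can be shared safely.
    List<String> getGroups(String user) {
        return cache.computeIfAbsent(user, u ->
            Collections.unmodifiableList(
                new ArrayList<>(new LinkedHashSet<>(provider.apply(u)))));
    }

    // Reads the first element of the cached list; no array copy is made.
    String getPrimaryGroup(String user) {
        List<String> groups = getGroups(user);
        if (groups.isEmpty()) {
            throw new IllegalStateException("No primary group for user " + user);
        }
        return groups.get(0);
    }

    public static void main(String[] args) {
        DedupedGroupCache groupCache =
            new DedupedGroupCache(u -> Arrays.asList("staff", "admin", "staff"));
        System.out.println(groupCache.getGroups("alice"));
        System.out.println(groupCache.getPrimaryGroup("alice"));
    }
}
```

Because the cached list is immutable, membership checks can call {{contains}} on it directly instead of converting an array back into a list.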
> Optimize UGI group lookups
> --------------------------
>
> Key: HADOOP-13442
> URL: https://issues.apache.org/jira/browse/HADOOP-13442
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: HADOOP-13442.patch
>
>
> {{UGI#getGroups}} and its usage are inefficient. The list is unnecessarily
> converted to multiple collections.
> For _every_ invocation, the {{List<String>}} from the group provider is
> converted into a {{LinkedHashSet<String>}} (to de-dup) and then back to a
> {{String[]}}. Callers testing for group membership then convert it back to a
> {{List<String>}}. This should be done once to reduce allocations.
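
The conversion chain in the description can be illustrated with a small standalone sketch. The class and method names below ({{GroupConversionSketch}}, {{getGroupNamesPerCall}}, {{isMemberPerCall}}, {{isMember}}) are hypothetical and are not Hadoop's actual methods. The sketch only assumes the pattern described above: a set and an array are allocated on every call, and the caller then wraps the array in a list again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the per-call conversions; not Hadoop code.
final class GroupConversionSketch {
    // Per-call pattern: List -> LinkedHashSet (de-dup) -> String[].
    // Every invocation allocates a new set and a new array.
    static String[] getGroupNamesPerCall(List<String> rawGroups) {
        Set<String> deduped = new LinkedHashSet<>(rawGroups);
        return deduped.toArray(new String[0]);
    }

    // A caller testing membership wraps the array in a List once more.
    static boolean isMemberPerCall(List<String> rawGroups, String group) {
        return Arrays.asList(getGroupNamesPerCall(rawGroups)).contains(group);
    }

    // Alternative: the list was de-duplicated once up front, so a
    // membership test needs no further conversion or allocation.
    static boolean isMember(List<String> dedupedOnce, String group) {
        return dedupedOnce.contains(group);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("staff", "admin", "staff");
        // De-duplicate a single time and reuse the result for all checks.
        List<String> dedupedOnce = new ArrayList<>(new LinkedHashSet<>(raw));

        System.out.println(isMemberPerCall(raw, "admin"));
        System.out.println(isMember(dedupedOnce, "admin"));
    }
}
```

Both membership checks give the same answer. The difference is only in how many intermediate collections each call creates.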
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)