[
https://issues.apache.org/jira/browse/HADOOP-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269646#comment-17269646
]
Jim Brennan commented on HADOOP-17485:
--------------------------------------
{quote}The cherry-pick was not clean:
* I needed to add getGroupsSet for both ShellBasedUnixGroupsNetgroupMapping
and {{JniBasedUnixGroupsNetgroupMapping}} to avoid the bug described in
HADOOP-17467{quote}
Why are you making this change as part of backporting HADOOP-17079?
Wouldn't it be better to pull back HADOOP-17467 separately when it is done?
{quote} * I had to replace Java-8 lambda expressions to be compatible with JDK7
* I replaced some usages of guava since 2.10 has older versions.
* Yetus generated several errors regarding deprecated getGroups. I replaced
all the calls with getGroupsSet which were mainly in unit tests. cleaning the
deprecated calls is not done in the trunk version.
* LDAPGroupMapping change was not compatible. So, I had to manually replace
getGroups.
* I replaced new HashSet with LinkedHashSet. The latter maintains the order of
insertion. This made the unit tests pass with less changes.
* In the unit tests, I used Assert.Equals(Set1, Set2) to compare between two
sets. Again, this change does not exist in trunk because it never used the
getGroupsSet.{quote}
Seems like most of these would be good to have in trunk as well. Why not make
these changes in trunk and then pull that Jira back to 2.10?
It's confusing to combine changes like this as part of back-porting a single
change. If you need to pull back multiple Jiras at once, that is ok, but I
would not expect so many additional changes in a back-port. We generally try
to minimize the changes when back-porting.
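For context on the LinkedHashSet and set-comparison items quoted above, here is a minimal sketch using plain JDK collections. It is not Hadoop code: the class name GroupSetSketch and the helper toOrderedSet are made up for illustration. It only shows that LinkedHashSet iterates in insertion order, and that Set equality ignores order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GroupSetSketch {
    // Hypothetical helper (not part of Hadoop): copy a group list into a set
    // that remembers insertion order. Written without lambdas for JDK7.
    static Set<String> toOrderedSet(List<String> groups) {
        return new LinkedHashSet<String>(groups);
    }

    public static void main(String[] args) {
        List<String> groups = Arrays.asList("staff", "admin", "dev", "admin");

        // Duplicates are dropped; iteration follows first insertion.
        Set<String> ordered = toOrderedSet(groups);
        System.out.println(new ArrayList<String>(ordered)); // [staff, admin, dev]

        // Set.equals compares membership only, so a HashSet holding the
        // same elements is equal regardless of its iteration order.
        Set<String> unordered = new HashSet<String>(groups);
        System.out.println(ordered.equals(unordered)); // true

        // Membership checks go through hashing rather than a list scan.
        System.out.println(ordered.contains("dev")); // true
    }
}
```

A test that only compares two sets for equality would pass with either implementation, so the insertion-order guarantee matters only where a test or caller depends on iteration order.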
> port UGI#getGroupsSet optimizations into 2.10
> ---------------------------------------------
>
> Key: HADOOP-17485
> URL: https://issues.apache.org/jira/browse/HADOOP-17485
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> HADOOP-17079 introduced an optimization that adds UGI#getGroupsSet and uses
> Set#contains() instead of List#contains() to speed up large group lookups
> while minimizing List->Set conversions in the Groups#getGroups() call.
> This ticket is to port the changes into branch-2.10.
>
> CC: [~Jim_Brennan], [~xyao]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)