[ 
https://issues.apache.org/jira/browse/HADOOP-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269646#comment-17269646
 ] 

Jim Brennan commented on HADOOP-17485:
--------------------------------------

{quote}The cherry-pick was not clean:
 * I needed to add getGroupsSet for both ShellBasedUnixGroupsNetgroupMapping 
and {{JniBasedUnixGroupsNetgroupMapping}} to avoid the bug described in 
HADOOP-17467{quote}
Why are you making this change as part of backporting HADOOP-17079? 
Wouldn't it be better to pull back HADOOP-17467 separately when it is done?
{quote} * I had to replace Java-8 lambda expressions to be compatible with JDK7
 * I replaced some usages of guava since 2.10 has older versions.
 * Yetus generated several errors regarding the deprecated getGroups. I replaced 
all the calls, which were mainly in unit tests, with getGroupsSet. Cleaning up 
the deprecated calls has not been done in the trunk version.
 * LDAPGroupMapping change was not compatible. So, I had to manually replace 
getGroups.
 * I replaced new HashSet with LinkedHashSet. The latter maintains the order of 
insertion. This made the unit tests pass with fewer changes.
 * In the unit tests, I used Assert.assertEquals(set1, set2) to compare two 
sets. Again, this change does not exist in trunk because it never used 
getGroupsSet.{quote}
Seems like most of these would be good to have in trunk as well. Why not make 
these changes in trunk and then pull that Jira back to 2.10?
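
For reference, the LinkedHashSet and set-comparison points from the quoted list can be sketched as below. This is only an illustration, not code from the patch: the class name GroupSetOrderSketch and the helper toOrderedSet are hypothetical, and it is written without lambdas or diamond syntax so it compiles on JDK7.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GroupSetOrderSketch {
    // Hypothetical helper: copy group names into a set that keeps insertion order.
    static Set<String> toOrderedSet(List<String> groups) {
        return new LinkedHashSet<String>(groups);
    }

    public static void main(String[] args) {
        List<String> groups = Arrays.asList("staff", "admin", "dev", "admin");
        Set<String> ordered = toOrderedSet(groups);

        // LinkedHashSet iterates in insertion order, with duplicates dropped,
        // so tests that assumed list ordering keep seeing the same sequence.
        System.out.println(new ArrayList<String>(ordered)); // [staff, admin, dev]

        // Set equality ignores both iteration order and the concrete Set class,
        // so comparing two sets directly is enough in a unit test.
        Set<String> unordered = new HashSet<String>(Arrays.asList("dev", "staff", "admin"));
        System.out.println(ordered.equals(unordered)); // true
    }
}
```

With a plain HashSet the iteration order is unspecified, which is presumably why tests written against list ordering needed fewer edits once LinkedHashSet was used.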

It's confusing to combine changes like this as part of back-porting a single 
change. If you need to pull back multiple Jiras at once, that is ok, but I 
would not expect so many additional changes in a back-port.  We generally try 
to minimize the changes when back-porting.

> port UGI#getGroupsSet optimizations into 2.10
> ---------------------------------------------
>
>                 Key: HADOOP-17485
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17485
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17079 introduced an optimization that adds UGI#getGroupsSet and uses 
> Set#contains() instead of List#contains() to speed up large group lookups 
> while minimizing List->Set conversions in the Groups#getGroups() call.
> This ticket is to port the changes into branch-2.10.
>  
> CC: [~Jim_Brennan], [~xyao]
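
The idea behind the optimization described above can be sketched roughly as follows. This is not the Hadoop implementation: GroupLookupSketch, inGroupList, and inGroupSet are hypothetical names used only to contrast a linear List#contains() scan with a hashed Set#contains() lookup on a set that is built once and reused.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GroupLookupSketch {
    // Linear scan: cost grows with the number of groups.
    static boolean inGroupList(List<String> groups, String name) {
        return groups.contains(name);
    }

    // Hash lookup: expected constant time regardless of group count.
    static boolean inGroupSet(Set<String> groups, String name) {
        return groups.contains(name);
    }

    public static void main(String[] args) {
        List<String> groupList = new ArrayList<String>();
        for (int i = 0; i < 10000; i++) {
            groupList.add("group" + i);
        }
        // Convert once and reuse the set for every membership check,
        // rather than rebuilding it from the list on each call.
        Set<String> groupSet = new LinkedHashSet<String>(groupList);

        System.out.println(inGroupList(groupList, "group9999")); // true
        System.out.println(inGroupSet(groupSet, "group9999"));   // true
        System.out.println(inGroupSet(groupSet, "nobody"));      // false
    }
}
```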


