[ 
https://issues.apache.org/jira/browse/RANGER-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wojciech Gasior updated RANGER-5620:
------------------------------------
    Description: 
On {*}usersync restart{*}, the following error surfaces and causes group 
memberships to be dropped for a large number of users, impacting their access:
 
 {{ERROR o.a.r.l.p.LdapUserGroupBuilder [UnixUserSyncThread] - Failed to update ranger admin.Will retry in next sync cycle!!
java.lang.Exception: Failed to addorUpdate users to ranger admin
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsers(PolicyMgrUserGroupBuilder.java:588)
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups(PolicyMgrUserGroupBuilder.java:329)
  at org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder.updateSink(LdapUserGroupBuilder.java:417)
  at org.apache.ranger.usergroupsync.UserGroupSync.syncUserGroup(UserGroupSync.java:115)}}
In the initial failure, the root cause was a *firewall policy blocking LDAP 
calls* from the Ranger container: DNS resolved to EU GC IPs, but the firewall 
rules allowed only NA GC IPs. When the LDAP phase returned 0 groups and 0 
users, {{PolicyMgrUserGroupBuilder}} dereferenced a null delta map:
 
 {{java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "this.deltaGroupUsers" is null
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups(PolicyMgrUserGroupBuilder.java:372)}}
Production has been restored, but the {*}"Failed to addorUpdate users to ranger 
admin" error recurs on each usersync restart{*}, clearing group associations 
for a large number of users.
h2. Customer Impact
 * *Severity:* High (production access impacted during incident; ongoing 
restart behavior clears group associations)

 * *Affected users:* A large portion of the ~6,800 expected Starburst users 
lose group associations on restart

 * *Behavior:* Usersync should not be clearing existing group associations on 
restart; this is an unexpected regression

h2. Root Cause Analysis

The {{PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups()}} method does not 
guard against a null {{deltaGroupUsers}} map when the LDAP source returns an 
empty result set (0 groups, 0 users). Instead of gracefully skipping the update 
cycle, it throws an NPE that propagates as a generic "Failed to addorUpdate 
users" error. The null delta path then causes existing user-group associations 
in Ranger to be dropped.
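
A minimal sketch of the kind of guard described above, not the actual Ranger code. Only {{deltaGroupUsers}} and {{addOrUpdateUsersGroups}} come from the stack trace; {{DeltaSyncSketch}}, {{storedGroupUsers}}, {{setDelta}}, {{stored}} and the boolean return value are hypothetical names introduced for illustration. It assumes that a null or empty delta means the source phase produced nothing, and skips the cycle rather than touching stored memberships.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the sink; NOT the real PolicyMgrUserGroupBuilder. */
class DeltaSyncSketch {
    // Same name as the field in the NPE message; stays null if the source phase produced nothing.
    private Map<String, Set<String>> deltaGroupUsers;

    // Stand-in for the group memberships already held by Ranger Admin (assumption).
    private final Map<String, Set<String>> storedGroupUsers = new HashMap<>();

    DeltaSyncSketch(Map<String, Set<String>> stored) {
        // Defensive copy so the sketch can mutate its own sets.
        stored.forEach((group, users) -> storedGroupUsers.put(group, new HashSet<>(users)));
    }

    void setDelta(Map<String, Set<String>> delta) {
        this.deltaGroupUsers = delta;
    }

    /** Returns true if the delta was applied, false if the cycle was skipped. */
    boolean addOrUpdateUsersGroups() {
        if (deltaGroupUsers == null || deltaGroupUsers.isEmpty()) {
            // Nothing came back from the source: skip instead of dereferencing
            // a null map or overwriting existing memberships.
            return false;
        }
        for (Map.Entry<String, Set<String>> entry : deltaGroupUsers.entrySet()) {
            storedGroupUsers
                .computeIfAbsent(entry.getKey(), group -> new HashSet<>())
                .addAll(entry.getValue());
        }
        return true;
    }

    Map<String, Set<String>> stored() {
        return Collections.unmodifiableMap(storedGroupUsers);
    }
}
```

With this shape, a cycle in which LDAP is unreachable leaves memberships as they were. The trade-off is that a directory that is legitimately empty would never be propagated, so a real fix may need to distinguish a failed lookup from an empty one.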

  was:
On {*}usersync restart{*}, the following error surfaces and causes group 
memberships to be dropped for a large number of users, impacting their access:
 
 {{ERROR o.a.r.l.p.LdapUserGroupBuilder [UnixUserSyncThread] - Failed to update ranger admin.Will retry in next sync cycle!!
java.lang.Exception: Failed to addorUpdate users to ranger admin
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsers(PolicyMgrUserGroupBuilder.java:588)
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups(PolicyMgrUserGroupBuilder.java:329)
  at org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder.updateSink(LdapUserGroupBuilder.java:417)
  at org.apache.ranger.usergroupsync.UserGroupSync.syncUserGroup(UserGroupSync.java:115)}}
In the initial failure instance, the root cause was identified as a *firewall 
policy blocking LDAP calls* from the Ranger container - DNS was resolving to EU 
GC IPs, but firewall rules only allowed NA GC IPs. When the LDAP phase returned 
0 groups and 0 users, the {{PolicyMgrUserGroupBuilder}} hit a null delta map 
path:
 
 {{java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "this.deltaGroupUsers" is null
  at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups(PolicyMgrUserGroupBuilder.java:372)}}
Production has been restored, but the {*}"Failed to addOrUpdate users to ranger 
admin" error recurs on each usersync restart{*}, causing group associations to 
be cleared for a large number of users.
h2. Customer Impact
 * *Severity:* High (production access impacted during incident; ongoing 
restart behavior clears group associations)

 * *Affected users:* Large portion of ~6,800 expected Gilead Starburst users 
lose group associations on restart

 * *Behavior:* Usersync should not be clearing existing group associations on 
restart; this is an unexpected regression

h2. Root Cause Analysis

The {{PolicyMgrUserGroupBuilder.addOrUpdateUsersGroups()}} method does not 
guard against a null {{deltaGroupUsers}} map when the LDAP source returns an 
empty result set (0 groups, 0 users). Instead of gracefully skipping the update 
cycle, it throws an NPE that propagates as a generic "Failed to addorUpdate 
users" error. The null delta path then causes existing user-group associations 
in Ranger to be dropped.


> Ranger UserSync NullPointerException when LDAP returns empty result set 
> causes group associations to be dropped on restart
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: RANGER-5620
>                 URL: https://issues.apache.org/jira/browse/RANGER-5620
>             Project: Ranger
>          Issue Type: Bug
>          Components: Ranger
>    Affects Versions: 2.5.0, 2.6.0, 2.7.0, 2.8.0
>            Reporter: Wojciech Gasior
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
