[jira] [Created] (RANGER-3015) Update presto patch to version 333 is not working
Georgi Ivanov created RANGER-3015:

             Summary: Update presto patch to version 333 is not working
                 Key: RANGER-3015
                 URL: https://issues.apache.org/jira/browse/RANGER-3015
             Project: Ranger
          Issue Type: Bug
          Components: admin
    Affects Versions: 2.1.0
            Reporter: Georgi Ivanov

There are 2 issues with the current patch, PatchForPrestoToSupportPresto333_J10038.java:
# The patch is not working as expected, i.e. it does not update the presto service definition schema in the database to the latest version.
# Although the patch does not work, it still returns successfully, so the Ranger patching subsystem thinks that it succeeded and updates the status to 'Y' in x_db_version_h.

This is a logical error, as the patch should return false and thus signal that there is a problem. Although an exception is logged in $RANGER_ADMIN_HOME/ews/logs/ranger_db_patch.log, it is generic: it just says that an error was thrown, and the stacktrace does not tell us what the cause of the error is.

I will explain the cause of issue 1 in the lines below.

Regarding issue 2, this looks like a systematic problem related to all Ranger Java patches. The Java patches all have a similar structure:
# patch
# if there is an error, throw a RuntimeException
# catch all exceptions (including the one above) and log an error message

{code:java}
    if(ret==null){
        logger.error("Error while updating "+SOME_SERVICE+"service-def");
        throw new RuntimeException("Error while updating "+SOME_SERVICE+"service-def");
    }
} catch(Exception e) {
    logger.error("Error while updating "+SOME_SERVICE+"service-def", e);
}
{code}
Since we are catching our own exception and just logging it, the Ranger patch subsystem thinks that the patch went through, updates the version table x_db_version_h, and marks the patch as applied REGARDLESS of whether it was applied or not.
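To make the effect of the catch-all concrete, here is a minimal, self-contained sketch. It is not Ranger code: PatchStatusSketch, runSwallowing, runPropagating and recordStatus are hypothetical names, and the runner is only assumed to record 'Y' when the patch completes without throwing. It contrasts the log-and-swallow pattern quoted above with a variant that logs and rethrows.

```java
import java.util.function.Supplier;

/** Hypothetical stand-ins for illustration; none of these names are Ranger classes. */
class PatchStatusSketch {

    /** Mimics the quoted pattern: the failure is logged and then swallowed. */
    static boolean runSwallowing(Supplier<Object> update) {
        try {
            Object ret = update.get();
            if (ret == null) {
                throw new RuntimeException("Error while updating service-def");
            }
        } catch (Exception e) {
            System.err.println("swallowed: " + e.getMessage());
        }
        return true; // the caller cannot tell that anything went wrong
    }

    /** Alternative: log the failure, then rethrow so the caller can see it. */
    static boolean runPropagating(Supplier<Object> update) {
        try {
            Object ret = update.get();
            if (ret == null) {
                throw new RuntimeException("Error while updating service-def");
            }
            return true;
        } catch (RuntimeException e) {
            System.err.println("rethrown: " + e.getMessage());
            throw e;
        }
    }

    /** Assumed runner behaviour: 'Y' only if the patch finishes without throwing. */
    static String recordStatus(Runnable patch) {
        try {
            patch.run();
            return "Y";
        } catch (RuntimeException e) {
            return "N";
        }
    }

    public static void main(String[] args) {
        Supplier<Object> failingUpdate = () -> null; // simulates a failed service-def update
        System.out.println("swallowing patch recorded as:  "
                + recordStatus(() -> runSwallowing(failingUpdate)));
        System.out.println("propagating patch recorded as: "
                + recordStatus(() -> runPropagating(failingUpdate)));
    }
}
```

With the same failing update, the swallowing variant is recorded as 'Y' and the propagating variant as 'N', which is the distinction the patch subsystem currently cannot make.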
A poorly written patch will pass just like a well-written one, and both will be recorded as 'Y' in the x_db_version_h table, which means the patch was applied. I can't comment on why this was decided to be so and why every patch contributor followed suit.

Regarding issue 1: Adding a simple
{code:java}
logger.error("Exception",e);
{code}
in the try/catch block shows that the error is thrown by the RangerServiceDefValidator class:
{code:java}
2020-09-25 21:04:49,620 [main] ERROR org.apache.ranger.patch.PatchForPrestoToSupportPresto333_J10038 (PatchForPrestoToSupportPresto333_J10038.java:105) - Exception
java.lang.Exception:
(0) Validation failure: error code[2007], reason[changing access type name[delete] in access types is not supported], field[access type name], subfield[null], type[semantically incorrect]
(1) Validation failure: error code[2007], reason[changing access type name[use] in access types is not supported], field[access type name], subfield[null], type[semantically incorrect]
(2) Validation failure: error code[2007], reason[changing access type name[alter] in access types is not supported], field[access type name], subfield[null], type[semantically incorrect]
(3) Validation failure: error code[2007], reason[changing access type name[grant] in access types is not supported], field[access type name], subfield[null], type[semantically incorrect]
    at org.apache.ranger.plugin.model.validation.RangerServiceDefValidator.validate(RangerServiceDefValidator.java:76)
    at org.apache.ranger.patch.PatchForPrestoToSupportPresto333_J10038.addPresto333Support(PatchForPrestoToSupportPresto333_J10038.java:148)
    at org.apache.ranger.patch.PatchForPrestoToSupportPresto333_J10038.execLoad(PatchForPrestoToSupportPresto333_J10038.java:103)
    at org.apache.ranger.patch.BaseLoader.load(BaseLoader.java:96)
    at org.apache.ranger.patch.BaseLoader$$FastClassBySpringCGLIB$$3c27c16d.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
    at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
    at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:283)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
    at org.apache.ranger.patch.PatchForPrestoToSupportPresto333_J10038$$EnhancerBySpringCGLIB$$ca65a291.load()
    at
[jira] [Updated] (RANGER-3014) fix for RANGER-2789 breaks current functionality
[ https://issues.apache.org/jira/browse/RANGER-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georgi Ivanov updated RANGER-3014:

    Description:

Since we upgraded to Ranger 2.1.0 in our dev env, we've noticed that the user list page in the Ranger Admin UI is not showing (or it is very slow, on the order of tens of minutes). Looking at the commit history, we found that the reason was commit *f45054d1b9*, which was meant as a performance improvement for RANGER-2789.

Our ranger usersync fetches users and groups from AD. Our tree is huge; here are some stats:
{code:java}
select count(*) from x_user;
43368
select count(*) from x_portal_user;
43366
select count(*) from x_group;
17865
select count(*) from x_group_users;
366180
{code}
Commit *f45054d1b9* was meant to perform the user lookup and fetch user info such as attributes and group membership in bulk, instead of doing it in a loop, one by one. To do that, it added a couple of methods and an override for searchXUsers in service/XUserService.java (before, we used the parent method in service/XUserServiceBase.java). The new searchXUsers method (which gets invoked when we call the /service/xusers/users REST API) calls populateViewBeans (another new method).

It calls the parent method populateViewBeans in XUserServiceBase.java, which builds a hashmap of users and calls an override of populateViewBeans with the hashmap as input:
{code:java}
+ public List populateViewBeans(List resources) {
+     List viewBeans = new ArrayList<>();
+     if (CollectionUtils.isNotEmpty(resources)) {
+         Map resourceViewBeanMap = new HashMap<>(resources.size());
+         Map viewBeanResourceMap = new HashMap<>(resources.size());
+         for (XXUser resource : resources) {
+             VXUser viewBean = createViewObject();
+             viewBean.setCredStoreId(resource.getCredStoreId());
+             viewBean.setDescription(resource.getDescription());
+             viewBean.setName(resource.getName());
+             viewBean.setStatus(resource.getStatus());
+             resourceViewBeanMap.put(resource, viewBean);
+             viewBeanResourceMap.put(viewBean, resource);
+             viewBeans.add(viewBean);
+         }
+         populateViewBeans(resourceViewBeanMap);
+         mapEntityToViewBeans(viewBeanResourceMap);
+     }
+     return viewBeans;
+ }
+
+ protected void populateViewBeans(Map resourceViewBeanMap) {
+     mapBaseAttributesToViewBeans(resourceViewBeanMap);
+ }
{code}
This in turn calls mapBaseAttributesToViewBeans, which calls daoManager.getXXPortalUser().findAllXPortalUser() and pulls all users (even though the REST call limits the users to 25 by default). That's one thing that hampers performance.
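The cost difference between loading every portal user and loading only the users on the requested page can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: PagedLookupSketch, PortalUserTable, findAll and findByLogins are hypothetical names, not Ranger DAO methods, and the row counter merely stands in for rows fetched from the database.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not Ranger code. */
class PagedLookupSketch {

    /** In-memory stand-in for a portal-user table keyed by login name. */
    static final class PortalUserTable {
        private final Map<String, String> emailByLogin = new HashMap<>();
        private int rowsRead = 0;

        void insert(String login, String email) {
            emailByLogin.put(login, email);
        }

        int getRowsRead() {
            return rowsRead;
        }

        void resetCounter() {
            rowsRead = 0;
        }

        /** Reads every row, regardless of how many the caller needs. */
        Map<String, String> findAll() {
            rowsRead += emailByLogin.size();
            return new HashMap<>(emailByLogin);
        }

        /** Reads only the rows for the given logins, like a WHERE ... IN (...) query. */
        Map<String, String> findByLogins(Collection<String> logins) {
            Map<String, String> result = new HashMap<>();
            for (String login : logins) {
                String email = emailByLogin.get(login);
                if (email != null) {
                    result.put(login, email);
                    rowsRead++;
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        PortalUserTable table = new PortalUserTable();
        for (int i = 0; i < 43366; i++) { // row count taken from the stats in the report
            table.insert("user" + i, "user" + i + "@example.com");
        }
        List<String> page = new ArrayList<>();
        for (int i = 0; i < 25; i++) { // default page size mentioned in the report
            page.add("user" + i);
        }

        table.findAll();
        System.out.println("full scan rows read:    " + table.getRowsRead());

        table.resetCounter();
        table.findByLogins(page);
        System.out.println("paged lookup rows read: " + table.getRowsRead());
    }
}
```

The point of the sketch is only the ratio: a lookup restricted to the page touches as many rows as the page holds, while the full scan touches the whole table on every request.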
However, the biggest issue is this:
{code:java}
+ @Override
+ public List populateViewBeans(List xUsers) {
+     List vObjList = super.populateViewBeans(xUsers);
+     if (CollectionUtils.isNotEmpty(vObjList) && CollectionUtils.isNotEmpty(xUsers) && xUsers.size() == vObjList.size()) {
+         Map xUserIdVObjMap = new HashMap<>(xUsers.size());
+         for (int i = 0; i < xUsers.size(); ++i) {
+             VXUser vObj = vObjList.get(i);
+             XXUser xUser = xUsers.get(i);
+             vObj.setIsVisible(xUser.getIsVisible());
+             xUserIdVObjMap.put(xUser.getId(), vObj);
+         }
+         populateGroupList(xUserIdVObjMap);
+     }
+     return vObjList;
+ }
{code}
We call populateGroupList on the list of users (25 by default), but we call a new method that accepts a map as input. Inside that method we call
{code:java}
List allXXGroupUsers = daoManager.getXXGroupUser().getAll();
{code}
which in our case pulls all 366180 group-to-user membership mappings from the x_group_users table. Next we filter through the whole membership list just to find the group memberships of those users (but we traverse the whole group membership list):
{code:java}
if (MapUtils.isNotEmpty(xUserIdVObjMap) && CollectionUtils.isNotEmpty(allXXGroupUsers)) {
    Map> userIdXXGroupUserMap = new HashMap<>(xUserIdVObjMap.size());
    for (Map.Entry xUserIdVXUserEntry : xUserIdVObjMap.entrySet()) {
        Long xUserId = xUserIdVXUserEntry.getKey();
        List xxGroupUsers = allXXGroupUsers
            .stream()
            .filter(xXGroupUser ->
[jira] [Created] (RANGER-3014) fix for RANGER-2789 breaks current functionality
Georgi Ivanov created RANGER-3014:

             Summary: fix for RANGER-2789 breaks current functionality
                 Key: RANGER-3014
                 URL: https://issues.apache.org/jira/browse/RANGER-3014
             Project: Ranger
          Issue Type: Bug
          Components: admin
            Reporter: Georgi Ivanov

Since we upgraded to Ranger 2.1.0 in our dev env, we've noticed that the user list page in the Ranger Admin UI is not showing (or it is very slow, on the order of tens of minutes). Looking at the commit history, we found that the reason was commit f45054d1b9, which was meant as a performance improvement for RANGER-2789.

Our ranger usersync fetches users and groups from AD. Our tree is huge; here are some stats:
{code:java}
select count(*) from x_user;
43368
select count(*) from x_portal_user;
43366
select count(*) from x_group;
17865
select count(*) from x_group_users;
366180
{code}
Commit f45054d1b9 was meant to perform the user lookup and fetch user info such as attributes and group membership in bulk, instead of doing it in a loop, one by one. To do that, it added a couple of methods and an override for searchXUsers in service/XUserService.java (before, we used the parent method in service/XUserServiceBase.java). The new searchXUsers method (which gets invoked when we call the /service/xusers/users REST API) calls populateViewBeans (another new method).

It calls the parent method populateViewBeans in XUserServiceBase.java, which builds a hashmap of users and calls an override of populateViewBeans with the hashmap as input:
{code:java}
+ public List populateViewBeans(List resources) {
+     List viewBeans = new ArrayList<>();
+     if (CollectionUtils.isNotEmpty(resources)) {
+         Map resourceViewBeanMap = new HashMap<>(resources.size());
+         Map viewBeanResourceMap = new HashMap<>(resources.size());
+         for (XXUser resource : resources) {
+             VXUser viewBean = createViewObject();
+             viewBean.setCredStoreId(resource.getCredStoreId());
+             viewBean.setDescription(resource.getDescription());
+             viewBean.setName(resource.getName());
+             viewBean.setStatus(resource.getStatus());
+             resourceViewBeanMap.put(resource, viewBean);
+             viewBeanResourceMap.put(viewBean, resource);
+             viewBeans.add(viewBean);
+         }
+         populateViewBeans(resourceViewBeanMap);
+         mapEntityToViewBeans(viewBeanResourceMap);
+     }
+     return viewBeans;
+ }
+
+ protected void populateViewBeans(Map resourceViewBeanMap) {
+     mapBaseAttributesToViewBeans(resourceViewBeanMap);
+ }
{code}
This in turn calls mapBaseAttributesToViewBeans, which calls daoManager.getXXPortalUser().findAllXPortalUser() and pulls all users (even though the REST call limits the users to 25 by default). That's one thing that hampers performance.

However, the biggest issue is this:
{code:java}
+ @Override
+ public List populateViewBeans(List xUsers) {
+     List vObjList = super.populateViewBeans(xUsers);
+     if (CollectionUtils.isNotEmpty(vObjList) && CollectionUtils.isNotEmpty(xUsers) && xUsers.size() == vObjList.size()) {
+         Map xUserIdVObjMap = new HashMap<>(xUsers.size());
+         for (int i = 0; i < xUsers.size(); ++i) {
+             VXUser vObj = vObjList.get(i);
+             XXUser xUser = xUsers.get(i);
+             vObj.setIsVisible(xUser.getIsVisible());
+             xUserIdVObjMap.put(xUser.getId(), vObj);
+         }
+         populateGroupList(xUserIdVObjMap);
+     }
+     return vObjList;
+ }
{code}
We call populateGroupList on the list of users (25 by default), but we call a new method that accepts a map as input. Inside that method we call
{code:java}
List allXXGroupUsers = daoManager.getXXGroupUser().getAll();
{code}
which in our case pulls all 366180 group-to-user membership mappings from the x_group_users table. Next we filter through the whole membership list just to find the group memberships of those users (but we traverse the whole group membership list):
{code:java}
if (MapUtils.isNotEmpty(xUserIdVObjMap) && CollectionUtils.isNotEmpty(allXXGroupUsers)) {
    Map> userIdXXGroupUserMap = new HashMap<>(xUserIdVObjMap.size());
    for (Map.Entry xUserIdVXUserEntry : xUserIdVObjMap.entrySet()) {
        Long xUserId =
[jira] [Commented] (RANGER-2993) Syncing AD/LDAP groups with special characters causing Usersync to get stuck
[ https://issues.apache.org/jira/browse/RANGER-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195355#comment-17195355 ]

Georgi Ivanov commented on RANGER-2993:

Hi, I was able to bypass the special characters by crafting an LDAP filter that excludes the group in question. With that filter, UserSync no longer returned the error; however, the overall behaviour remained: usersync finished the initial run successfully but never did a delta run, i.e. it "got stuck" in the sleep and never recovered.

> Syncing AD/LDAP groups with special characters causing Usersync to get stuck
>
>
> Key: RANGER-2993
> URL: https://issues.apache.org/jira/browse/RANGER-2993
> Project: Ranger
> Issue Type: Bug
> Components: usersync
>Affects Versions: 2.0.0
>Reporter: Georgi Ivanov
>Priority: Major
>
> We are running Ranger on kubernetes. The usersync component runs in a
> separate pod as a standalone component. During the initial sync, it throws an
> error about a AD/LDAP group that contains a special character.
>
> {code:java}
> 10 Sep 2020 03:24:01 ERROR LdapDeltaUserGroupBuilder [UnixUserSyncThread] -
> sink.addOrUpdateGroup failed with exception: Failed to add addorUpdate group
> user info, for group: s-TFxLabRun%, users: [...]
> 10 Sep 2020 03:24:01 ERROR LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread]
> - Failed to add addorUpdate group user info{code}
> And after that the sync does not continue to the next cycle.
> After 3-4 hours after this error (no logs from the LdapDeltaUserGroupBuilder > or LdapPolicyMgrUserGroupBuilder during that time) we see this log entry > {code:java} > 10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - End: initial > load of user/group from source==>sink > 10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - Done > initializing user/group source and sink{code} > > And no more logs after that > A strace on the process shows it's stuck in a sleep > {code:java} > # jps > 226 UnixAuthenticationService > 2242 Jps > # strace -p 226 > strace: Process 226 attached > futex(0x7f637e9149d0, FUTEX_WAIT, 227, NULL {code} > > Jstack also shows the what the UnixUserSyncThread is in waiting state > (sleeping). There are some locked threads but I don't think they are related > to the bug. > > {noformat} > # jstack 226 > 2020-09-11 11:22:37 > Full thread dump OpenJDK 64-Bit Server VM (25.232-b09 mixed mode):"Attach > Listener" #1657 daemon prio=9 os_prio=0 tid=0x7f6350001000 nid=0x798 > waiting on condition [0x] >java.lang.Thread.State: > RUNNABLE"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" > #12 daemon prio=5 os_prio=0 tid=0x7f633c2ac800 nid=0xf0 in Object.wait() > [0x7f636626a000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) > - locked <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) > at > org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3806) > at java.lang.Thread.run(Thread.java:748)"UnixUserSyncThread" #8 prio=5 > os_prio=0 tid=0x7f6378321800 nid=0xec waiting on condition > [0x7f63663a4000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > 
org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:79) > at java.lang.Thread.run(Thread.java:748)"Service Thread" #7 daemon > prio=9 os_prio=0 tid=0x7f63780b4800 nid=0xea runnable [0x] >java.lang.Thread.State: RUNNABLE"C1 CompilerThread1" #6 daemon prio=9 > os_prio=0 tid=0x7f63780b1000 nid=0xe9 waiting on condition > [0x] >java.lang.Thread.State: RUNNABLE"C2 CompilerThread0" #5 daemon prio=9 > os_prio=0 tid=0x7f63780af000 nid=0xe8 waiting on condition > [0x] >java.lang.Thread.State: RUNNABLE"Signal Dispatcher" #4 daemon prio=9 > os_prio=0 tid=0x7f63780ad800 nid=0xe7 runnable [0x] >java.lang.Thread.State: RUNNABLE"Finalizer" #3 daemon prio=8 os_prio=0 > tid=0x7f637807c000 nid=0xe6 in Object.wait() [0x7f6366e9] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) > - locked <0xd5588a40> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) > at >
[jira] [Created] (RANGER-2994) ranger release 2.1.0 cannot compile due to broken Kylin plugin dependency
Georgi Ivanov created RANGER-2994:

             Summary: ranger release 2.1.0 cannot compile due to broken Kylin plugin dependency
                 Key: RANGER-2994
                 URL: https://issues.apache.org/jira/browse/RANGER-2994
             Project: Ranger
          Issue Type: Bug
          Components: plugins
            Reporter: Georgi Ivanov

The Ranger 2.1.0 release cannot compile due to a broken kylin maven dependency.

The kylin version is set to 2.6.4 in the main project pom.xml. plugin-kylin/pom.xml and ranger-kylin-plugin-shim/pom.xml both reference kylin-server-base, which depends on kylin-datasource-sdk, which in turn depends on calcite-linq4j-1.16.0-kylin-r2.jar, which is not available.

Running

mvn org.apache.maven.plugins:maven-dependency-plugin:3.1.2:copy -Dartifact=org.apache.calcite:calcite-linq4j:jar:1.16.0-kylin-r2 -DoutputDirectory=./

fails, and [https://mvnrepository.com/artifact/org.apache.kylin/kylin-datasource-sdk/2.6.4] shows the broken dependency.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (RANGER-2993) Syncing AD/LDAP groups with special characters causing Usersync to get stuck
Georgi Ivanov created RANGER-2993:

             Summary: Syncing AD/LDAP groups with special characters causing Usersync to get stuck
                 Key: RANGER-2993
                 URL: https://issues.apache.org/jira/browse/RANGER-2993
             Project: Ranger
          Issue Type: Bug
          Components: usersync
    Affects Versions: 2.0.0
            Reporter: Georgi Ivanov

We are running Ranger on kubernetes. The usersync component runs in a separate pod as a standalone component. During the initial sync, it throws an error about an AD/LDAP group that contains a special character.
{code:java}
10 Sep 2020 03:24:01 ERROR LdapDeltaUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateGroup failed with exception: Failed to add addorUpdate group user info, for group: s-TFxLabRun%, users: [...]
10 Sep 2020 03:24:01 ERROR LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add addorUpdate group user info{code}
After that, the sync does not continue to the next cycle.

3-4 hours after this error (with no logs from the LdapDeltaUserGroupBuilder or LdapPolicyMgrUserGroupBuilder during that time) we see this log entry:
{code:java}
10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - End: initial load of user/group from source==>sink
10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - Done initializing user/group source and sink{code}
There are no more logs after that.

An strace on the process shows it's stuck in a sleep:
{code:java}
# jps
226 UnixAuthenticationService
2242 Jps
# strace -p 226
strace: Process 226 attached
futex(0x7f637e9149d0, FUTEX_WAIT, 227, NULL
{code}
Jstack also shows that the UnixUserSyncThread is in a waiting state (sleeping). There are some locked threads, but I don't think they are related to the bug.
{noformat} # jstack 226 2020-09-11 11:22:37 Full thread dump OpenJDK 64-Bit Server VM (25.232-b09 mixed mode):"Attach Listener" #1657 daemon prio=9 os_prio=0 tid=0x7f6350001000 nid=0x798 waiting on condition [0x] java.lang.Thread.State: RUNNABLE"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #12 daemon prio=5 os_prio=0 tid=0x7f633c2ac800 nid=0xf0 in Object.wait() [0x7f636626a000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3806) at java.lang.Thread.run(Thread.java:748)"UnixUserSyncThread" #8 prio=5 os_prio=0 tid=0x7f6378321800 nid=0xec waiting on condition [0x7f63663a4000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:79) at java.lang.Thread.run(Thread.java:748)"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x7f63780b4800 nid=0xea runnable [0x] java.lang.Thread.State: RUNNABLE"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x7f63780b1000 nid=0xe9 waiting on condition [0x] java.lang.Thread.State: RUNNABLE"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x7f63780af000 nid=0xe8 waiting on condition [0x] java.lang.Thread.State: RUNNABLE"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x7f63780ad800 nid=0xe7 runnable [0x] java.lang.Thread.State: RUNNABLE"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x7f637807c000 nid=0xe6 in Object.wait() [0x7f6366e9] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0xd5588a40> (a 
java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f6378079800 nid=0xe5 in Object.wait() [0x7f6366f91000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked <0xd5588bf8> (a java.lang.ref.Reference$Lock) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)"main" #1 prio=5 os_prio=0 tid=0x7f637800c000 nid=0xe3 runnable [0x7f637e913000] java.lang.Thread.State: RUNNABLE at