[jira] [Commented] (RANGER-2993) Syncing AD/LDAP groups with special characters causing Usersync to get stuck

2020-09-14 Thread Georgi Ivanov (Jira)


[ 
https://issues.apache.org/jira/browse/RANGER-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195355#comment-17195355
 ] 

Georgi Ivanov commented on RANGER-2993:
---

Hi, I was able to work around the special characters by crafting an LDAP
filter that excludes the group in question. UserSync no longer returned the
error; however, the overall behaviour stayed the same, i.e. usersync finished
the initial run successfully but never did a delta run: it "got stuck" in the
sleep and never recovered.
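
For readers who want to try the same workaround, here is a minimal sketch of how such an exclusion filter could be built. The helper names (`escape_filter_value`, `exclude_groups_filter`), the base filter `(objectClass=group)`, and the `cn` attribute are assumptions for illustration only; the actual filter used in this comment was not posted. The escaping follows RFC 4515, under which `%` is not a reserved filter character and passes through unchanged.

```python
from typing import List


def escape_filter_value(value: str) -> str:
    """Escape a value for use inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def exclude_groups_filter(base_filter: str, excluded: List[str], attr: str = "cn") -> str:
    """AND the base filter with one negated equality test per excluded group."""
    if not excluded:
        return base_filter
    exclusions = "".join(
        f"(!({attr}={escape_filter_value(name)}))" for name in excluded
    )
    return f"(&{base_filter}{exclusions})"


# Hypothetical usage: skip the group named in the reported error.
print(exclude_groups_filter("(objectClass=group)", ["s-TFxLabRun%"]))
```

The resulting string could then be supplied wherever the deployment configures its group search filter. As the comment notes, excluding the group removed the error but did not make the delta sync resume.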

> Syncing AD/LDAP groups with special characters causing Usersync to get stuck
> 
>
> Key: RANGER-2993
> URL: https://issues.apache.org/jira/browse/RANGER-2993
> Project: Ranger
> Issue Type: Bug
> Components: usersync
> Affects Versions: 2.0.0
> Reporter: Georgi Ivanov
> Priority: Major
>
> We are running Ranger on Kubernetes. The usersync component runs in a 
> separate pod as a standalone component. During the initial sync, it throws an 
> error about an AD/LDAP group that contains a special character.
>  
> {code:java}
> 10 Sep 2020 03:24:01 ERROR LdapDeltaUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateGroup failed with exception: Failed to add addorUpdate group user info, for group: s-TFxLabRun%, users: [...]
> 10 Sep 2020 03:24:01 ERROR LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add addorUpdate group user info
> {code}
> After that, the sync does not continue to the next cycle. 
> About 3-4 hours after this error (with no logs from LdapDeltaUserGroupBuilder 
> or LdapPolicyMgrUserGroupBuilder during that time), we see this log entry:
> {code:java}
> 10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - End: initial load of user/group from source==>sink
> 10 Sep 2020 06:38:27 INFO UserGroupSync [UnixUserSyncThread] - Done initializing user/group source and sink
> {code}
>  
> There are no more logs after that.
> An strace of the process shows it is stuck in a sleep:
> {code:java}
> # jps
> 226 UnixAuthenticationService
> 2242 Jps
> # strace -p 226
> strace: Process 226 attached
> futex(0x7f637e9149d0, FUTEX_WAIT, 227, NULL
> {code}
>  
> Jstack also shows that the UnixUserSyncThread is in a waiting state 
> (sleeping). There are some locked threads, but I don't think they are related 
> to the bug.
>  
> {noformat}
> # jstack 226
> 2020-09-11 11:22:37
> Full thread dump OpenJDK 64-Bit Server VM (25.232-b09 mixed mode):
>
> "Attach Listener" #1657 daemon prio=9 os_prio=0 tid=0x7f6350001000 nid=0x798 waiting on condition [0x]
>    java.lang.Thread.State: RUNNABLE
>
> "org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #12 daemon prio=5 os_prio=0 tid=0x7f633c2ac800 nid=0xf0 in Object.wait() [0x7f636626a000]
>    java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock)
>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
>   - locked <0xd56d46e0> (a java.lang.ref.ReferenceQueue$Lock)
>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
>   at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3806)
>   at java.lang.Thread.run(Thread.java:748)
>
> "UnixUserSyncThread" #8 prio=5 os_prio=0 tid=0x7f6378321800 nid=0xec waiting on condition [0x7f63663a4000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:79)
>   at java.lang.Thread.run(Thread.java:748)
>
> "Service Thread" #7 daemon prio=9 os_prio=0 tid=0x7f63780b4800 nid=0xea runnable [0x]
>    java.lang.Thread.State: RUNNABLE
>
> "C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x7f63780b1000 nid=0xe9 waiting on condition [0x]
>    java.lang.Thread.State: RUNNABLE
>
> "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x7f63780af000 nid=0xe8 waiting on condition [0x]
>    java.lang.Thread.State: RUNNABLE
>
> "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x7f63780ad800 nid=0xe7 runnable [0x]
>    java.lang.Thread.State: RUNNABLE
>
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x7f637807c000 nid=0xe6 in Object.wait() [0x7f6366e9]
>    java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
>   - locked <0xd5588a40> (a java.lang.ref.ReferenceQueue$Lock)
>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
>   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
>
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f6378079800 nid=0xe5 in Object.wait() [0x7f6366f91000]
>    java.lang.Thread.State:
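
When a thread dump like the one quoted above is long, it can help to reduce it to a name-to-state table before looking for the stuck thread. The following is only a sketch: the function names `thread_states` and `sleeping_threads` are made up for illustration, the parser handles just the two line shapes seen in this dump (a quoted thread name, then a `java.lang.Thread.State:` line), and the sample is a short excerpt of the dump rather than the full output.

```python
import re
from typing import Dict, List

# A thread header starts with the quoted thread name.
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# The state line; a truncated line with no state after the colon is skipped.
STATE = re.compile(r"java\.lang\.Thread\.State:\s*(?P<state>\S.*)")


def thread_states(dump: str) -> Dict[str, str]:
    """Map each thread name in a jstack dump to its reported state."""
    states: Dict[str, str] = {}
    current = None
    for raw in dump.splitlines():
        # Drop email quote markers ("> ") and indentation before matching.
        line = raw.lstrip("> \t").rstrip()
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        state = STATE.search(line)
        if state and current is not None:
            states[current] = state.group("state").strip()
            current = None
    return states


def sleeping_threads(dump: str) -> List[str]:
    """Names of threads reported as TIMED_WAITING, in dump order."""
    return [n for n, s in thread_states(dump).items() if s.startswith("TIMED_WAITING")]


sample = """
> "UnixUserSyncThread" #8 prio=5 os_prio=0 tid=0x7f6378321800 nid=0xec waiting on condition [0x7f63663a4000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:79)
>
> "Service Thread" #7 daemon prio=9 os_prio=0 tid=0x7f63780b4800 nid=0xea runnable [0x]
>    java.lang.Thread.State: RUNNABLE
"""

print(sleeping_threads(sample))
```

On this excerpt the only thread reported as sleeping is `UnixUserSyncThread`, which matches the reporter's reading of the dump: the sync thread is parked in `Thread.sleep` inside `UserGroupSync.run`.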

[jira] [Commented] (RANGER-2993) Syncing AD/LDAP groups with special characters causing Usersync to get stuck

2020-09-11 Thread Velmurugan Periasamy (Jira)


[ 
https://issues.apache.org/jira/browse/RANGER-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194292#comment-17194292
 ] 

Velmurugan Periasamy commented on RANGER-2993:
--

[~mollonado] - I don't think this is specific to Kubernetes. Did you see this 
issue in other environments too? CC [~spolavarapu]
