Aman Raj created SOLR-17220:
-------------------------------

             Summary: SolrZkClient should be a daemon-thread
                 Key: SOLR-17220
                 URL: https://issues.apache.org/jira/browse/SOLR-17220
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Aman Raj


I am submitting a Spark SQL job through spark-submit, using Spark 3.3.1 and 
Kyuubi 1.8.0. I am using the open-source Spark engine with the Kyuubi Authz 
module running on top of the Spark Driver in client mode. The Spark job 
succeeds, but the Spark Driver does not stop. It keeps running, and I see that 
the PolicyRefresher keeps polling policies from Ranger.

[!https://private-user-images.githubusercontent.com/104416558/317133070-5bf7e9af-24d3-4ffb-8239-53eae0bd88fc.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE2NDM0MDQsIm5iZiI6MTcxMTY0MzEwNCwicGF0aCI6Ii8xMDQ0MTY1NTgvMzE3MTMzMDcwLTViZjdlOWFmLTI0ZDMtNGZmYi04MjM5LTUzZWFlMGJkODhmYy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwMzI4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDMyOFQxNjI1MDRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT00NjJjY2RhOGRhZjU1M2Q3YmM2Y2Q0MzRmYWEzOTlkMGU1ODkyNTkzYzMyN2ZlZTBlMGRiOTI4MmQzNmJmYWE1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.8mUQhNXUL9KOEoXWJhw3vI8eUaCZDFvjwgRzvqwlvEY!|https://private-user-images.githubusercontent.com/104416558/317133070-5bf7e9af-24d3-4ffb-8239-53eae0bd88fc.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE2NDM0MDQsIm5iZiI6MTcxMTY0MzEwNCwicGF0aCI6Ii8xMDQ0MTY1NTgvMzE3MTMzMDcwLTViZjdlOWFmLTI0ZDMtNGZmYi04MjM5LTUzZWFlMGJkODhmYy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwMzI4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDMyOFQxNjI1MDRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT00NjJjY2RhOGRhZjU1M2Q3YmM2Y2Q0MzRmYWEzOTlkMGU1ODkyNTkzYzMyN2ZlZTBlMGRiOTI4MmQzNmJmYWE1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.8mUQhNXUL9KOEoXWJhw3vI8eUaCZDFvjwgRzvqwlvEY]

As the logs show, PolicyRefresher is still running even after the Spark Context 
has stopped. This prevents the Spark Driver from exiting, so after some time I 
have to kill the job manually.

The issue is that the SolrZkClient used for logging to Solr currently starts a 
non-daemon thread inside the Spark Driver. While the Spark Driver was stuck, I 
collected a jstack dump of the non-daemon threads, as follows:


 
{{"zkConnectionManagerCallback-5-thread-1" #202 prio=5 os_prio=0 tid=0x00007f1cbc003000 nid=0x4ca7 waiting on condition [0x00007f1bc3a01000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007b218dad8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

"DestroyJavaVM" #453 prio=5 os_prio=0 tid=0x00007f1dfc019000 nid=0x4ab5 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 tid=0x00007f1dfc09a000 nid=0x4ac3 runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007f1dfc10e800 nid=0x4ad4 waiting on condition}}
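
For anyone trying to reproduce this without attaching jstack, the list of 
non-daemon threads can also be obtained from inside the JVM. The following is a 
minimal sketch using only the standard Thread API. The class and method names 
(NonDaemonThreadLister, nonDaemonThreadNames) are hypothetical and are not part 
of Solr, Spark or Kyuubi.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any of the projects mentioned above.
public class NonDaemonThreadLister {

    /** Returns the names of all live threads that are NOT daemon threads. */
    static List<String> nonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a map keyed by every live thread in the JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Any name printed here (other than the calling thread itself) is a
        // thread that can keep the JVM alive after main() returns.
        nonDaemonThreadNames().forEach(System.out::println);
    }
}
```

Calling such a helper right after the Spark Context stops should show the same 
pooled callback thread that appears in the dump above, assuming it is still 
parked waiting for work.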

 

So when the Spark Context stops, the SolrZkClient thread prevents the Spark 
Driver from exiting, since it is currently a non-daemon thread, as shown below:
!https://private-user-images.githubusercontent.com/104416558/317615386-a0c14b53-4725-46c7-b762-658a838606c8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE2NDM0MDUsIm5iZiI6MTcxMTY0MzEwNSwicGF0aCI6Ii8xMDQ0MTY1NTgvMzE3NjE1Mzg2LWEwYzE0YjUzLTQ3MjUtNDZjNy1iNzYyLTY1OGE4Mzg2MDZjOC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwMzI4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDMyOFQxNjI1MDVaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1kYTMzZWE1YzhmZmI2M2ViNzIxNTU1YTVjNWE1YzZhNWUzOWJjMDVmNGU0YWEzMWRiYzY2OWFhYWU0NTFhZjlmJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.vpKICuUHju3g8U48B4O_8iUDBgjvZaijOS0FoDRPEtY!
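
To illustrate the requested behaviour, here is a minimal sketch of an executor 
whose worker threads are marked as daemon, so that an idle pooled thread does 
not block JVM exit. This is not Solr's actual code: the class 
DaemonCallbackExecutor and the method daemonFactory are hypothetical, and the 
thread-name prefix is only borrowed from the jstack output for readability. How 
SolrZkClient really constructs its callback executor is not shown in this 
report.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not the Solr implementation.
public class DaemonCallbackExecutor {

    /** Builds a factory that names threads "<prefix>-thread-N" and marks them daemon. */
    static ThreadFactory daemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-thread-" + counter.getAndIncrement());
            t.setDaemon(true); // daemon threads do not prevent the JVM from exiting
            return t;
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService callbacks =
            Executors.newSingleThreadExecutor(daemonFactory("zkConnectionManagerCallback"));

        // Run one task and wait for it, so the worker thread has been created.
        callbacks.submit(() ->
            System.out.println("worker is daemon: " + Thread.currentThread().isDaemon())
        ).get();

        // Deliberately no shutdown() here: the idle worker stays parked in the
        // queue, as in the jstack above, but being a daemon it does not keep
        // the JVM alive once main() returns.
    }
}
```

With the default thread factory from java.util.concurrent.Executors, the same 
program would hang after main() returns until the executor is shut down, which 
matches the behaviour described in this report. Shutting the executor down 
explicitly when the client is no longer needed is the other common way to avoid 
the hang.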



--
This message was sent by Atlassian Jira
(v8.20.10#820010)