cxzl25 commented on PR #36301:
URL: https://github.com/apache/spark/pull/36301#issuecomment-1111105494

   > Maybe so, it uses non-logging APIs from CUL?
   
   Yes, some code in Hadoop does not go through the CUL API. For example, 
`FSNamesystem` in Hadoop modifies the appender.
   
   > I'm still not super clear why this helps - does this load the CUL class? or the shim from JCL? we want the latter.
   
   It comes down to how `commons-logging` implements 
`org.apache.commons.logging.LogFactory`: if a `LogFactory` has already been 
initialized for the class loader returned by `Thread.getContextClassLoader`, 
it will not be initialized again.
   
   `LogFactory` initialization will look up the `commons-logging.properties` 
file.
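
   To illustrate the "initialize once per context class loader" behavior described above, here is a minimal sketch using only the JDK. It is not the real `LogFactory` code: `PerLoaderCache`, `getFactory`, and `initCount` are hypothetical names, and the plain `Object` stands in for a factory whose creation would include the `commons-logging.properties` lookup.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for the caching idea; NOT the commons-logging source.
public class PerLoaderCache {
    // One cached "factory" per context class loader.
    private static final Map<ClassLoader, Object> FACTORIES = new WeakHashMap<>();

    // Counts how many times the (assumed expensive) initialization ran.
    public static int initCount = 0;

    public static synchronized Object getFactory() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        Object factory = FACTORIES.get(loader);
        if (factory == null) {
            // In the real library, this is roughly where the properties
            // file lookup would happen (assumption for illustration).
            initCount++;
            factory = new Object();
            FACTORIES.put(loader, factory);
        }
        return factory;
    }

    public static void main(String[] args) {
        Object first = getFactory();
        Object second = getFactory();
        // The second call reuses the cached instance for the same loader.
        System.out.println(first == second);
    }
}
```

   The point of the sketch is only that the lookup runs on the first call for a given context class loader and is skipped afterwards.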
   
   When two HDFS jars are added, `FsUrlStreamHandlerFactory` is registered, so 
the lookup for the `commons-logging.properties` file also goes to HDFS. 
Because the class that reads from HDFS has not been initialized yet, this 
triggers the NPE.
   
   
   Because the UDF implementation uses `jarClassLoader` as the thread's 
context class loader, we just need to make sure that `LogFactory` is initialized 
up front with `jarClassLoader` as the context class loader.
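
   One way to picture this is the usual "swap the context class loader, run the initialization, restore it" pattern. The sketch below is an assumption about the general shape, not the code in this PR: `ContextLoaderInit` and `withContextClassLoader` are hypothetical names, and the empty `URLClassLoader` stands in for `jarClassLoader`.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

// Hypothetical helper; illustrates the pattern only, not the PR's actual change.
public class ContextLoaderInit {

    // Runs `init` with `loader` installed as the thread's context class loader,
    // then restores the previous loader even if `init` throws.
    public static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> init) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return init.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Stand-in for jarClassLoader (assumption: no jars added here).
        ClassLoader jarClassLoader =
            new URLClassLoader(new URL[0], ContextLoaderInit.class.getClassLoader());

        // In a real setup the supplier would trigger LogFactory initialization,
        // e.g. by requesting a logger; here it only reports the active loader.
        ClassLoader seen = withContextClassLoader(
            jarClassLoader, () -> Thread.currentThread().getContextClassLoader());

        System.out.println(seen == jarClassLoader);
    }
}
```

   If the initialization runs inside such a block before any HDFS jar is added, later lookups under the same context class loader would reuse the already-initialized factory.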
   
   
https://github.com/apache/commons-logging/blob/475b8323e58111fdddd5cbfb5967f56ab08f531f/src/main/java/org/apache/commons/logging/LogFactory.java#L418-L453
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


