cxzl25 opened a new pull request, #36301:
URL: https://github.com/apache/spark/pull/36301

   ### What changes were proposed in this pull request?
   Initialize `LogFactory` (from whichever of the `commons-logging` or `jcl-over-slf4j` modules is loaded) before setting `FsUrlStreamHandlerFactory`.
   
   ### Why are the changes needed?
   
   Spark's jars directory contains both `commons-logging` and `jcl-over-slf4j`. When the driver is started with `java -cp jars/*`, there is no guarantee which of the two jars the class `org.apache.commons.logging.LogFactory` is loaded from.
   
   If it is loaded from `jcl-over-slf4j`, there is no problem.
   
   If it is loaded from `commons-logging`, loading the HDFS jar fails.
   Once `FsUrlStreamHandlerFactory` has been set, the HDFS jar is read through the `BlockReaderFactory` of HDFS.
   `BlockReaderFactory` needs to load `LogFactory`, and when `LogFactory` is loaded for the first time it looks for the `commons-logging.properties` file.
   That lookup calls into `BlockReaderFactory` again. Because the log object of `BlockReaderFactory` has not been initialized yet, a `NullPointerException` is thrown, and loading the HDFS jar ultimately fails.
   
   
   ```
   java.lang.ExceptionInInitializerError
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:608)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:566)
   
   
   Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:754)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:384)
   
        at java.net.URLClassLoader$3.hasMoreElements(URLClassLoader.java:623)
        at sun.misc.CompoundEnumeration.next(CompoundEnumeration.java:45)
        at sun.misc.CompoundEnumeration.hasMoreElements(CompoundEnumeration.java:54)
        at org.apache.commons.logging.LogFactory.getConfigurationFile(LogFactory.java:1409)
        at org.apache.commons.logging.LogFactory.getFactory(LogFactory.java:455)
        at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
        at org.apache.hadoop.hdfs.BlockReaderFactory.<clinit>(BlockReaderFactory.java:78)
   ```
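
The re-entrant initialization described above can be reproduced in isolation. The following is only a minimal sketch, not the code touched by this PR: `FakeLogFactory`, `ReaderA`, `ReaderB`, and `resourceLookup` are hypothetical stand-ins for `LogFactory`, `BlockReaderFactory`, and the URL handler hook, and no Hadoop or commons-logging classes are used. It assumes the failure comes down to a static initializer that is re-entered on the same thread before its log field is assigned, and it shows why warming up the log factory before installing the hook avoids that.

```java
public class ReentrantInitDemo {
    /** Hypothetical hook: models resource lookups being routed through the reader. */
    static Runnable resourceLookup = null;

    /** Stand-in for a log factory that searches for a config resource on first use. */
    static final class FakeLogFactory {
        static boolean configured = false;

        static String getLog(String name) {
            if (!configured) {
                if (resourceLookup != null) {
                    resourceLookup.run(); // first use: look for a config file
                }
                configured = true;
            }
            return "log:" + name;
        }
    }

    /** Stand-in reader whose static initializer needs a logger. */
    static final class ReaderA {
        static final String LOG = FakeLogFactory.getLog("reader");

        static int read() {
            return LOG.length(); // NPE if LOG has not been assigned yet
        }
    }

    /** Identical reader, kept separate because a class is initialized only once per JVM. */
    static final class ReaderB {
        static final String LOG = FakeLogFactory.getLog("reader");

        static int read() {
            return LOG.length();
        }
    }

    /** Hook installed before the log factory was ever used: initialization re-enters the reader. */
    public static boolean loadWithoutWarmup() {
        FakeLogFactory.configured = false;
        resourceLookup = () -> ReaderA.read();
        try {
            ReaderA.read();
            return false;
        } catch (LinkageError e) {
            // First call: ExceptionInInitializerError caused by NullPointerException.
            // Later calls: NoClassDefFoundError, since ReaderA stays in a failed state.
            return true;
        }
    }

    /** Log factory used once before the hook is installed: no lookup happens during reader init. */
    public static int loadWithWarmup() {
        resourceLookup = null;
        FakeLogFactory.getLog("warmup"); // eager initialization, analogous to the proposed ordering
        resourceLookup = () -> ReaderB.read();
        return ReaderB.read();
    }

    public static void main(String[] args) {
        System.out.println("without warm-up failed: " + loadWithoutWarmup());
        System.out.println("with warm-up read: " + loadWithWarmup());
    }
}
```

In this sketch the only difference between the two paths is whether the log factory was used once before the hook was installed, which mirrors the ordering change proposed here.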
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Tested locally.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


