[ https://issues.apache.org/jira/browse/HIVE-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Sun resolved HIVE-15659.
-----------------------------
    Resolution: Invalid

> StackOverflowError when ClassLoader.loadClass for Spark
> -------------------------------------------------------
>
>                 Key: HIVE-15659
>                 URL: https://issues.apache.org/jira/browse/HIVE-15659
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 2.2.0
>            Reporter: Chao Sun
>
> Sometimes a query needs to process a large number of input files, which could 
> cause the following error:
> {code}
> 17/01/15 09:31:52 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, hadoopworker1344-sjc1.prod.uber.internal): java.lang.StackOverflowError
>         at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1535)
>         at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:463)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> {code}
> The cause, I think, is that for each input file we may need to add 
> additional jars to the class loader of the current thread, so the number of 
> class loaders grows with the number of input files. Each time jars are added, 
> a new class loader is created with the old one as its parent. Because 
> ClassLoader.loadClass delegates to the parent first, a long enough chain of 
> loaders overflows the stack, which matches the repeated loadClass frames above.
> See [Utilities#getBaseWork|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L388] for more details.
> One possible solution is to detect duplicate jar paths before creating the 
> new class loader.
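
A minimal sketch of the proposed fix, for illustration only. This is not the Hive code: `JarLoaderSketch`, `addJars`, `toUrl`, `depth` and the `/tmp/udf-*.jar` paths are hypothetical names, and the sketch assumes the jars are served by `URLClassLoader` instances somewhere in the parent chain. It wraps the current loader only when at least one jar path is not already visible, so the chain depth stops growing with the number of input files.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of Hive.
final class JarLoaderSketch {

    // Converts a local jar path to a URL, turning the checked exception into an unchecked one.
    static URL toUrl(String path) {
        try {
            return new File(path).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad jar path: " + path, e);
        }
    }

    // Returns a loader that can see every jar in jarPaths.
    // A new loader is created only if some path is not already served by the chain.
    static ClassLoader addJars(ClassLoader current, List<String> jarPaths) {
        // Collect every URL already served by a URLClassLoader in the parent chain.
        // URLs are compared as strings to avoid URL.equals, which may resolve host names.
        Set<String> known = new HashSet<>();
        for (ClassLoader cl = current; cl != null; cl = cl.getParent()) {
            if (cl instanceof URLClassLoader) {
                for (URL u : ((URLClassLoader) cl).getURLs()) {
                    known.add(u.toExternalForm());
                }
            }
        }
        // Keep only the jars that are genuinely new (also drops duplicates within jarPaths).
        List<URL> missing = new ArrayList<>();
        for (String path : jarPaths) {
            URL u = toUrl(path);
            if (known.add(u.toExternalForm())) {
                missing.add(u);
            }
        }
        if (missing.isEmpty()) {
            return current; // nothing new: reuse the loader instead of deepening the chain
        }
        return new URLClassLoader(missing.toArray(new URL[0]), current);
    }

    // Number of loaders from cl up to (but not including) the bootstrap loader.
    static int depth(ClassLoader cl) {
        int n = 0;
        for (; cl != null; cl = cl.getParent()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList("/tmp/udf-a.jar", "/tmp/udf-b.jar");
        ClassLoader base = new URLClassLoader(new URL[0], null);
        ClassLoader naive = base;
        ClassLoader deduped = base;
        // Simulate 1000 input files that all ask for the same two jars.
        for (int file = 0; file < 1000; file++) {
            naive = new URLClassLoader(
                new URL[] {toUrl(jars.get(0)), toUrl(jars.get(1))}, naive);
            deduped = addJars(deduped, jars);
        }
        System.out.println("naive chain depth:   " + depth(naive));
        System.out.println("deduped chain depth: " + depth(deduped));
    }
}
```

The naive loop adds one loader per input file, while the de-duplicating version adds a loader only the first time a jar path is seen. The jar files do not need to exist for the sketch to run, since `URLClassLoader` opens them lazily.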



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
