[
https://issues.apache.org/jira/browse/SPARK-22374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261444#comment-16261444
]
Steve Loughran commented on SPARK-22374:
----------------------------------------
We need to do something about this; it is dangerously recurrent. Even though
the fix is {{closeAllForUGI}}, we should be able to track it better and start
warning early and meaningfully. For example (a rough sketch follows the list):
# record the creation timestamp of each cached filesystem
# track the total cache size as an instrumented value
# start warning when the cache gets big
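
Something along these lines, as a sketch only: the {{CACHE}} and {{map}} field names are Hadoop internals that may differ across versions, and the warning threshold is an arbitrary placeholder.

{code:scala}
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.UserGroupInformation
import org.slf4j.LoggerFactory

// Sketch only: "CACHE" and "map" are Hadoop-internal field names and may
// change between releases; the threshold below is a placeholder to tune.
object FsCacheMonitor {
  private val log = LoggerFactory.getLogger(getClass)
  private val warnThreshold = 1000  // hypothetical; tune per deployment

  /** Size of the static FileSystem.CACHE, read via reflection. */
  def cacheSize(): Int = {
    val cacheField = classOf[FileSystem].getDeclaredField("CACHE")
    cacheField.setAccessible(true)
    val cache = cacheField.get(null)
    val mapField = cache.getClass.getDeclaredField("map")
    mapField.setAccessible(true)
    mapField.get(cache).asInstanceOf[java.util.Map[_, _]].size()
  }

  /** Warn once the cache looks like it is leaking per-UGI filesystems. */
  def checkAndWarn(): Unit = {
    val size = cacheSize()
    if (size > warnThreshold) {
      log.warn(s"FileSystem.CACHE holds $size entries; possible leak of " +
        "per-UGI filesystem instances (SPARK-22374)")
    }
  }

  /** The actual fix: drop the cached filesystems once a UGI is finished. */
  def cleanup(ugi: UserGroupInformation): Unit = {
    FileSystem.closeAllForUGI(ugi)
  }
}
{code}

Hooking something like {{checkAndWarn()}} into an existing metrics source would give the instrumented value above, while {{cleanup()}} is the {{closeAllForUGI}} path already mentioned.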
> STS ran into OOM in a secure cluster
> ------------------------------------
>
> Key: SPARK-22374
> URL: https://issues.apache.org/jira/browse/SPARK-22374
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Dongjoon Hyun
> Attachments: 1.png, 2.png, 3.png
>
>
> In a secure cluster, FileSystem.CACHE grows indefinitely.
> *ENVIRONMENT*
> 1. `spark.yarn.principal` and `spark.yarn.keytab` are used.
> 2. Spark Thrift Server runs with `doAs` set to false.
> {code}
> <property>
> <name>hive.server2.enable.doAs</name>
> <value>false</value>
> </property>
> {code}
> With a 6GB heap (`-Xmx6144m`), `HiveConf` instances consume over 4GB inside FileSystem.CACHE.
> {code}
> 20,030 instances of "org.apache.hadoop.hive.conf.HiveConf", loaded by
> "sun.misc.Launcher$AppClassLoader @ 0x64001c160" occupy 4,418,101,352
> (73.42%) bytes. These instances are referenced from one instance of
> "java.util.HashMap$Node[]", loaded by "<system class loader>"
> {code}
> Please see the attached images.