[
https://issues.apache.org/jira/browse/HIVE-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370824#comment-14370824
]
Chengxiang Li commented on HIVE-10006:
--------------------------------------
[~lirui], yes, we may get a memory leak if other HiveInputFormat
implementations are used, but just cleaning the ThreadLocal cache in
HiveInputFormat::init does not work, as there is other code in
CombineHiveInputFormat which may invoke Utilities.getMapWork as well. I added
an extra method in Utilities called getMapWorkWithoutCache in the latest patch,
so that call path does not cache MapWork in the ThreadLocal.
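To illustrate the difference between a cached and an uncached lookup, here is a minimal sketch. It is not the code from the patch: the class WorkCacheSketch and its methods loadWork, getWork, getWorkWithoutCache, cachedCount and clearWork are hypothetical stand-ins, and the cache layout (a per-thread map keyed by plan path) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the caching done in Utilities; all names and the
// cache layout are assumptions for illustration, not the actual patch.
public class WorkCacheSketch {
    // Per-thread cache: plan path -> loaded plan object.
    private static final ThreadLocal<Map<String, Object>> CACHE =
        ThreadLocal.withInitial(HashMap::new);

    // Stand-in for the expensive plan deserialization.
    static Object loadWork(String path) {
        return new Object();
    }

    // Cached lookup: the loaded plan stays referenced by the calling thread
    // until clearWork() is called on that same thread.
    static Object getWork(String path) {
        return CACHE.get().computeIfAbsent(path, WorkCacheSketch::loadWork);
    }

    // Uncached lookup: nothing is retained in the ThreadLocal.
    static Object getWorkWithoutCache(String path) {
        return loadWork(path);
    }

    // Number of plans currently cached for the calling thread.
    static int cachedCount() {
        return CACHE.get().size();
    }

    // Drops the calling thread's cache entry.
    static void clearWork() {
        CACHE.remove();
    }
}
```

With this shape, a caller that only needs the plan once can use the uncached variant and leave nothing behind on a long-lived thread.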
[~jxiang], they are not invoked in the same thread, so we need to clean the
ThreadLocal cache separately.
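A small, self-contained sketch of why one cleanup call is not enough: removing a ThreadLocal value only affects the thread that calls remove(). The class ThreadLocalCleanupSketch, the field CACHED_WORK and the value "map-work" are made up for this illustration and do not come from Hive.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; names are assumptions for illustration only.
public class ThreadLocalCleanupSketch {
    static final ThreadLocal<String> CACHED_WORK = new ThreadLocal<>();

    // A worker thread caches a value, the calling thread then calls remove(),
    // and the method returns what the worker still sees afterwards.
    static String workerValueAfterCallerClears() {
        CountDownLatch workerCached = new CountDownLatch(1);
        CountDownLatch callerCleared = new CountDownLatch(1);
        AtomicReference<String> seen = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            CACHED_WORK.set("map-work");
            workerCached.countDown();
            try {
                callerCleared.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            seen.set(CACHED_WORK.get());
            // The worker has to clean up its own entry.
            CACHED_WORK.remove();
        });
        worker.start();

        try {
            workerCached.await();
            // Clears only the calling thread's entry, not the worker's.
            CACHED_WORK.remove();
            callerCleared.countDown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

The worker still reads its own cached value after the caller's remove(), which is why each thread that populated the cache needs its own cleanup.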
Besides, is this ThreadLocal MapWork/ReduceWork cache a newly introduced
optimization? As Utilities::getMapWork/getReduceWork/setMapWork/setReduceWork
are widely used, it is very hard to identify which thread the related code
will be executed in and to make sure the ThreadLocal cache is cleaned up
afterwards, so it may be quite easy to cause a memory leak if the methods are
not used very carefully.
> RSC has memory leak while execute multi queries.[Spark Branch]
> --------------------------------------------------------------
>
> Key: HIVE-10006
> URL: https://issues.apache.org/jira/browse/HIVE-10006
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 1.1.0
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Priority: Critical
> Labels: Spark-M5
> Attachments: HIVE-10006.1-spark.patch, HIVE-10006.2-spark.patch,
> HIVE-10006.2-spark.patch, HIVE-10006.3-spark.patch, HIVE-10006.4-spark.patch,
> HIVE-10006.5-spark.patch
>
>
> While executing queries with RSC, the number of MapWork/ReduceWork instances
> keeps increasing, which eventually leads to OOM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)