[
https://issues.apache.org/jira/browse/HIVE-29465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-29465:
--------------------------------
Description: Currently, Hive cannot prevent excessive spilling into the
query results cache: when the results cache is enabled, the results dir is set
to the cache folder
([here|https://github.com/apache/hive/blob/399200af7cb11cf6ee3329ebdabe17792e5e7e85/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L7502-L7512]),
so a given query's tasks write directly to this path. The only "easy" way to
enforce a limit is to check the size automatically at runtime (in the tasks),
fail the query, and re-run it without the query results cache enabled.
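
The check, fail, and re-run flow described above could look roughly like the following Java sketch. It is not Hive code: the class ResultsCacheGuard, the exception CacheLimitExceededException, and the methods directorySize, checkLimit, and runWithFallback are all hypothetical names chosen for illustration, and a plain local directory stands in for the results dir. The real change would have to hook into Hive's task and driver code, which this issue does not specify.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical illustration only; none of these names exist in Hive.
class ResultsCacheGuard {

    /** Signals that a query wrote more into the cache-backed results dir than allowed. */
    static class CacheLimitExceededException extends RuntimeException {
        CacheLimitExceededException(long written, long max) {
            super("results dir holds " + written + " bytes, limit is " + max);
        }
    }

    /** Sums the sizes of all regular files under dir (a stand-in for the results dir). */
    static long directorySize(Path dir) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }).sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The runtime check a task would perform: fail once the limit is exceeded. */
    static void checkLimit(long bytesWritten, long maxBytes) {
        if (bytesWritten > maxBytes) {
            throw new CacheLimitExceededException(bytesWritten, maxBytes);
        }
    }

    /**
     * Runs the query with the results cache enabled first. If the size check
     * fails the query, runs it once more with the cache disabled.
     * The Boolean argument passed to the query means "cache enabled".
     */
    static <T> T runWithFallback(Function<Boolean, T> query) {
        try {
            return query.apply(true);
        } catch (CacheLimitExceededException e) {
            return query.apply(false);
        }
    }
}
```

In this sketch a task would call checkLimit(directorySize(resultsDir), maxBytes) while writing, and the caller would wrap the whole query in runWithFallback so that an oversized result is recomputed outside the cache rather than failing outright.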
> Prevent excessive query results cache usage at runtime
> ------------------------------------------------------
>
> Key: HIVE-29465
> URL: https://issues.apache.org/jira/browse/HIVE-29465
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Priority: Major
>
> Currently, Hive cannot prevent excessive spilling into the query results
> cache: when the results cache is enabled, the results dir is set to the
> cache folder
> ([here|https://github.com/apache/hive/blob/399200af7cb11cf6ee3329ebdabe17792e5e7e85/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L7502-L7512]),
> so a given query's tasks write directly to this path. The only "easy" way to
> enforce a limit is to check the size automatically at runtime (in the tasks),
> fail the query, and re-run it without the query results cache enabled.