[ https://issues.apache.org/jira/browse/HIVE-28609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-28609:
----------------------------------
Labels: pull-request-available (was: )
> HiveSequenceFileInputFormat should be cloned or not be cached
> -------------------------------------------------------------
>
> Key: HIVE-28609
> URL: https://issues.apache.org/jira/browse/HIVE-28609
> Project: Hive
> Issue Type: Improvement
> Security Level: Public(Viewable by anyone)
> Reporter: László Bodor
> Priority: Major
> Labels: pull-request-available
>
> HIVE-28530 introduced a ThreadLocal for storing files in
> HiveSequenceFileInputFormat because there was contention when accessing
> the files in a shared/cached instance. I feel we fixed the problem in the
> wrong place: instead of preventing this instance from being cached, the
> patch introduced a ThreadLocal, which seems hacky and makes the code reader
> think that the input format instance must be cached, which is not the case.
> This format class is instantiated by
> [reflection|https://github.com/apache/hive/blob/18f34e75da0141d37d9a8f1cef4f7f64ba09fadb/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java#L229],
> and the resulting instance is quite often cached for performance reasons.
> We can still cache an instance and clone it (maybe by implementing some
> interface) to keep performance.
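
For illustration only, the "cache an instance and clone it" idea from the description could look roughly like the sketch below. None of these names exist in Hive: `CopyableFormat`, `SketchInputFormat`, and `FormatCache` are hypothetical stand-ins, and the real change may take a different shape. The sketch assumes the expensive part is the reflective instantiation, so it pays that cost once per class and hands every caller a fresh copy with its own mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical interface: a cached prototype that can hand out fresh copies. */
interface CopyableFormat<T> {
    T copy();
}

/** Stand-in for an input format with per-use mutable state (not the real Hive class). */
class SketchInputFormat implements CopyableFormat<SketchInputFormat> {
    private final List<String> files = new ArrayList<>();

    void addFile(String path) {
        files.add(path);
    }

    List<String> getFiles() {
        return files;
    }

    @Override
    public SketchInputFormat copy() {
        // A new instance with its own empty file list; nothing is shared
        // with the cached prototype, so no ThreadLocal is needed.
        return new SketchInputFormat();
    }
}

/** Hypothetical cache: reflection runs once per class, callers receive copies. */
class FormatCache {
    private static final Map<Class<?>, CopyableFormat<?>> PROTOTYPES =
        new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    static <T extends CopyableFormat<T>> T get(Class<T> clazz) {
        CopyableFormat<?> prototype = PROTOTYPES.computeIfAbsent(clazz, c -> {
            try {
                // Reflective instantiation happens only on the first request.
                return (CopyableFormat<?>) c.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + c.getName(), e);
            }
        });
        // The prototype itself never leaves the cache.
        return ((T) prototype).copy();
    }
}
```

With this shape, two callers that request the same format class get distinct objects, so file lists set by one caller are not visible to the other.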
--
This message was sent by Atlassian Jira
(v8.20.10#820010)