[
https://issues.apache.org/jira/browse/HIVE-27328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-27328:
--------------------------------
Description:
The cache introduced in HIVE-22825 is not invalidated in TezAMs, which can
eventually lead to query failures if the same table is used in a scenario like
the one below:
1. CREATE TABLE
2. INSERT OVERWRITE
3. SELECT
4. DROP TABLE
...
In this case, if 2) wrote a file like year=2011/base_0000001/bucket_00000_*1*
(task attempt = 1), and in the next iteration it wrote
year=2011/base_0000001/bucket_00000_*0* (task attempt = 0), then the acid
dirCache holds an invalid (stale) value until the duration configured by
*hive.txn.acid.dir.cache.duration* expires.
See !Screenshot 2026-02-09 at 13.50.43.png!
This cache is stored in memory. The HS2 side is taken care of by
HIVE-26060, but the TezAMs need a further improvement to achieve the same.
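The failure mode can be illustrated with a minimal Python sketch of a time-bounded directory cache. This is not Hive's actual dirCache code: the `DirCache` class, its `get`/`invalidate` methods, the in-memory `fs` dictionary, and the TTL value of 120 are all hypothetical and chosen only to mirror the scenario above.

```python
import time


class DirCache:
    """Hypothetical time-bounded cache: directory path -> file listing snapshot."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # path -> (expiry time, cached listing)

    def get(self, path, list_dir):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from cache, even if the directory changed
        listing = list_dir(path)
        self.entries[path] = (now + self.ttl, listing)
        return listing

    def invalidate(self, path):
        # What a DROP TABLE would need to trigger in every process holding the cache.
        self.entries.pop(path, None)


# Simulated filesystem (assumption: stands in for the real table directory).
base_dir = "year=2011/base_0000001"
fs = {base_dir: ["bucket_00000_1"]}  # first INSERT OVERWRITE, task attempt = 1

cache = DirCache(ttl_seconds=120)  # arbitrary TTL for illustration
first = cache.get(base_dir, lambda p: list(fs[p]))

# DROP TABLE, CREATE TABLE, INSERT OVERWRITE again: same path, task attempt = 0.
fs[base_dir] = ["bucket_00000_0"]

stale = cache.get(base_dir, lambda p: list(fs[p]))  # still the old file name

cache.invalidate(base_dir)
fresh = cache.get(base_dir, lambda p: list(fs[p]))  # re-listed after invalidation
```

In this sketch `stale` still names `bucket_00000_1`, a file that no longer exists, which corresponds to the invalid value described above; only after the explicit `invalidate` call does the lookup return the current listing.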
was:
The cache introduced in HIVE-22825 is not invalidated in TezAMs which can
eventually lead to query failures if the same table is used in a scenario like
below:
1. CREATE TABLE
2. INSERT OVERWRITE
3. SELECT
4. DROP TABLE
...
in this case if 2) wrote a file like year=2011/base_0000001/bucket_00000_*1*
(task attempt = 1), and in the next iteration it wrote
year=2011/base_0000001/bucket_00000_*0* (task attempt = 0), then acid dirCache
contains an invalid value within the configured time range
*hive.txn.acid.dir.cache.duration*
This cache is stored in memory, and the HS2-side is taken care of by
HIVE-26060, but for the TezAMs, we need further improvement to achieve the same.
> Acid dirCache is not invalidated in TezAMs while dropping table
> ---------------------------------------------------------------
>
> Key: HIVE-27328
> URL: https://issues.apache.org/jira/browse/HIVE-27328
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Attachments: Screenshot 2026-02-09 at 13.50.43.png