[
https://issues.apache.org/jira/browse/HIVE-28530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883028#comment-17883028
]
Xiaomin Zhang commented on HIVE-28530:
--------------------------------------
The issue seems related to the following Jira:
https://issues.apache.org/jira/browse/HIVE-21279
That Jira introduced a new HiveSequenceFileInputFormat with a volatile field,
fileStatuses, which FetchOperator references twice during its getNextSplits()
call, but only when the query result cache is disabled. Unfortunately, access
to this field is not thread-safe, because the HiveSequenceFileInputFormat
object itself is actually a shared object. As a result, several failure
scenarios are possible, such as:
1) One thread sets fileStatuses to null, then another thread overwrites it
with its own result files ==> a query gets the result of another query
2) One thread sets fileStatuses to its result files, then another thread
overwrites it with null ==> the query gets an empty result
> Fetched result from another query
> ---------------------------------
>
> Key: HIVE-28530
> URL: https://issues.apache.org/jira/browse/HIVE-28530
> Project: Hive
> Issue Type: Bug
> Security Level: Public(Viewable by anyone)
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: Xiaomin Zhang
> Priority: Major
>
> When running Hive load tests, we observed that Beeline can fetch a wrong
> query result that belongs to another query running at the same time. We ruled
> out a load-balancing issue, because it happened on a single HiveServer2. We found
> this issue only happens when *hive.query.results.cache.enabled is false.*
> All test queries are in the same format as below:
> {code:java}
> select concat('total record (test_recon_mock_$PID)=',count(*)) as
> count_record from t1t
> {code}
> We randomized the query by replacing $PID with the Beeline PID, and the
> test driver ran 10 Beeline instances concurrently. The table t1t is static
> and has a few rows, so the test driver can check whether the query result is
> equal to:
> total record (test_recon_mock_$PID)=2
> When the query result cache is disabled, queries randomly get a wrong
> result, and the problem can always be reproduced. For example, the two
> queries below were running in parallel:
> {code:java}
> queryId=hive_20240701103742_ff1adb2d-e9eb-448d-990e-00ab371e9db6): select
> concat('total record (test_recon_mock_21535)=',count(*)) as count_record from
> t1t
> queryId=hive_20240701103742_9bdfff92-89e1-4bcd-88ea-bf73ba5fd93d): select
> concat('total record (test_recon_mock_21566)=',count(*)) as count_record from
> t1t
> {code}
> The second query is supposed to get the following result:
> *total record (test_recon_mock_21566)=2*
> But Beeline actually got the following result:
> *total record (test_recon_mock_21535)=2*
> There is no error in the HS2 log.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)