[
https://issues.apache.org/jira/browse/HIVE-27117?focusedWorklogId=849092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849092
]
ASF GitHub Bot logged work on HIVE-27117:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Mar/23 09:00
Start Date: 04/Mar/23 09:00
Worklog Time Spent: 10m
Work Description: abstractdog commented on code in PR #4096:
URL: https://github.com/apache/hive/pull/4096#discussion_r1125422392
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##########
@@ -3647,9 +3647,12 @@ long generateCompactionQueueId(Statement stmt) throws
SQLException, MetaExceptio
+ "no record found in next_compaction_queue_id");
}
long id = rs.getLong(1);
- s = "UPDATE \"NEXT_COMPACTION_QUEUE_ID\" SET \"NCQ_NEXT\" = " + (id + 1);
+ s = "UPDATE \"NEXT_COMPACTION_QUEUE_ID\" SET \"NCQ_NEXT\" = " + (id + 1)
+ " WHERE \"NCQ_NEXT\" = " + id;
LOG.debug("Going to execute update <{}>", s);
- stmt.executeUpdate(s);
+ if (stmt.executeUpdate(s) != 1) {
Review Comment:
this looks like a hack in production code while we're targeting test fixes
here, which is a bit concerning
this method is already very confusing: it looks like we cannot guarantee
synchronization regarding NCQ_NEXT (which is okay if we have more instances
running this code, right?), as we're selecting it, doing something here in java
code (increment), then trying to write it back...isn't there a way to use
AUTOINCREMENTed value across the supported RDBMS types?
I mean, this recursive call is not something I really like, we should find a
better synchronization pattern, I think a compaction queue id increment is not
something that happens 100 times per second, so even a RDBMS-level lock is fine
on this table I guess (I don't know this area though)
don't get me wrong, I'm more than happy to see stability fixes, but simply
retrying in production code reminds me of a good old pattern in hive code: "we
have no idea what's going on, so let's retry :) "
Issue Time Tracking
-------------------
Worklog Id: (was: 849092)
Time Spent: 1h (was: 50m)
> Fix compaction related flaky tests
> ----------------------------------
>
> Key: HIVE-27117
> URL: https://issues.apache.org/jira/browse/HIVE-27117
> Project: Hive
> Issue Type: Task
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The following tests turned out to be flaky recently:
> *
> org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDropTableAndCompactionConcurrent
> *
> org.apache.hadoop.hive.ql.txn.compactor.TestCompactionMetrics.testInitiatorFailuresCountedCorrectly
> *
> org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez.testMajorCompactionNotPartitionedWithoutBuckets
> *
> org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez.testCompactionWithCreateTableProps
--
This message was sent by Atlassian Jira
(v8.20.10#820010)