Konstantin Bereznyakov created HIVE-29642:
---------------------------------------------
Summary: Race in PartitionManagementTask test-only counters causes
intermittent TestPartitionManagement failures
Key: HIVE-29642
URL: https://issues.apache.org/jira/browse/HIVE-29642
Project: Hive
Issue Type: Bug
Reporter: Konstantin Bereznyakov
[https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-6505/6/tests]
{code:java}
Testing / split-04 / PostProcess / testPartitionDiscoveryTransactionalTable –
org.apache.hadoop.hive.metastore.TestPartitionManagement
3s
Error
expected:<2> but was:<1>
Stacktrace
java.lang.AssertionError: expected:<2> but was:<1>
	at org.junit.Assert.fail(Assert.java:89)
{code}
A subsequent re-run of the same build passed.
TestPartitionManagement.testPartitionDiscoveryTransactionalTable asserts that
exactly 2 of 3 concurrent tasks are skipped, using test-only static counters in
PartitionManagementTask. The assertion is flaky for two reasons. First, the JVM
may schedule the 3 tasks so they do not actually overlap, in which case no
skips happen at all. Second, the counters themselves are racy: the
skippedAttempts = 0 reset inside the lock can clear the value while other
threads are running skippedAttempts++ outside the lock.
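The second failure mode can be reproduced deterministically in a simplified model. The sketch below is not the PartitionManagementTask code: the class name SkipCounterRace, the latches, and the lock are assumptions made for illustration. It only mirrors the pattern described above, a reset inside the lock and an increment outside it, and uses latches to force the bad interleaving.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified model of the counter pattern; not the Hive source. */
final class SkipCounterRace {
    private static final ReentrantLock LOCK = new ReentrantLock();
    // Test-only counter: reset while holding the lock, incremented without it.
    private static int skippedAttempts;

    /** Forces the problematic schedule and returns the final counter value. */
    static int runForcedInterleaving() {
        skippedAttempts = 0;
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch firstSkipDone = new CountDownLatch(1);
        CountDownLatch resetDone = new CountDownLatch(1);
        CountDownLatch secondSkipDone = new CountDownLatch(1);

        Thread winner = new Thread(() -> {
            LOCK.lock();
            try {
                lockHeld.countDown();
                await(firstSkipDone);   // one skip has already been counted
                skippedAttempts = 0;    // the reset inside the lock wipes it out
                resetDone.countDown();
                await(secondSkipDone);  // keep the lock so the second task also skips
            } finally {
                LOCK.unlock();
            }
        });
        Thread firstLoser = new Thread(() -> {
            await(lockHeld);
            attemptOrSkip();
            firstSkipDone.countDown();
        });
        Thread secondLoser = new Thread(() -> {
            await(resetDone);
            attemptOrSkip();
            secondSkipDone.countDown();
        });

        winner.start();
        firstLoser.start();
        secondLoser.start();
        join(winner);
        join(firstLoser);
        join(secondLoser);
        return skippedAttempts;
    }

    private static void attemptOrSkip() {
        if (LOCK.tryLock()) {
            LOCK.unlock();              // not reached in this forced schedule
        } else {
            skippedAttempts++;          // increment outside the lock
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two tasks skipped, but the counter reports only one.
        System.out.println("skippedAttempts = " + runForcedInterleaving());
    }
}
```

In this schedule two tasks are skipped but the counter ends at 1, which matches the "expected:<2> but was:<1>" failure above.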
{*}The proposed solution{*}: Remove the counters - they were added for testing
in HIVE-20707 and nothing in production reads them. Have the test verify the
same properties through a Log4j2 ListAppender that observes the existing skip
and discovery log messages.
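A rough illustration of the log-based check follows. To keep it runnable with the JDK alone, it swaps in java.util.logging with a custom Handler in place of the Log4j2 ListAppender, and the message texts, class name, and lock are hypothetical rather than the real Hive log lines. It counts observed messages instead of reading shared counters.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: verify task outcomes from captured log messages. */
final class LogCaptureSketch {
    // Assumed message texts; the actual Hive messages are not reproduced here.
    static final String SKIP_MSG = "skipping: lock is held by another task";
    static final String DISCOVERY_MSG = "running partition discovery";

    private static final Logger LOG = Logger.getLogger("LogCaptureSketch");
    private static final ReentrantLock LOCK = new ReentrantLock();

    /** Collects every published message, playing the role of a list appender. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new CopyOnWriteArrayList<>();

        @Override public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    private static void task() {
        if (LOCK.tryLock()) {
            try {
                LOG.info(DISCOVERY_MSG);
            } finally {
                LOCK.unlock();
            }
        } else {
            LOG.info(SKIP_MSG);
        }
    }

    /** Runs three tasks and returns {discoveries, skips} as seen in the log. */
    static long[] runThreeTasks() {
        CapturingHandler handler = new CapturingHandler();
        LOG.setUseParentHandlers(false);
        LOG.setLevel(Level.INFO);
        LOG.addHandler(handler);
        try {
            Thread[] threads = new Thread[3];
            for (int i = 0; i < threads.length; i++) {
                threads[i] = new Thread(LogCaptureSketch::task);
                threads[i].start();
            }
            for (Thread thread : threads) {
                try {
                    thread.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
            long discoveries = handler.messages.stream().filter(DISCOVERY_MSG::equals).count();
            long skips = handler.messages.stream().filter(SKIP_MSG::equals).count();
            return new long[] {discoveries, skips};
        } finally {
            LOG.removeHandler(handler);
        }
    }
}
```

In this sketch the exact number of skips still depends on whether the tasks overlap, so only scheduling-independent properties are safe to assert: every task logs exactly one of the two messages, and at least one task runs discovery.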