[ https://issues.apache.org/jira/browse/HIVE-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555164#comment-15555164 ]
Marta Kuczora commented on HIVE-14839:
--------------------------------------
The patch is attached, and I have created a review on Review Board.
> Improve the stability of TestSessionManagerMetrics
> --------------------------------------------------
>
> Key: HIVE-14839
> URL: https://issues.apache.org/jira/browse/HIVE-14839
> Project: Hive
> Issue Type: Bug
> Components: Test
> Affects Versions: 2.1.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Minor
> Attachments: HIVE-14839.patch
>
>
> The TestSessionManagerMetrics fails occasionally with the following error:
> {noformat}
> org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
> at org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics(TestSessionManagerMetrics.java:98)
> Failed tests:
> TestSessionManagerMetrics.testThreadPoolMetrics:98 expected:<[0]> but was:<[1]>
> {noformat}
> This test starts four background threads with a "wait" call in their run
> method. The threads use the common "barrier" object as a lock.
> The expected behaviour is that two threads will be in the async pool (because
> hive.server2.async.exec.threads is set to 2) and the other two threads
> will be waiting in the queue. This condition is checked like this:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE,
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE,
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 2);
> {noformat}
>
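To illustrate the expected 2/2 split, here is a minimal sketch that uses a plain java.util.concurrent.ThreadPoolExecutor as a stand-in for the HiveServer2 async pool. That substitution, the class name PoolQueueSketch and the "released" flag are assumptions made for the illustration; none of them come from the Hive test itself.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolQueueSketch {
    // Shared lock object, named after the "barrier" in the issue description.
    private static final Object barrier = new Object();
    // Hypothetical flag (guarded by barrier) so the sketch always terminates.
    private static boolean released = false;

    /** Returns {poolSize, queueSize} observed while all four tasks are blocked. */
    public static int[] observeBlockedPool() throws InterruptedException {
        synchronized (barrier) {
            released = false;
        }
        // Two worker threads and an unbounded queue, mirroring a pool limit of 2.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());

        Runnable blocked = () -> {
            synchronized (barrier) {
                while (!released) {
                    try {
                        barrier.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        };

        // The first two tasks each get a worker thread; the next two are queued.
        for (int i = 0; i < 4; i++) {
            pool.execute(blocked);
        }
        int[] observed = { pool.getPoolSize(), pool.getQueue().size() };

        // Release every task and shut the pool down cleanly.
        synchronized (barrier) {
            released = true;
            barrier.notifyAll();
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] observed = observeBlockedPool();
        System.out.println("pool=" + observed[0] + " queue=" + observed[1]);
    }
}
```

The sizes are read before anything is released, so no task can finish and pull a queued task into the pool in between.
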
> Then a notifyAll is called on the lock object, so the two threads in the pool
> should "wake up" and complete and the other two threads should go from the
> queue to the pool. This is checked like this in the test:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE,
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE,
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 0);
> {noformat}
>
> There are two scenarios which can cause this test to fail:
> # The notifyAll call happens before both threads in the pool are up and
> running and in the "wait" phase.
> In this case the thread which is not up in time will be stuck in the pool, so
> the other two threads cannot move from the queue to the pool.
> # After the notifyAll call, the threads in the pool "wake up" with some
> delay. So by the time the metrics are checked, they have not yet completed
> and been removed from the pool, and the other two threads have not yet moved
> from the queue to the pool. Therefore the check fails, since the queue is
> not empty.
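One possible way to guard against the two failure cases described above is sketched below. It is not necessarily what the attached patch does. It again uses a plain ThreadPoolExecutor in place of the HiveServer2 async pool, and the names StableBarrierSketch, waitFor, releaseAll, "arrived" and "generation" are hypothetical. For the first case, the notifying thread waits until both pool threads have announced themselves while holding the lock. For the second, the sizes are polled with a timeout instead of being checked once.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class StableBarrierSketch {

    /** Polls until the condition holds or the timeout expires. */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    /** Wakes every task that is currently waiting on the barrier. */
    static void releaseAll(Object barrier, int[] generation) {
        synchronized (barrier) {
            generation[0]++;
            barrier.notifyAll();
        }
    }

    /** Returns {poolBefore, queueBefore, poolAfter, queueAfter}. */
    public static int[] run() throws InterruptedException {
        final Object barrier = new Object();
        final int[] generation = {0};                 // guarded by barrier
        final AtomicInteger arrived = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());

        Runnable task = () -> {
            synchronized (barrier) {
                int seen = generation[0];
                arrived.incrementAndGet();            // announced while holding the lock
                while (generation[0] == seen) {       // guarded wait
                    try {
                        barrier.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        };
        for (int i = 0; i < 4; i++) {
            pool.execute(task);
        }

        // Case 1: notify only after both pool threads have reached the barrier.
        if (!waitFor(() -> arrived.get() >= 2, 5000)) {
            throw new IllegalStateException("pool threads did not start in time");
        }
        int poolBefore = pool.getPoolSize();
        int queueBefore = pool.getQueue().size();
        releaseAll(barrier, generation);

        // Case 2: poll until the queued tasks have moved into the pool.
        if (!waitFor(() -> pool.getQueue().isEmpty() && arrived.get() >= 4, 5000)) {
            throw new IllegalStateException("queued tasks did not start in time");
        }
        int poolAfter = pool.getPoolSize();
        int queueAfter = pool.getQueue().size();

        // Release the second pair of tasks and shut down.
        releaseAll(barrier, generation);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] {poolBefore, queueBefore, poolAfter, queueAfter};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("before: pool=" + r[0] + " queue=" + r[1]
                + ", after: pool=" + r[2] + " queue=" + r[3]);
    }
}
```

Because each task increments "arrived" while it holds the lock, and only gives the lock up by calling wait(), the notifying thread cannot enter its own synchronized block until those tasks are actually waiting. In the real test the same polling idea would presumably wrap the verifyMetricsJson checks, but that depends on how the metrics JSON is refreshed, which the description does not cover.
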