> On Oct. 7, 2016, 2:26 p.m., Aihua Xu wrote: > > service/src/test/org/apache/hive/service/cli/session/TestSessionManagerMetrics.java, > > line 113 > > <https://reviews.apache.org/r/52631/diff/1/?file=1526447#file1526447line113> > > > > Seems you could run into the following case which will lead to a > > unwanted failure. > > > > Assume 3 or 4 backgroundOperation threads call ready.await() first, > > then the call in the main thread (line 112) will have 4 or 5 threads merge. > > Then the main threads get to line 124, seems it will timeout since no > > threads will call ready.await() any more? > > > > Just checked CyclicBarrier doc and not very sure how it should be used > > actually. Maybe I'm wrong.
As far as I understand, in this test only two tasks will run at the same time. The corePoolSize and maximumPoolSize of the ThreadPoolExecutor is the value of the hive.server2.async.exec.threads parameter. It is set to 2 in this test. So when the 4 tasks are submitted, only two of them will be assigned to a thread and get started. The other two will wait in a queue. The run method of the two tasks in the queue won't get called until they are staying in the queue, so I think the case when all 4 tasks call await on the same barrier won't happen. - Marta ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/52631/#review151799 ----------------------------------------------------------- On Oct. 7, 2016, 2 p.m., Marta Kuczora wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/52631/ > ----------------------------------------------------------- > > (Updated Oct. 7, 2016, 2 p.m.) > > > Review request for hive, Aihua Xu, Miklos Csanady, Peter Vary, Szehon Ho, and > Barna Zsombor Klara. > > > Bugs: HIVE-14839 > https://issues.apache.org/jira/browse/HIVE-14839 > > > Repository: hive-git > > > Description > ------- > > The TestSessionManagerMetrics fails occasionally with the following error: > org.junit.ComparisonFailure: expected:<[0]> but was:<[1]> > at > org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics(TestSessionManagerMetrics.java:98) > > Changed the synchronization between the tasks a bit to avoid these failures. > With this patch the test would work do the following steps: > - Submit four tasks. > - Wait with the metrics verification, until the first two tasks are running. > This is done by invoking await on the "ready" barrier. > If, for some reason, the tasks are not started within a timeout period, > make the test fail. > - Make the tasks wait until the metrics are checked. > This is done by invoking await on the "completed" barrier. > - Verify the metrics. > - Let the first two tasks complete by breaking the "complete" barrier. > When the first two tasks completed, the remaining two tasks will be removed > from the queue and started. > - Wait until the remaining tasks are running by using the "ready" barrier > again. > Do the metrics check only if they are started to avoid the failures when > the queue size was not 0. > If, for some reason, the tasks are not started within a timeout period, > make the test fail. > - Verify the metrics. > - Let the remaining tasks complete by breaking the "completed" barrier. > > > Diffs > ----- > > > service/src/test/org/apache/hive/service/cli/session/TestSessionManagerMetrics.java > 5511c54 > > Diff: https://reviews.apache.org/r/52631/diff/ > > > Testing > ------- > > The change effects only a unit test. > Ran the test many times locally. > Added random sleeps to the tasks to simulate the delay when they start or > complete and ran the test many times to see if the synch between the tasks is > working correctly. > > > Thanks, > > Marta Kuczora > >