[ https://issues.apache.org/jira/browse/OOZIE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496634#comment-16496634 ]
Peter Bacsko edited comment on OOZIE-3271 at 5/31/18 2:32 PM: -------------------------------------------------------------- Not sure what the root cause here is exacly, but for some reason {{CoordActionInputCheckXCommand}} runs for a very long time. This time it was 2 minutes and it was holding the lock of the coordinator action. Therefore the rest of the {{XCommand}} calls did not achieve anything. {noformat} 07:14:51,788 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Execute command [coord_action_input] key [0000000-180529071448012-oozie-root-C] 07:14:51,788 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::ActionInputCheck:: Action is in WAITING state. 07:14:51,790 [CallableQueue-1] INFO CoordActionInputCheckXCommand:520 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::CoordActionInputCheck:: Missing deps: ${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)} 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseOne:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Data set [A] is unresolved set, will get resolved in phase two 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseOne:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Data set [B] is unresolved set, will get resolved in phase two 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorUtil:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Input logic expression for [(dependencyBuilder.input("A").build() && dependencyBuilder.input("B").build())] and evaluate result is [PHASE_TWO_EVALUATION] 07:14:51,793 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::ActionInputCheck:: Checking Latest/future ... (absolutely nothing is logged from this thread for 2 minutes) 07:16:57,191 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseTwo:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Resolved status of Data set A with min -1 and wait -1 = false 07:16:57,191 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorUtil:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Input logic expression for [(dependencyBuilder.input("A").build() && dependencyBuilder.input("B").build())] and evaluate result is [FALSE] ... 07:16:57,195 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Released lock for [0000000-180529071448012-oozie-root-C] in [coord_action_input] {noformat} was (Author: pbacsko): Not sure what the root cause here is exacly, but for some reason {{CoordActionInputCheckXCommand}} runs for a very long time. This time it was 2 minutes and it was holding the lock of the coordinator action. Therefore the rest of the {{XCommand}} calls did not achieve anything. {noformat} 07:14:51,788 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Execute command [coord_action_input] key [0000000-180529071448012-oozie-root-C] 07:14:51,788 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::ActionInputCheck:: Action is in WAITING state. 07:14:51,790 [CallableQueue-1] INFO CoordActionInputCheckXCommand:520 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::CoordActionInputCheck:: Missing deps: ${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)}#${coord:latestRange(-5,0)} 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseOne:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Data set [A] is unresolved set, will get resolved in phase two 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseOne:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Data set [B] is unresolved set, will get resolved in phase two 07:14:51,792 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorUtil:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Input logic expression for [(dependencyBuilder.input("A").build() && dependencyBuilder.input("B").build())] and evaluate result is [PHASE_TWO_EVALUATION] 07:14:51,793 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] [0000000-180529071448012-oozie-root-C@1]::ActionInputCheck:: Checking Latest/future ... 07:16:57,191 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorPhaseTwo:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Resolved status of Data set A with min -1 and wait -1 = false 07:16:57,191 [CallableQueue-1] DEBUG CoordInputLogicEvaluatorUtil:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Input logic expression for [(dependencyBuilder.input("A").build() && dependencyBuilder.input("B").build())] and evaluate result is [FALSE] ... 07:16:57,195 [CallableQueue-1] DEBUG CoordActionInputCheckXCommand:526 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180529071448012-oozie-root-C] ACTION[0000000-180529071448012-oozie-root-C@1] Released lock for [0000000-180529071448012-oozie-root-C] in [coord_action_input] {noformat} > Flaky tests in TestCoordInputLogicPush class > -------------------------------------------- > > Key: OOZIE-3271 > URL: https://issues.apache.org/jira/browse/OOZIE-3271 > Project: Oozie > Issue Type: Sub-task > Reporter: Peter Bacsko > Priority: Major > > Running locally, two tests in {{TestCoordInputLogicPush}} failed: > {noformat} > testLatestRange(org.apache.oozie.coord.input.logic.TestCoordInputLogicPush) > Time elapsed: 132.734 s <<< FAILURE! > junit.framework.AssertionFailedError: Action status should not be waiting > at junit.framework.Assert.fail(Assert.java:57) > at junit.framework.Assert.assertTrue(Assert.java:22) > at junit.framework.Assert.assertFalse(Assert.java:39) > at junit.framework.TestCase.assertFalse(TestCase.java:210) > at > org.apache.oozie.coord.input.logic.TestCoordInputLogicPush.startCoordAction(TestCoordInputLogicPush.java:570) > at > org.apache.oozie.coord.input.logic.TestCoordInputLogicPush.testLatestRange(TestCoordInputLogicPush.java:255) > {noformat} > {noformat} > testLatestRangeComplex(org.apache.oozie.coord.input.logic.TestCoordInputLogicPush) > Time elapsed: 159.055 s <<< FAILURE! > junit.framework.AssertionFailedError: Action status should not be waiting > at junit.framework.Assert.fail(Assert.java:57) > at junit.framework.Assert.assertTrue(Assert.java:22) > at junit.framework.Assert.assertFalse(Assert.java:39) > at junit.framework.TestCase.assertFalse(TestCase.java:210) > at > org.apache.oozie.coord.input.logic.TestCoordInputLogicPush.startCoordAction(TestCoordInputLogicPush.java:570) > at > org.apache.oozie.coord.input.logic.TestCoordInputLogicPush.testLatestRangeComplex(TestCoordInputLogicPush.java:330) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)