Murali Ramasami created FALCON-2134:
---------------------------------------

Summary: Fix testProcessInstanceKillKillNotRunning test case
Key: FALCON-2134
URL: https://issues.apache.org/jira/browse/FALCON-2134
Project: Falcon
Issue Type: Bug
Components: merlin
Affects Versions: trunk
Reporter: Murali Ramasami
Assignee: Murali Ramasami
Fix For: trunk

testProcessInstanceKillKillNotRunning failed because the waiting instance was killed.

Test Case Description:
{noformat}
* Schedule process. Provide data for all instances except the last, thus making it non-materialized (waiting).
* Try to -kill last 3 instances.
* Check that only running instances were affected.
{noformat}

Test case failure log:
In the log, we can see that "0000882-160722063010010-oozie-oozi-C@6" is in WAITING state, but when the kill is issued for the last 3 instances (in this case 2 running and 1 waiting), all 3 instances are killed, including the instance in WAITING state.
{noformat}
2016-07-22 10:18:10,030 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Try 20 of 50 (InstanceUtil:644)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@1 status is RUNNING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@2 status is RUNNING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@3 status is RUNNING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@4 status is RUNNING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@5 status is RUNNING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Coordinator Action 0000882-160722063010010-oozie-oozi-C@6 status is WAITING on oozie http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/ (InstanceUtil:654)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Request Url: http://nat-r6-anms-falcon-2-4.openstacklocal:15000/api/instance/kill/process/A73856b63-3afc1e07/?start=2010-01-02T00%3A14Z&end=2010-01-02T00%3A26Z&colo=*&user.name=hrt_qa (BaseRequest:175)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Request Method: POST (BaseRequest:176)
2016-07-22 10:18:10,051 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Request Header: Name=Cookie Value=hadoop.auth="u=hrt_qa&p=hrt...@example.com&t=kerberos&e=1469206397045&s=vmgc7OiuViE0So52XM+c/04XuOo=" (BaseRequest:179)
2016-07-22 10:18:10,658 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Response Status: HTTP/1.1 200 OK (BaseRequest:207)
2016-07-22 10:18:10,658 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Response Header: Name=Content-Type Value=application/json (BaseRequest:209)
2016-07-22 10:18:10,658 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Response Header: Name=Transfer-Encoding Value=chunked (BaseRequest:209)
2016-07-22 10:18:10,658 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Response Header: Name=Server Value=Jetty(6.1.26.hwx) (BaseRequest:209)
2016-07-22 10:18:10,659 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ The web service response is:
{
  "status": "SUCCEEDED",
  "message": "default/KILL\n",
  "requestId": "default/1217343444@qtp-1597328335-86 - e730514a-f313-45fe-af87-69ca2bf5e92c\n",
  "instances": [
    {
      "instance": "2010-01-02T00:25Z",
      "status": "KILLED",
      "cluster": "A73856b63-223dd32f",
      "runId": 0,
      "details": "hdfs://nat-r6-anms-falcon-2-7.openstacklocal:8020/tmp/falcon-regression/ProcessInstanceKillsTest/input/2010/01/02/00/25#hdfs://nat-r6-anms-falcon-2-7.openstacklocal:8020/tmp/falcon-regression/ProcessInstanceKillsTest/input/2010/01/02/00/20#hdfs://nat-r6-anms-falcon-2-7.openstacklocal:8020/tmp/falcon-regression/ProcessInstanceKillsTest/input/2010/01/02/00/15#hdfs://nat-r6-anms-falcon-2-7.openstacklocal:8020/tmp/falcon-regression/ProcessInstanceKillsTest/input/2010/01/02/00/10#hdfs://nat-r6-anms-falcon-2-7.openstacklocal:8020/tmp/falcon-regression/ProcessInstanceKillsTest/input/2010/01/02/00/05"
    },
    {
      "instance": "2010-01-02T00:20Z",
      "status": "KILLED",
      "logFile": "http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/?job\u003d0000882-160722063010010-oozie-oozi-C@5",
      "cluster": "A73856b63-223dd32f",
      "startTime": "2016-07-22T10:18:07Z",
      "runId": 0,
      "details": ""
    },
    {
      "instance": "2010-01-02T00:15Z",
      "status": "KILLED",
      "logFile": "http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/?job\u003d0000882-160722063010010-oozie-oozi-C@4",
      "cluster": "A73856b63-223dd32f",
      "startTime": "2016-07-22T10:18:07Z",
      "runId": 0,
      "details": ""
    }
  ]
} (InstanceUtil:94)
2016-07-22 10:18:10,659 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ statusCode: 200 (InstanceUtil:136)
2016-07-22 10:18:10,659 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ message: default/KILL (InstanceUtil:137)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ APIResult.Status: SUCCEEDED (InstanceUtil:138)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ instances: [{instance:2010-01-02T00:25Z, status:KILLED, cluster:A73856b63-223dd32f} , {instance:2010-01-02T00:20Z, status:KILLED, log:http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/?job=0000882-160722063010010-oozie-oozi-C@5, cluster:A73856b63-223dd32f} , {instance:2010-01-02T00:15Z, status:KILLED, log:http://nat-r6-anms-falcon-2-5.openstacklocal:11000/oozie/?job=0000882-160722063010010-oozie-oozi-C@4, cluster:A73856b63-223dd32f} ] (InstanceUtil:271)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ status: KILLED, instance: 2010-01-02T00:25Z (InstanceUtil:277)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ status: KILLED, instance: 2010-01-02T00:20Z (InstanceUtil:277)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ status: KILLED, instance: 2010-01-02T00:15Z (InstanceUtil:277)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:testProcessInstanceKillKillNotRunning] ~ Testing going to end for: org.apache.falcon.regression.ProcessInstanceKillsTest.testProcessInstanceKillKillNotRunning([]) ----- Status: FAILED (TestngListener:79)
2016-07-22 10:18:10,661 INFO - [pool-34-thread-1:] ~ ---------------------------------------------------------------------------------------------------- (TestngListener:83)
2016-07-22 10:18:10,662 INFO - [pool-34-thread-1:] ~ java.lang.AssertionError: Waiting Instances expected [1] but found [0]
	at org.testng.Assert.fail(Assert.java:94)
	at org.testng.Assert.failNotEquals(Assert.java:494)
	at org.testng.Assert.assertEquals(Assert.java:123)
	at org.testng.Assert.assertEquals(Assert.java:370)
	at org.apache.falcon.regression.core.util.InstanceUtil.validateResponse(InstanceUtil.java:284)
	at org.apache.falcon.regression.ProcessInstanceKillsTest.testProcessInstanceKillKillNotRunning(ProcessInstanceKillsTest.java:167)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
 (TestngListener:116)
{noformat}

According to the Oozie documentation (https://oozie.apache.org/docs/4.2.0/DG_CommandLineTool.html#Killing_a_Workflow_Coordinator_or_Bundle_Job), when given a date range to kill instances of a coordinator, Oozie kills all instances in that range that are in a non-terminal state. The test's expectation that the waiting instance survives the kill therefore does not match the documented behavior.
{noformat}
Killing a Coordinator Action or Multiple Actions

Example:

$oozie job -kill <coord_Job_id> [-action 1, 3-4, 7-40] [-date 2009-01-01T01:00Z::2009-05-31T23:59Z, 2009-11-10T01:00Z, 2009-12-31T22:00Z]

The kill option here for a range of coordinator actions kills a non-terminal (RUNNING, WAITING, READY, SUSPENDED) coordinator action when coordinator job is not in FAILED or KILLED state.
{noformat}
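
For illustration, the documented Oozie behavior can be modelled with a small standalone sketch. This is not Falcon or Oozie code: the class KillRangeSketch, its Status enum, and the helper methods are hypothetical names introduced here, and the status list is limited to the states mentioned in this report plus two terminal ones. It only shows why a range kill over two RUNNING actions and one WAITING action is expected to leave no WAITING action behind, which is what the assertion "Waiting Instances expected [1] but found [0]" trips over.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class KillRangeSketch {
    /** Illustrative subset of coordinator action states; not Oozie's own enum. */
    enum Status { WAITING, READY, RUNNING, SUSPENDED, SUCCEEDED, FAILED, KILLED }

    /** Non-terminal states as listed in the quoted Oozie documentation. */
    static final Set<Status> NON_TERMINAL =
            EnumSet.of(Status.RUNNING, Status.WAITING, Status.READY, Status.SUSPENDED);

    /** Status an action is expected to have after a range kill that covers it. */
    static Status afterKill(Status before) {
        return NON_TERMINAL.contains(before) ? Status.KILLED : before;
    }

    /** Counts how many actions in the list have the given status. */
    static long count(List<Status> actions, Status wanted) {
        return actions.stream().filter(s -> s == wanted).count();
    }

    public static void main(String[] args) {
        // Last three actions from the log above: @4 and @5 RUNNING, @6 WAITING.
        List<Status> before = Arrays.asList(Status.RUNNING, Status.RUNNING, Status.WAITING);
        List<Status> after = before.stream()
                .map(KillRangeSketch::afterKill)
                .collect(Collectors.toList());
        // Prints: killed=3 waiting=0
        System.out.println("killed=" + count(after, Status.KILLED)
                + " waiting=" + count(after, Status.WAITING));
    }
}
```

Under this model the test would need to expect zero waiting instances after the kill, or restrict the kill range to the running instances only. Which of the two is the intended fix is not stated in this report.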