[ https://issues.apache.org/jira/browse/OOZIE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871081#comment-13871081 ]
purshotam shah commented on OOZIE-1622:
---------------------------------------
Comments on the patch:
- if (bundleJob != null) {
+ if (bundleJob != null && baction.getCoordId() == null) {
This looks good, but the recovery service should also issue a coord materialize
command if the coord has already been created and is still in PREP.
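One way to read that suggestion, as a minimal sketch: recovery would choose between submitting the coord and materializing it, depending on whether the bundle action already carries a coord id. The class, enums, and method below are hypothetical stand-ins for illustration, not Oozie's actual RecoveryService code.

```java
// Hypothetical sketch; none of these types are Oozie classes.
final class RecoveryDecision {
    enum CoordStatus { PREP, RUNNING }
    enum Action { SUBMIT_COORD, MATERIALIZE_COORD, NONE }

    // coordId is null while no coordinator has been created for the bundle action.
    static Action decide(String coordId, CoordStatus coordStatus) {
        if (coordId == null) {
            // Nothing was created yet, so submitting cannot produce a duplicate.
            return Action.SUBMIT_COORD;
        }
        if (coordStatus == CoordStatus.PREP) {
            // The coord exists but never started: materialize instead of resubmitting.
            return Action.MATERIALIZE_COORD;
        }
        return Action.NONE;
    }
}
```

With this shape, a bundle action that already has a coord id never triggers a second submit, which is the duplicate reported in the issue.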
-import org.apache.oozie.client.CoordinatorAction;
-import org.apache.oozie.client.CoordinatorJob;
-import org.apache.oozie.client.OozieClient;
-import org.apache.oozie.client.WorkflowAction;
-import org.apache.oozie.client.WorkflowJob;
+import org.apache.oozie.client.*;
Import only the classes that are needed, not everything.
+ public void testBundleRecoveryCoordCreate() throws Exception {
+ CoordinatorStore store = Services.get().get(StoreService.class).getStore(CoordinatorStore.class);
+ final BundleActionBean bundleAction;
+ final BundleJobBean bundle;
+ store.beginTrx();
+ try {
+ bundle = addRecordToBundleJobTable(Job.Status.RUNNING, false);
+ bundleAction = addRecordToBundleActionTable(bundle.getId(), "coord1", 1, Job.Status.PREP);
+ store.commitTrx();
+ }
+ finally {
+ store.closeTrx();
+ }
+ final JPAService jpaService = Services.get().get(JPAService.class);
+
+ sleep(3000);
+ Runnable recoveryRunnable = new RecoveryRunnable(0, 1,1);
+ recoveryRunnable.run();
+
+ waitFor(10000, new Predicate() {
+ public boolean evaluate() throws Exception {
+ BundleActionBean mybundleAction =
+ jpaService.execute(new BundleActionGetJPAExecutor(bundle.getId(), "coord1"));
+ try {
+ if (mybundleAction.getCoordId() != null) {
+ CoordinatorJobBean coord = jpaService.execute(new CoordJobGetJPAExecutor(mybundleAction.getCoordId()));
+ return true;
+ }
+ } catch (Exception e) {
+ }
+ return false;
+ }
+ });
+ }
What are we testing here? I don't see any assert.
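For instance, the test could capture the outcome of the wait and fail explicitly when the expected state never shows up. A minimal, self-contained sketch of that shape follows; `pollUntil`, `checkCoordCreated`, and the in-memory map are hypothetical stand-ins for the test harness's wait helper and the JPA lookups, not code from the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; the map stands in for the bundle action table.
final class RecoveryAssertSketch {
    static final Map<String, String> coordIdByAction = new ConcurrentHashMap<>();

    // Polls the condition until it holds or the timeout elapses; returns the final result.
    static boolean pollUntil(long timeoutMs, BooleanSupplier condition) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean();
    }

    static void checkCoordCreated(String actionName, long timeoutMs) {
        boolean created = pollUntil(timeoutMs, () -> coordIdByAction.get(actionName) != null);
        // The explicit check is what makes the test fail when recovery does nothing.
        if (!created) {
            throw new AssertionError("no coord was created for " + actionName);
        }
    }
}
```

In the real test the same idea would be an assertion on the bundle action's coord id (and on the fetched coord job) after the wait returns, so a recovery run that creates nothing cannot pass silently.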
> Multiple CoordSubmit for same bundle
> ------------------------------------
>
> Key: OOZIE-1622
> URL: https://issues.apache.org/jira/browse/OOZIE-1622
> Project: Oozie
> Issue Type: Bug
> Reporter: Shwetha G S
> Assignee: Shwetha G S
> Priority: Critical
> Attachments: OOZIE-1622.patch
>
>
> We saw a weird instance where multiple coords were created for the same bundle
> id when the bundle was supposed to have just one coordinator. Here are the
> Oozie logs:
> {noformat}
> 2013-11-19 09:09:46,473 INFO BundleStartXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> Bundle 0484436-131016085136608-oozie-oozi-B is not in PREP status. It is in :
> RUNNING
> 2013-11-19 09:09:46,473 WARN BundleStartXCommand:542 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> E1100: Command precondition does not hold before execution, [Bundle
> 0484436-131016085136608-oozie-oozi-B is not in PREP status. It is in :
> RUNNING], Error Code: E1100
> 2013-11-19 09:09:46,473 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-] STARTED
> Coordinator Submit
> 2013-11-19 09:09:46,483 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> configDefault Doesn't exist
> 2013-11-19 09:09:46,515 INFO CoordSubmitXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484437-131016085136608-oozie-oozi-C] ACTION[-]
> ENDED Coordinator Submit jobId=0484437-131016085136608-oozie-oozi-C
> 2013-11-19 09:09:46,529 INFO BundleStatusUpdateXCommand:539 - USER[fetl]
> GROUP[-] TOKEN[] APP[<app name>] JOB[0484437-131016085136608-oozie-oozi-C]
> ACTION[-] Updated bundle action [0484436-131016085136608-oozie-oozi-B_<app
> name>] from prev status [PREP] to current coord status [PREP], and new bundle
> action pending [0]
> 2013-11-19 09:09:46,535 INFO CoordMaterializeTransitionXCommand:539 -
> USER[fetl] GROUP[-] TOKEN[] APP[<app name>]
> JOB[0484437-131016085136608-oozie-oozi-C] ACTION[-] materialize actions for
> tz=Coordinated Universal Time,
> 2013-11-19 09:09:54,590 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'RUNNING' from 'RUNNING'
> 2013-11-19 09:09:54,590 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to FALSE
> 2013-11-19 09:10:16,326 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:12:57,246 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'SUSPENDED' from 'SUSPENDED'
> 2013-11-19 09:12:57,246 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to TRUE
> 2013-11-19 09:13:16,410 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:16:16,446 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:17:00,913 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'RUNNING' from 'RUNNING'
> 2013-11-19 09:17:00,914 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to TRUE
> 2013-11-19 09:19:16,490 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:22:16,907 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:25:17,086 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:26:49,373 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-] STARTED
> Coordinator Submit
> 2013-11-19 09:26:49,383 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> configDefault Doesn't exist
> 2013-11-19 09:26:49,438 INFO CoordSubmitXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484598-131016085136608-oozie-oozi-C] ACTION[-]
> ENDED Coordinator Submit jobId=0484598-131016085136608-oozie-oozi-C
> 2013-11-19 09:26:49,445 INFO BundleStatusUpdateXCommand:539 - USER[fetl]
> GROUP[-] TOKEN[] APP[<app name>] JOB[0484598-131016085136608-oozie-oozi-C]
> ACTION[-] Updated bundle action [0484436-131016085136608-oozie-oozi-B_<app
> name>] from prev status [PREP] to current coord status [PREP], and new bundle
> action pending [1]
> {noformat}