[
https://issues.apache.org/jira/browse/OOZIE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871685#comment-13871685
]
Shwetha G S commented on OOZIE-1622:
------------------------------------
{quote}
This looks good. But recovery service should issue coord materialize command if
coord is created and in prep.
{quote}
The problem is the case where materialization is stuck for some reason (a DB
connection issue, or anything else for that matter). Yes, the recovery service
will trigger another materialization. But if materialization doesn't proceed
and complete for whatever reason, the recovery service shouldn't go ahead and
create another coord. If it does create another coord, the bundle will end up
with 2 coords with the same name.
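To make the argument above concrete, here is a rough sketch of the decision the recovery service would have to make. It is not the code in the attached patch: RecoverySketch, Decision, recordCoord() and decide() are hypothetical names, and the in-memory map only stands in for a lookup of coords by bundle id and coord name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are actual Oozie classes.
final class RecoverySketch {
    enum CoordStatus { PREP, RUNNING }
    enum Decision { SUBMIT_COORD, MATERIALIZE_COORD, NOTHING }

    // In-memory stand-in for a lookup of coords by bundle id and coord name.
    private final Map<String, CoordStatus> coords = new HashMap<>();

    void recordCoord(String bundleId, String coordName, CoordStatus status) {
        coords.put(key(bundleId, coordName), status);
    }

    // Decide what recovery should do for one bundle action.
    Decision decide(String bundleId, String coordName) {
        CoordStatus existing = coords.get(key(bundleId, coordName));
        if (existing == null) {
            // No coord was ever created for this bundle action: submit one.
            return Decision.SUBMIT_COORD;
        }
        if (existing == CoordStatus.PREP) {
            // Coord exists but materialization has not completed: retry
            // materialization, never submit a second coord.
            return Decision.MATERIALIZE_COORD;
        }
        return Decision.NOTHING;
    }

    private static String key(String bundleId, String coordName) {
        return bundleId + "/" + coordName;
    }

    public static void main(String[] args) {
        RecoverySketch recovery = new RecoverySketch();
        System.out.println(recovery.decide("bundle-1", "coord-a"));
        recovery.recordCoord("bundle-1", "coord-a", CoordStatus.PREP);
        System.out.println(recovery.decide("bundle-1", "coord-a"));
    }
}
```

The point of the sketch is only that the submit branch is reachable when no coord with that name exists for the bundle, so a stuck materialization can be retried without producing a duplicate.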
{quote}
Import only needed class, not everything.
{quote}
That's IntelliJ IDEA's import optimisation.
{quote}
What are we testing here? I don't see any assert.
{quote}
I thought waitFor() would throw if it timed out, but it doesn't. I will add an
assert.
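As an illustration of why the explicit assert is needed, here is a minimal sketch of a polling helper that returns instead of throwing on timeout. WaitForSketch and this waitFor() signature are hypothetical and are not the test utility used in the patch; the sketch only shows the general pattern.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a polling helper; not the actual Oozie test utility.
final class WaitForSketch {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interrupt.
    // It never throws on timeout, so callers must check the result.
    static boolean waitFor(long timeoutMillis, long pollMillis, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report the wait as failed.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        boolean reached = waitFor(200, 10, () -> false);
        // Without an explicit check like this, a timed-out wait passes silently.
        if (reached) {
            throw new AssertionError("condition unexpectedly held");
        }
        System.out.println("timed out without throwing: " + !reached);
    }
}
```

With a helper of this shape, a test that only calls waitFor() verifies nothing; the assertion has to be made on the returned value or on the state being waited for.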
> Multiple CoordSubmit for same bundle
> ------------------------------------
>
> Key: OOZIE-1622
> URL: https://issues.apache.org/jira/browse/OOZIE-1622
> Project: Oozie
> Issue Type: Bug
> Reporter: Shwetha G S
> Assignee: Shwetha G S
> Priority: Critical
> Attachments: OOZIE-1622.patch
>
>
> We saw a weird instance where multiple coords were created for the same
> bundle id when the bundle was supposed to have just 1 coordinator. Here are
> the Oozie logs:
> {noformat}
> 2013-11-19 09:09:46,473 INFO BundleStartXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> Bundle 0484436-131016085136608-oozie-oozi-B is not in PREP status. It is in :
> RUNNING
> 2013-11-19 09:09:46,473 WARN BundleStartXCommand:542 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> E1100: Command precondition does not hold before execution, [Bundle
> 0484436-131016085136608-oozie-oozi-B is not in PREP status. It is in :
> RUNNING], Error Code: E1100
> 2013-11-19 09:09:46,473 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-] STARTED
> Coordinator Submit
> 2013-11-19 09:09:46,483 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> configDefault Doesn't exist
> 2013-11-19 09:09:46,515 INFO CoordSubmitXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484437-131016085136608-oozie-oozi-C] ACTION[-]
> ENDED Coordinator Submit jobId=0484437-131016085136608-oozie-oozi-C
> 2013-11-19 09:09:46,529 INFO BundleStatusUpdateXCommand:539 - USER[fetl]
> GROUP[-] TOKEN[] APP[<app name>] JOB[0484437-131016085136608-oozie-oozi-C]
> ACTION[-] Updated bundle action [0484436-131016085136608-oozie-oozi-B_<app
> name>] from prev status [PREP] to current coord status [PREP], and new bundle
> action pending [0]
> 2013-11-19 09:09:46,535 INFO CoordMaterializeTransitionXCommand:539 -
> USER[fetl] GROUP[-] TOKEN[] APP[<app name>]
> JOB[0484437-131016085136608-oozie-oozi-C] ACTION[-] materialize actions for
> tz=Coordinated Universal Time,
> 2013-11-19 09:09:54,590 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'RUNNING' from 'RUNNING'
> 2013-11-19 09:09:54,590 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to FALSE
> 2013-11-19 09:10:16,326 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:12:57,246 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'SUSPENDED' from 'SUSPENDED'
> 2013-11-19 09:12:57,246 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to TRUE
> 2013-11-19 09:13:16,410 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:16:16,446 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:17:00,913 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Set bundle job [0484436-131016085136608-oozie-oozi-B]
> status to 'RUNNING' from 'RUNNING'
> 2013-11-19 09:17:00,914 INFO StatusTransitService$StatusTransitRunnable:539
> - USER[-] GROUP[-] Bundle job [0484436-131016085136608-oozie-oozi-B] Pending
> set to TRUE
> 2013-11-19 09:19:16,490 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:22:16,907 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:25:17,086 INFO
> CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Job
> :0484437-131016085136608-oozie-oozi-C numWaitingActions : 0 MatThrottle : 60
> 2013-11-19 09:26:49,373 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-] STARTED
> Coordinator Submit
> 2013-11-19 09:26:49,383 INFO CoordSubmitXCommand:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0484436-131016085136608-oozie-oozi-B] ACTION[-]
> configDefault Doesn't exist
> 2013-11-19 09:26:49,438 INFO CoordSubmitXCommand:539 - USER[fetl] GROUP[-]
> TOKEN[] APP[<app name>] JOB[0484598-131016085136608-oozie-oozi-C] ACTION[-]
> ENDED Coordinator Submit jobId=0484598-131016085136608-oozie-oozi-C
> 2013-11-19 09:26:49,445 INFO BundleStatusUpdateXCommand:539 - USER[fetl]
> GROUP[-] TOKEN[] APP[<app name>] JOB[0484598-131016085136608-oozie-oozi-C]
> ACTION[-] Updated bundle action [0484436-131016085136608-oozie-oozi-B_<app
> name>] from prev status [PREP] to current coord status [PREP], and new bundle
> action pending [1]
> {noformat}