[ 
https://issues.apache.org/jira/browse/OOZIE-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646838#comment-16646838
 ] 

Satish Subhashrao Saley commented on OOZIE-3365:
------------------------------------------------

The log looked like the following. 
I see that the workflow job was rerun:

{code}
2018-04-24 21:09:20,684 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost] 
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[] 
Skipnode size are to rerun from FAIL nodes :4
2018-04-24 21:09:20,684 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost] 
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[] 
SkipNode List :should_run,fail,transform,:start:,
2018-04-24 21:09:20,688 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost] 
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[] 
Acquired lock for [123-123-oozie-local-W] in [rerun]
{code}

Then the action failed:

{code}
2018-04-24 22:49:12,000 DEBUG EventHandlerService:526 [pool-1-thread-5] - 
SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[appname] 
JOB[123-123-oozie-local-W] ACTION[123-123-oozie-local-W@appname] Processing 
event : ID: 123-123-oozie-local-W@appname, MsgType:JOB, AppType: 
WORKFLOW_ACTION, Appname: appname, Status: FAILURE
2018-04-24 22:49:12,001 DEBUG SLACalculatorMemory:526 [pool-1-thread-5] - 
SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Received 
addJobStatus request for job [123-123-oozie-local-W@appname] jobStatus = 
[ERROR], jobEventStatus = [FAILURE], startTime = [4/24/18 9:09 PM], endTime = 
[4/24/18 10:49 PM] 
{code}

The workflow should be in a failed state after that. But I see 
CoordActionCheckXCommand checking the coord action status and finding the 
workflow status as RUNNING:

{code}
2018-04-24 21:50:41,024 WARN CoordActionCheckXCommand:523 [pool-12-thread-241] 
- SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-C] 
ACTION[123-123-oozie-local-C@896] Unexpected workflow 123-123-oozie-local-W 
STATUS RUNNING
2018-04-24 21:50:41,026 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-241] 
- SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-C] 
ACTION[123-123-oozie-local-C@896] Released lock for [123-123-oozie-local-C] in 
[coord_action_check]
2018-04-24 21:56:41,202 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-51] 
- SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline] 
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Acquired lock for 
[123-123-oozie-local-C] in [coord_action_check]
2018-04-24 21:56:41,203 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-51] 
- SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline] 
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Execute command 
[coord_action_check] key [123-123-oozie-local-C]
2018-04-24 21:56:41,203 WARN CoordActionCheckXCommand:523 [pool-12-thread-51] - 
SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline] 
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Unexpected 
workflow 123-123-oozie-local-W STATUS RUNNING
{code}
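
To pull these checks out of a larger oozie.log, a small script along the 
following lines might help. It is only a sketch: the function names are made up 
for illustration, and it assumes the "Unexpected workflow ... STATUS ..." 
message format shown in the excerpt above and that any line not starting with 
a timestamp is a wrapped continuation of the previous record.

```python
import re

# A log record starts with a timestamp such as "2018-04-24 21:50:41,024".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
# Message format assumed from the CoordActionCheckXCommand excerpt above.
UNEXPECTED = re.compile(r"Unexpected\s+workflow\s+(\S+)\s+STATUS\s+(\w+)")


def join_records(text):
    """Merge hard-wrapped lines back into one string per log record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)          # start of a new record
        else:
            records[-1] += " " + line     # wrapped continuation line
    return records


def unexpected_statuses(text):
    """Return (timestamp, workflow_id, status) for each matching record."""
    hits = []
    for record in join_records(text):
        found = UNEXPECTED.search(record)
        if found:
            stamp = TIMESTAMP.match(record)
            hits.append((stamp.group(0) if stamp else None,
                         found.group(1), found.group(2)))
    return hits


# First record of the excerpt above, wrapped as it appears in this comment.
SAMPLE = """\
2018-04-24 21:50:41,024 WARN CoordActionCheckXCommand:523 [pool-12-thread-241]
- SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-C]
ACTION[123-123-oozie-local-C@896] Unexpected workflow 123-123-oozie-local-W
STATUS RUNNING
"""

if __name__ == "__main__":
    for stamp, workflow_id, status in unexpected_statuses(SAMPLE):
        print(stamp, workflow_id, status)
```

Sorting the resulting tuples together with the EventHandlerService FAILURE 
event would make it easier to see which checks ran after the action failed.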

 

> Workflow and Coord Action status remains RUNNING after rerun
> ------------------------------------------------------------
>
>                 Key: OOZIE-3365
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3365
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Major
>
> User reran a workflow job which had a subworkflow action. The subworkflow 
> action failed, but the status of the workflow and the corresponding coord 
> action was not updated from RUNNING to FAILED.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
