[
https://issues.apache.org/jira/browse/OOZIE-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646838#comment-16646838
]
Satish Subhashrao Saley commented on OOZIE-3365:
------------------------------------------------
The log looked like the following.
First, I see the rerun of the workflow job:
{code}
2018-04-24 21:09:20,684 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost]
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[]
Skipnode size are to rerun from FAIL nodes :4
2018-04-24 21:09:20,684 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost]
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[]
SkipNode List :should_run,fail,transform,:start:,
2018-04-24 21:09:20,688 DEBUG ReRunXCommand:526 [qtp -85] - SERVER[localhost]
USER[saley] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-W] ACTION[]
Acquired lock for [123-123-oozie-local-W] in [rerun]
{code}
Then the action failed:
{code}
2018-04-24 22:49:12,000 DEBUG EventHandlerService:526 [pool-1-thread-5] -
SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[appname]
JOB[123-123-oozie-local-W] ACTION[123-123-oozie-local-W@appname] Processing
event : ID: 123-123-oozie-local-W@appname, MsgType:JOB, AppType:
WORKFLOW_ACTION, Appname: appname, Status: FAILURE
2018-04-24 22:49:12,001 DEBUG SLACalculatorMemory:526 [pool-1-thread-5] -
SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Received
addJobStatus request for job [123-123-oozie-local-W@appname] jobStatus =
[ERROR], jobEventStatus = [FAILURE], startTime = [4/24/18 9:09 PM], endTime =
[4/24/18 10:49 PM]
{code}
The workflow should be in the FAILED state after that, but I see
CoordActionCheckXCommand checking the coord action status and finding the
workflow status as RUNNING:
{code}
2018-04-24 21:50:41,024 WARN CoordActionCheckXCommand:523 [pool-12-thread-241]
- SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-C]
ACTION[123-123-oozie-local-C@896] Unexpected workflow 123-123-oozie-local-W
STATUS RUNNING
2018-04-24 21:50:41,026 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-241]
- SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[123-123-oozie-local-C]
ACTION[123-123-oozie-local-C@896] Released lock for [123-123-oozie-local-C] in
[coord_action_check]
2018-04-24 21:56:41,202 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-51]
- SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline]
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Acquired lock for
[123-123-oozie-local-C] in [coord_action_check]
2018-04-24 21:56:41,203 DEBUG CoordActionCheckXCommand:526 [pool-12-thread-51]
- SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline]
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Execute command
[coord_action_check] key [123-123-oozie-local-C]
2018-04-24 21:56:41,203 WARN CoordActionCheckXCommand:523 [pool-12-thread-51] -
SERVER[localhost] USER[saley] GROUP[users] TOKEN[-] APP[app-pipeline]
JOB[123-123-oozie-local-C] ACTION[123-123-oozie-local-C@896] Unexpected
workflow 123-123-oozie-local-W STATUS RUNNING
{code}
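For anyone triaging similar logs, here is a minimal sketch of how the "Unexpected workflow ... STATUS ..." warnings could be pulled out of a wrapped log excerpt like the one above. It is not part of Oozie; the function names `join_records` and `unexpected_status_warnings` are hypothetical, and the record layout is assumed from the excerpts in this comment only.

```python
import re
from datetime import datetime

# Assumption: every log record starts with a timestamp such as
# "2018-04-24 21:50:41,024"; any other line is a wrapped continuation.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Matches the warning text seen in the CoordActionCheckXCommand excerpts.
UNEXPECTED = re.compile(
    r"ACTION\[(?P<action>[^\]]*)\]\s+Unexpected\s+workflow\s+(?P<wf>\S+)"
    r"\s+STATUS\s+(?P<status>\w+)"
)


def join_records(text):
    """Re-join records that were wrapped across several lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def unexpected_status_warnings(text):
    """Return (timestamp, coord action id, workflow id, status) tuples."""
    found = []
    for record in join_records(text):
        match = UNEXPECTED.search(record)
        if match and RECORD_START.match(record):
            # The first 23 characters hold the timestamp, e.g. "2018-04-24 21:50:41,024".
            stamp = datetime.strptime(record[:23], "%Y-%m-%d %H:%M:%S,%f")
            found.append(
                (stamp, match.group("action"), match.group("wf"), match.group("status"))
            )
    return found
```

Running `unexpected_status_warnings` over a saved copy of the log yields one tuple per warning, which makes it easier to compare the warning times against the time of the FAILURE event.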
> Workflow and Coord Action status remains RUNNING after rerun
> ------------------------------------------------------------
>
> Key: OOZIE-3365
> URL: https://issues.apache.org/jira/browse/OOZIE-3365
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Major
>
> A user reran a workflow job that had a subworkflow action. The subworkflow
> action failed, but the status of the workflow and the corresponding coord
> action was not updated from RUNNING to FAILED.