[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-10-11 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: (was: OOZIE-2668-5.patch)

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch, 
> OOZIE-2668-4.patch, OOZIE-2668-5.patch, OOZIE-2688-3.patch
>
>
> In cases where the workflow is already in a terminal status (except failed) 
> but the coord action is not yet updated and is still running, the following 
> will happen if a kill command is issued on the coord job: 
> The kill on the coord job will leave the kill on the coord action pending 
> until the children are also killed. However, because the wf is in a terminal 
> state (except failed), the wf will not be killed and preverifycondition will 
> fail. The wf doesn't update its parent, and hence the coord action kill will 
> still be pending. 
> Two problems: 
> Status transit service will not resolve the state of this coord job, as some 
> of the actions are still pending. 
> Recovery service will try to recover this killed coord action and keep on 
> reissuing the kill command.
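
The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every class, function, and status set below is hypothetical and written for illustration. None of it is Oozie code or an Oozie API, and it does not represent the attached patches. It only shows why the pending flag never clears when the child wf is already in a terminal state other than failed.

```python
from dataclasses import dataclass

# Hypothetical toy model; none of these names are Oozie classes or APIs.
# Assumed set of terminal workflow statuses other than FAILED.
TERMINAL_NOT_FAILED = {"SUCCEEDED", "KILLED"}


@dataclass
class Workflow:
    status: str


@dataclass
class CoordAction:
    status: str
    workflow: Workflow
    pending: bool = False


def kill_workflow(action):
    """Try to kill the child wf; return True if the kill ran and updated the parent."""
    if action.workflow.status in TERMINAL_NOT_FAILED:
        # The precondition check fails: nothing is killed and the
        # parent coord action is never notified.
        return False
    action.workflow.status = "KILLED"
    action.pending = False  # the child reports back, clearing the pending kill
    return True


def kill_coord_job(actions):
    """Mark each running coord action killed and pending, then kill its child."""
    for action in actions:
        if action.status == "RUNNING":
            action.status = "KILLED"
            action.pending = True
            kill_workflow(action)


def status_transit_can_resolve(actions):
    """The job state can only be resolved once no action is pending."""
    return not any(a.pending for a in actions)


def recovery_pass(actions):
    """Reissue the kill for every pending action; return how many were reissued."""
    reissued = 0
    for action in actions:
        if action.pending:
            kill_workflow(action)
            reissued += 1
    return reissued


# One action whose wf already succeeded (out of sync), one whose wf is running.
stale = CoordAction(status="RUNNING", workflow=Workflow(status="SUCCEEDED"))
healthy = CoordAction(status="RUNNING", workflow=Workflow(status="RUNNING"))
actions = [stale, healthy]

kill_coord_job(actions)
reissues = [recovery_pass(actions) for _ in range(3)]
```

In this model the out-of-sync action stays pending after the kill, so `status_transit_can_resolve` keeps returning False and every `recovery_pass` reissues the same kill without effect.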



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-10-11 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-5.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch, 
> OOZIE-2668-4.patch, OOZIE-2668-5.patch, OOZIE-2688-3.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-10-05 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-5.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch, 
> OOZIE-2668-4.patch, OOZIE-2668-5.patch, OOZIE-2688-3.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-30 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-4.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch, 
> OOZIE-2668-4.patch, OOZIE-2688-3.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-27 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2688-3.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch, 
> OOZIE-2688-3.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-16 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-2.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-16 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: (was: OOZIE-2668-2.patch)

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-14 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-2.patch

Added test case

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch, OOZIE-2668-2.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-09 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Environment: (was: In cases where the workflow is already in a terminal 
status (except failed) but the coord action is not yet updated and is still 
running, the following will happen if a kill command is issued on the coord job:

The kill on the coord job will leave the kill on the coord action pending until 
the children are also killed. However, because the wf is in a terminal state 
(except failed), the wf will not be killed and preverifycondition will fail. The 
wf doesn't update its parent, and hence the coord action kill will still be 
pending.

Two problems:
Status transit service will not resolve the state of this coord job, as some of 
the actions are still pending.
Recovery service will try to recover this killed coord action and keep on 
reissuing the kill command.)

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
>






[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-09 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Attachment: OOZIE-2668-1.patch

> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2668-1.patch
>
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-09 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Description: 
In cases where the workflow is already in a terminal status (except failed) but 
the coord action is not yet updated and is still running, the following will 
happen if a kill command is issued on the coord job: The kill on the coord job 
will leave the kill on the coord action pending until the children are also 
killed. However, because the wf is in a terminal state (except failed), the wf 
will not be killed and preverifycondition will fail. The wf doesn't update its 
parent, and hence the coord action kill will still be pending. Two problems: 
Status transit service will not resolve the state of this coord job, as some of 
the actions are still pending. Recovery service will try to recover this killed 
coord action and keep on reissuing the kill command.


> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
>


[jira] [Updated] (OOZIE-2668) Status update and recovery problems when coord action and its children not in sync

2016-09-09 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated OOZIE-2668:
---
Description: 
In cases where the workflow is already in a terminal status (except failed) but 
the coord action is not yet updated and is still running, the following will 
happen if a kill command is issued on the coord job: 

The kill on the coord job will leave the kill on the coord action pending until 
the children are also killed. However, because the wf is in a terminal state 
(except failed), the wf will not be killed and preverifycondition will fail. The 
wf doesn't update its parent, and hence the coord action kill will still be 
pending. Two problems: Status transit service will not resolve the state of this 
coord job, as some of the actions are still pending. Recovery service will try 
to recover this killed coord action and keep on reissuing the kill command.


  was:
In cases where the workflow is already in a terminal status (except failed) but 
the coord action is not yet updated and is still running, the following will 
happen if a kill command is issued on the coord job: The kill on the coord job 
will leave the kill on the coord action pending until the children are also 
killed. However, because the wf is in a terminal state (except failed), the wf 
will not be killed and preverifycondition will fail. The wf doesn't update its 
parent, and hence the coord action kill will still be pending. Two problems: 
Status transit service will not resolve the state of this coord job, as some of 
the actions are still pending. Recovery service will try to recover this killed 
coord action and keep on reissuing the kill command.



> Status update and recovery problems when coord action and its children not in 
> sync
> --
>
> Key: OOZIE-2668
> URL: https://issues.apache.org/jira/browse/OOZIE-2668
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
>