[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-17 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328511#comment-16328511
 ] 

Attila Sasvari commented on OOZIE-1401:
---

Committed to master.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Priority: Major
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch, OOZIE-1401.amend.003.patch, 
> amend-OOZIE-1401-001.patch, amend-OOZIE-1401-002.patch
>
>
> Currently, {{PurgeXCommand}} does not purge workflow jobs whose {{end_time}} 
> is {{null}}. The command needs to handle those jobs as well. Such jobs 
> appear when a workflow stays stuck for a long time after a Hadoop restart or 
> a DB failure. One option is to check {{last_modified_time}} instead when 
> {{end_time}} is not available.
> The current query:
> {code:sql}
> select w from WorkflowJobBean w where w.endTimestamp < :endTime
> {code}
> There is also an issue when all of the following hold:
> * a parent workflow has its {{end_time}} set
> * the parent is otherwise eligible for {{PurgeXCommand}}: its {{end_time}} is 
> older than the configured number of days, and its {{status}} is {{KILLED}}, 
> {{FAILED}}, or {{SUCCEEDED}}
> * the parent has a child workflow whose {{parent_id}} is set to the {{id}} of 
> the parent workflow
> * the child workflow has {{end_time = NULL}}
> In this case, 
> [*{{PurgeXCommand#fetchTerminatedWorkflow()}}*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/PurgeXCommand.java#L249]
>  throws a {{NullPointerException}} like this:
> {noformat}
> 2017-09-29 07:59:46,365 DEBUG org.apache.oozie.command.PurgeXCommand: SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Purging workflows of long running coordinators is turned on
> 2017-09-29 07:59:46,371 DEBUG org.apache.oozie.command.PurgeXCommand: SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Execute command [purge] key [null]
> 2017-09-29 07:59:46,371 INFO org.apache.oozie.command.PurgeXCommand: SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] STARTED Purge to purge Workflow Jobs older than [1] days, Coordinator Jobs older than [1] days, and Bundlejobs older than [1] days.
> 2017-09-29 07:59:46,375 ERROR org.apache.oozie.command.PurgeXCommand: SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Exception, 
> java.lang.NullPointerException
>   at org.apache.oozie.command.PurgeXCommand.fetchTerminatedWorkflow(PurgeXCommand.java:249)
>   at org.apache.oozie.command.PurgeXCommand.processWorkflowsHelper(PurgeXCommand.java:227)
>   at org.apache.oozie.command.PurgeXCommand.processWorkflows(PurgeXCommand.java:199)
>   at org.apache.oozie.command.PurgeXCommand.execute(PurgeXCommand.java:150)
>   at org.apache.oozie.command.PurgeXCommand.execute(PurgeXCommand.java:53)
>   at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
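
The fallback proposed in the description above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the committed patch: the class PurgeEligibilitySketch, the nested Job type, and every method name here are hypothetical and are not Oozie classes. It assumes last-modified time is always populated, and that a child with a missing end time is judged by age alone.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

/** Illustrative only: hypothetical types, not Oozie's actual classes or its fix. */
final class PurgeEligibilitySketch {

    /** Hypothetical, simplified stand-in for a workflow job record. */
    static final class Job {
        final String status;        // e.g. "SUCCEEDED", "KILLED", "FAILED", "RUNNING"
        final Instant endTime;      // may be null for a job that got stuck
        final Instant lastModified; // assumed to be always set

        Job(String status, Instant endTime, Instant lastModified) {
            this.status = status;
            this.endTime = endTime;
            this.lastModified = lastModified;
        }
    }

    /** Uses endTime when present, otherwise falls back to lastModified. */
    static Instant effectiveEndTime(Job job) {
        return job.endTime != null ? job.endTime : job.lastModified;
    }

    /** True if the effective end time is strictly before (now - days). */
    static boolean isOlderThan(Job job, Instant now, int days) {
        Instant cutoff = now.minus(Duration.ofDays(days));
        return effectiveEndTime(job).isBefore(cutoff);
    }

    /** True for the three terminal statuses named in the issue. */
    static boolean isTerminated(Job job) {
        return "KILLED".equals(job.status)
                || "FAILED".equals(job.status)
                || "SUCCEEDED".equals(job.status);
    }

    /**
     * A parent is treated as purgeable only if it is terminated and old enough,
     * and every child is old enough too. A null endTime on a child is never
     * dereferenced, because the comparison goes through effectiveEndTime().
     */
    static boolean isParentPurgeable(Job parent, List<Job> children, Instant now, int days) {
        if (!isTerminated(parent) || !isOlderThan(parent, now, days)) {
            return false;
        }
        for (Job child : children) {
            if (!isOlderThan(child, now, days)) {
                return false;
            }
        }
        return true;
    }
}
```

With a one-day threshold, a terminated parent whose only child has a null end time and an old last-modified time would be reported as purgeable, while the same parent with a recently modified stuck child would not. Whether a stuck child should additionally be required to reach a terminal status is a design decision this sketch leaves open.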



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-16 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327655#comment-16327655
 ] 

Attila Sasvari commented on OOZIE-1401:
---

https://builds.apache.org/job/PreCommit-OOZIE-Build/323/console
{code:java}
+1 PATCH_APPLIES
+1 CLEAN
-1 RAW_PATCH_ANALYSIS
+1 the patch does not introduce any @author tags
+1 the patch does not introduce any tabs
+1 the patch does not introduce any trailing spaces
-1 the patch contains 2 line(s) longer than 132 characters
+1 the patch adds/modifies 1 testcase(s)
+1 RAT
+1 the patch does not seem to introduce new RAT warnings
+1 JAVADOC
+1 the patch does not seem to introduce new Javadoc warnings
+1 COMPILE
+1 HEAD compiles
+1 patch compiles
+1 the patch does not seem to introduce new javac warnings
+1 There are no new bugs found in total.
 +1 There are no new bugs found in [docs].
 +1 There are no new bugs found in [sharelib/distcp].
 +1 There are no new bugs found in [sharelib/hive].
 +1 There are no new bugs found in [sharelib/spark].
 +1 There are no new bugs found in [sharelib/hive2].
 +1 There are no new bugs found in [sharelib/hcatalog].
 +1 There are no new bugs found in [sharelib/streaming].
 +1 There are no new bugs found in [sharelib/pig].
 +1 There are no new bugs found in [sharelib/sqoop].
 +1 There are no new bugs found in [sharelib/oozie].
 +1 There are no new bugs found in [examples].
 +1 There are no new bugs found in [client].
 +1 There are no new bugs found in [core].
 +1 There are no new bugs found in [tools].
 +1 There are no new bugs found in [server].
+1 BACKWARDS_COMPATIBILITY
+1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
+1 the patch does not modify JPA files
+1 TESTS
Tests run: 2087
Tests failed at first run:
TestJavaActionExecutor#testCredentialsSkip
For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
+1 DISTRO
+1 distro tarball builds with the patch
{code}


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-16 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327346#comment-16327346
 ] 

Attila Sasvari commented on OOZIE-1401:
---

Thanks [~andras.piros]. +1, pending Jenkins (I manually kicked off a 
precommit build job).



[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-16 Thread Andras Piros (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327003#comment-16327003
 ] 

Andras Piros commented on OOZIE-1401:
-

[~asasvari] please review amendment patch 003. Thanks!



[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-09 Thread Andras Piros (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16318138#comment-16318138
 ] 

Andras Piros commented on OOZIE-1401:
-

Thanks for the amendment patch [~asasvari]! I believe the overall direction is 
OK. Can you please also upload it to ReviewBoard?

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch, amend-OOZIE-1401-001.patch, 
> amend-OOZIE-1401-002.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16315110#comment-16315110
 ] 

Hadoop QA commented on OOZIE-1401:
--

Testing JIRA OOZIE-1401

Cleaning local git workspace



{color:red}-1{color} Patch failed to apply to head of branch



> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch, amend-OOZIE-1401-001.patch
>
>


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314418#comment-16314418
 ] 

Hadoop QA commented on OOZIE-1401:
--

Testing JIRA OOZIE-1401

Cleaning local git workspace

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch, amend-OOZIE-1401-001.patch
>
>


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-05 Thread Sergey Svynarchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312794#comment-16312794
 ] 

Sergey Svynarchuk commented on OOZIE-1401:
--

[~asasvari] I'll work on it

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch
>
>


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2018-01-04 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311553#comment-16311553
 ] 

Attila Sasvari commented on OOZIE-1401:
---

[~vaifer] do you have some free capacity to work on and upload the amendment 
patch? If not, I can do that.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-12-29 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306319#comment-16306319
 ] 

Attila Sasvari commented on OOZIE-1401:
---

[~vaifer] thanks for spotting and reporting this issue. You are right that the 
mentioned queries do not return lastModificationTime, and if end_time is null, 
those workflows can't be purged because there is no information about when 
they finished (but at least the purge service does not throw an NPE). I 
believe an amendment patch should be fine.


> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-12-29 Thread Sergey Svinarchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306288#comment-16306288
 ] 

Sergey Svinarchuk commented on OOZIE-1401:
--

This patch doesn't work because lastModificationTime is always NULL. 
lastModificationTime needs to be added to the 
GET_WORKFLOWS_BASIC_INFO_BY_PARENT_ID and 
GET_WORKFLOWS_BASIC_INFO_BY_COORD_PARENT_ID queries. I can create a new patch 
for this ticket or open a new issue.
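
As a rough illustration of why a field stays NULL when a query's projection does not select it, consider the sketch below. Everything in it is hypothetical: `BasicInfoRow`, `ProjectionDemo.toJob`, and the column names are invented for the example, and the real named queries in Oozie's JPA layer may be shaped differently.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;

// Hypothetical bean holding only the "basic info" of a workflow.
final class BasicInfoRow {
    String id;
    Date endTime;
    Date lastModifiedTime;
}

final class ProjectionDemo {
    // Maps one result row onto a bean, using the order of the selected columns.
    // Columns that were not selected are never assigned, so they remain null.
    static BasicInfoRow toJob(List<String> selectedColumns, Object[] row) {
        BasicInfoRow job = new BasicInfoRow();
        for (int i = 0; i < selectedColumns.size(); i++) {
            String column = selectedColumns.get(i);
            switch (column) {
                case "id":
                    job.id = (String) row[i];
                    break;
                case "endTime":
                    job.endTime = (Date) row[i];
                    break;
                case "lastModifiedTime":
                    job.lastModifiedTime = (Date) row[i];
                    break;
                default:
                    throw new IllegalArgumentException("Unknown column: " + column);
            }
        }
        return job;
    }

    public static void main(String[] args) {
        // Projection without lastModifiedTime: the field is null on the bean.
        BasicInfoRow before = toJob(Arrays.asList("id", "endTime"),
                new Object[] {"wf-1", null});
        System.out.println(before.lastModifiedTime == null); // true

        // Projection that also selects lastModifiedTime: the field is populated.
        BasicInfoRow after = toJob(Arrays.asList("id", "endTime", "lastModifiedTime"),
                new Object[] {"wf-1", null, new Date(0L)});
        System.out.println(after.lastModifiedTime != null); // true
    }
}
```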

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-11-06 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240070#comment-16240070
 ] 

Attila Sasvari commented on OOZIE-1401:
---

[~gezapeti] thanks for the review. Committed to master.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-11-06 Thread Attila Sasvari (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240063#comment-16240063
 ] 

Attila Sasvari commented on OOZIE-1401:
---

{{TestCoordMaterializeTransitionXCommand.testMaterizationLookup}} is known to 
be flaky - [OOZIE-2726 |https://issues.apache.org/jira/browse/OOZIE-2726]

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-11-06 Thread Peter Cseh (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240062#comment-16240062
 ] 

Peter Cseh commented on OOZIE-1401:
---

+1 
The test is failing due to OOZIE-2726.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-11-05 Thread Andras Piros (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239431#comment-16239431
 ] 

Andras Piros commented on OOZIE-1401:
-

Thanks [~asasvari] for working on this! Assigned to you.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Attila Sasvari
> Attachments: OOZIE-1401-001.patch





[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-11-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239325#comment-16239325
 ] 

Hadoop QA commented on OOZIE-1401:
--

Testing JIRA OOZIE-1401

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any line longer than 
132
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
.{color:red}WARNING{color}: the current HEAD has 77 Javadoc warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1{color} There are no new bugs found in total.
. {color:green}+1{color} There are no new bugs found in [docs].
. {color:green}+1{color} There are no new bugs found in [examples].
. {color:green}+1{color} There are no new bugs found in [tools].
. {color:green}+1{color} There are no new bugs found in [core].
. {color:green}+1{color} There are no new bugs found in [server].
. {color:green}+1{color} There are no new bugs found in [client].
. {color:green}+1{color} There are no new bugs found in [sharelib/distcp].
. {color:green}+1{color} There are no new bugs found in [sharelib/oozie].
. {color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
. {color:green}+1{color} There are no new bugs found in [sharelib/streaming].
. {color:green}+1{color} There are no new bugs found in [sharelib/hive].
. {color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
. {color:green}+1{color} There are no new bugs found in [sharelib/spark].
. {color:green}+1{color} There are no new bugs found in [sharelib/hive2].
. {color:green}+1{color} There are no new bugs found in [sharelib/pig].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 2049
.Tests failed: 1
.Tests errors: 0

.The patch failed the following testcases:

.  testMaterizationLookup(org.apache.oozie.command.coord.TestCoordMaterializeTransitionXCommand)

.Tests failing with errors:
.  

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}

{color:red}. There is at least one warning, please check{color}

The full output of the test-patch run is available at

. https://builds.apache.org/job/PreCommit-OOZIE-Build/172/

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
> Issue Type: Sub-task
> Components: bundle, coordinator, workflow
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Andras Piros
> Attachments: OOZIE-1401-001.patch

[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-09-29 Thread Andras Piros (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185946#comment-16185946
 ] 

Andras Piros commented on OOZIE-1401:
-

[~jaydeepvishwakarma] taking this issue over, I have to fix that.

> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
>  Issue Type: Sub-task
>  Components: bundle, coordinator, workflow
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Jaydeep Vishwakarma
>
> Currently, Purge logic is not working with those workflow jobs with 
> end_time=null. This command needs to take care of those jobs as well. This 
> happens in the case of long stuck jobs after Hadoop restarts or DB failures. 
> It could be done by checking created_time if end_time is not available.
> The current query:
> select w from WorkflowJobBean w where w.endTimestamp < :endTime



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2017-09-27 Thread Andras Piros (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182527#comment-16182527
 ] 

Andras Piros commented on OOZIE-1401:
-

[~jaydeepvishwakarma] how about this JIRA, do you plan to work on this? If not, 
do you have a PoC code snippet already?



[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2014-04-13 Thread Jaydeep Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967775#comment-13967775
 ] 

Jaydeep Vishwakarma commented on OOZIE-1401:


Yes, I would like to work on PurgeService. Should we create a separate JIRA 
for it?


> PurgeCommand should purge the workflow jobs w/o end_time
> 
>
> Key: OOZIE-1401
> URL: https://issues.apache.org/jira/browse/OOZIE-1401
> Project: Oozie
>  Issue Type: Sub-task
>  Components: bundle, coordinator, workflow
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk
>
>
> Currently, Purge logic is not working with those workflow jobs with 
> end_time=null. This command needs to take care of those jobs as well. This 
> happens in the case of long stuck jobs after Hadoop restarts or DB failures. 
> It could be done by checking created_time if end_time is not available.
> The current query:
> select w from WorkflowJobBean w where w.endTimestamp < :endTime



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2014-04-11 Thread Jaydeep Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966365#comment-13966365
 ] 

Jaydeep Vishwakarma commented on OOZIE-1401:


[~chitnis],
I saw the code snippet for this. It first fetches all workflows eligible for 
deletion and then removes them one by one.
The way the current purge code is written might not cause issues when you have 
a small number of workflows, but with more than a million workflows it will run 
very slowly and put extra load on the DB. I think all eligible workflows should 
be deleted by a single query. 
Although I have a small patch ready for this bug, I still feel we should 
consider other approaches as well. 



[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2014-04-11 Thread Jaydeep Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966371#comment-13966371
 ] 

Jaydeep Vishwakarma commented on OOZIE-1401:


The query could look like this:
{code}
delete from WorkflowJobBean w where w.endTimestamp < :endTime or w.endTimestamp 
is null;
{code}
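
Note that the {{is null}} clause in the query above matches every job without an end time, regardless of its age. The issue description instead suggests falling back to another timestamp when {{end_time}} is missing. The following is a minimal sketch of that fallback rule in plain Java, not Oozie's actual implementation: the class {{PurgeCutoff}} and its methods are hypothetical names, and the choice of fallback timestamp (creation time or last-modified time) is left to the caller.

```java
import java.time.Instant;

public class PurgeCutoff {
    /** Picks the timestamp to compare against the cutoff: end time if set, else the fallback. */
    public static Instant effectiveTime(Instant endTime, Instant fallbackTime) {
        return endTime != null ? endTime : fallbackTime;
    }

    /** True when the job's effective timestamp lies strictly before the purge cutoff. */
    public static boolean olderThanCutoff(Instant endTime, Instant fallbackTime, Instant cutoff) {
        return effectiveTime(endTime, fallbackTime).isBefore(cutoff);
    }

    public static void main(String[] args) {
        // Illustrative dates only; they are not taken from any real job.
        Instant cutoff = Instant.parse("2014-04-01T00:00:00Z");
        Instant old = Instant.parse("2014-01-01T00:00:00Z");
        Instant recent = Instant.parse("2014-04-10T00:00:00Z");

        System.out.println(olderThanCutoff(old, old, cutoff));     // ended long ago: true
        System.out.println(olderThanCutoff(null, old, cutoff));    // no end time, old fallback: true
        System.out.println(olderThanCutoff(null, recent, cutoff)); // no end time, recent fallback: false
    }
}
```

With a rule like this, a job without an end time is only considered once its fallback timestamp is older than the cutoff, rather than unconditionally.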



[jira] [Commented] (OOZIE-1401) PurgeCommand should purge the workflow jobs w/o end_time

2014-04-11 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966598#comment-13966598
 ] 

Rohini Palaniswamy commented on OOZIE-1401:
---

bq. But when you have more than a million work flow it will run very slow and 
create extra load on DB.
   Yes. PurgeService is very inefficient and should be rewritten. Please feel 
free to take up that work if you are interested. At Y! we use custom optimized 
PL/SQL scripts, using Oracle cursors and run as cron jobs, to perform bulk 
deletion and have the PurgeService turned off, so we have not focused on the 
PurgeService. With a lot of 1-minute jobs, you must be facing problems with 
this at Inmobi. You could probably even enhance it to clean up more frequently 
based on the frequency of the coordinator, i.e. something like keeping only 10 
days' worth if the frequency is 1 minute, 15 days' worth if it is 5 minutes, 30 
days' worth if it is 1 day, etc. I remember [~sriksun] talking about such a 
feature.
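
The frequency-based retention idea above could be sketched roughly as follows. This is only an illustration, not an existing Oozie feature or API: the class {{RetentionPolicy}} and the method {{retentionDays}} are hypothetical, and the thresholds between the three example frequencies from the comment (1 minute, 5 minutes, 1 day) are assumptions.

```java
public class RetentionPolicy {
    /**
     * Hypothetical mapping from coordinator frequency (in minutes) to the
     * number of days of history to keep. Only the 1-minute, 5-minute and
     * 1-day points come from the comment; everything in between is assumed.
     */
    public static int retentionDays(long frequencyMinutes) {
        if (frequencyMinutes <= 1) {
            return 10; // very frequent jobs: keep the least history
        }
        if (frequencyMinutes <= 5) {
            return 15;
        }
        return 30;     // assumed default for anything less frequent
    }

    public static void main(String[] args) {
        System.out.println(retentionDays(1));       // 10
        System.out.println(retentionDays(5));       // 15
        System.out.println(retentionDays(24 * 60)); // 30
    }
}
```

A purge run could then derive its cutoff per coordinator from this value instead of using a single global number of days.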
