[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327655#comment-16327655
 ] 

Attila Sasvari commented on OOZIE-1401:
---------------------------------------

https://builds.apache.org/job/PreCommit-OOZIE-Build/323/console
{code:java}
+1 PATCH_APPLIES
+1 CLEAN
-1 RAW_PATCH_ANALYSIS
    +1 the patch does not introduce any @author tags
    +1 the patch does not introduce any tabs
    +1 the patch does not introduce any trailing spaces
    -1 the patch contains 2 line(s) longer than 132 characters
    +1 the patch adds/modifies 1 testcase(s)
+1 RAT
    +1 the patch does not seem to introduce new RAT warnings
+1 JAVADOC
    +1 the patch does not seem to introduce new Javadoc warnings
+1 COMPILE
    +1 HEAD compiles
    +1 patch compiles
    +1 the patch does not seem to introduce new javac warnings
+1 There are no new bugs found in total.
 +1 There are no new bugs found in [docs].
 +1 There are no new bugs found in [sharelib/distcp].
 +1 There are no new bugs found in [sharelib/hive].
 +1 There are no new bugs found in [sharelib/spark].
 +1 There are no new bugs found in [sharelib/hive2].
 +1 There are no new bugs found in [sharelib/hcatalog].
 +1 There are no new bugs found in [sharelib/streaming].
 +1 There are no new bugs found in [sharelib/pig].
 +1 There are no new bugs found in [sharelib/sqoop].
 +1 There are no new bugs found in [sharelib/oozie].
 +1 There are no new bugs found in [examples].
 +1 There are no new bugs found in [client].
 +1 There are no new bugs found in [core].
 +1 There are no new bugs found in [tools].
 +1 There are no new bugs found in [server].
+1 BACKWARDS_COMPATIBILITY
    +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
    +1 the patch does not modify JPA files
+1 TESTS
    Tests run: 2087
    Tests failed at first run:
TestJavaActionExecutor#testCredentialsSkip
    For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
+1 DISTRO
    +1 distro tarball builds with the patch
{code}

> PurgeCommand should purge the workflow jobs w/o end_time
> --------------------------------------------------------
>
>                 Key: OOZIE-1401
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1401
>             Project: Oozie
>          Issue Type: Sub-task
>          Components: bundle, coordinator, workflow
>    Affects Versions: trunk
>            Reporter: Mona Chitnis
>            Assignee: Attila Sasvari
>            Priority: Major
>             Fix For: 5.0.0b1
>
>         Attachments: OOZIE-1401-001.patch, OOZIE-1401.amend.003.patch, 
> amend-OOZIE-1401-001.patch, amend-OOZIE-1401-002.patch
>
>
> Currently, {{PurgeXCommand}} does not purge workflow jobs whose 
> {{end_time}} is {{null}}; it needs to handle those jobs as well. This 
> happens with jobs that stay stuck for a long time after Hadoop restarts or DB 
> failures. It could be done by checking {{last_modified_time}} instead when 
> {{end_time}} is not available.
> The current query:
> {code:sql}
> select w from WorkflowJobBean w where w.endTimestamp < :endTime
> {code}
> There is also an issue when a parent workflow:
> * has its {{end_time}} set
> * is otherwise eligible for {{PurgeXCommand}}: its {{end_time}} is older than 
> the configured number of days, and its {{status}} is {{KILLED}}, 
> {{FAILED}}, or {{SUCCEEDED}}
> * has a child workflow whose {{parent_id}} is set to the {{id}} of the 
> parent workflow
> * and that child workflow has {{end_time = NULL}}
> In this case, 
> [*{{PurgeXCommand#fetchTerminatedWorkflow()}}*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/PurgeXCommand.java#L249]
>  throws a {{NullPointerException}} like this:
> {noformat}
> 2017-09-29 07:59:46,365 DEBUG org.apache.oozie.command.PurgeXCommand: 
> SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] 
> JOB[-] ACTION[-] Purging workflows of long running coordinators is turned on
> 2017-09-29 07:59:46,371 DEBUG org.apache.oozie.command.PurgeXCommand: 
> SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] 
> JOB[-] ACTION[-] Execute command [purge] key [null]
> 2017-09-29 07:59:46,371 INFO org.apache.oozie.command.PurgeXCommand: 
> SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] 
> JOB[-] ACTION[-] STARTED Purge to purge Workflow Jobs older than [1] days, 
> Coordinator Jobs older than [1] days, and Bundlejobs older than [1] days.
> 2017-09-29 07:59:46,375 ERROR org.apache.oozie.command.PurgeXCommand: 
> SERVER[host-10-17-101-90.coe.cloudera.com] USER[-] GROUP[-] TOKEN[-] APP[-] 
> JOB[-] ACTION[-] Exception, 
> java.lang.NullPointerException
>       at 
> org.apache.oozie.command.PurgeXCommand.fetchTerminatedWorkflow(PurgeXCommand.java:249)
>       at 
> org.apache.oozie.command.PurgeXCommand.processWorkflowsHelper(PurgeXCommand.java:227)
>       at 
> org.apache.oozie.command.PurgeXCommand.processWorkflows(PurgeXCommand.java:199)
>       at 
> org.apache.oozie.command.PurgeXCommand.execute(PurgeXCommand.java:150)
>       at org.apache.oozie.command.PurgeXCommand.execute(PurgeXCommand.java:53)
>       at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
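
For illustration only, the fallback proposed in the description (use {{last_modified_time}} when {{end_time}} is not available) could be sketched roughly as below. This is not the attached patch and does not use Oozie's classes: {{JobTimes}}, {{PurgeEligibilitySketch}}, {{effectiveTime}} and {{olderThan}} are hypothetical names, and the sketch assumes {{last_modified_time}} is always set. It covers only the age check, not the status check.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for a workflow job row; NOT Oozie's WorkflowJobBean.
final class JobTimes {
    final Instant endTime;          // may be null for long-stuck jobs
    final Instant lastModifiedTime; // assumed (for this sketch) to be always set

    JobTimes(Instant endTime, Instant lastModifiedTime) {
        this.endTime = endTime;
        this.lastModifiedTime = lastModifiedTime;
    }
}

final class PurgeEligibilitySketch {
    // Use end_time when present, otherwise fall back to last_modified_time.
    static Instant effectiveTime(JobTimes job) {
        return job.endTime != null ? job.endTime : job.lastModifiedTime;
    }

    // True when the job's effective time is strictly before (now - olderThanDays).
    static boolean olderThan(JobTimes job, Instant now, int olderThanDays) {
        Instant cutoff = now.minus(Duration.ofDays(olderThanDays));
        return effectiveTime(job).isBefore(cutoff);
    }
}
```

Applying the same null-safe comparison to child workflows would avoid dereferencing a missing {{end_time}}, which is the situation the stack trace above describes.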



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
