[ 
https://issues.apache.org/jira/browse/OOZIE-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465136#comment-13465136
 ] 

Virag Kothari commented on OOZIE-994:
-------------------------------------

I think that when the javadoc talks about multiple versions of Hadoop, it means 
the registered exception class may not be on the classpath for some versions of 
Hadoop.
For example, in JavaActionExecutor there is this registerError call:
{code}
registerError(org.apache.hadoop.hdfs.protocol.QuotaExceededException.class.getName(),
              ActionExecutorException.ErrorType.NON_TRANSIENT, "JA004");
{code}
This class may not exist in some versions of Hadoop, and that case needs to be 
handled. That is what the javadoc is pointing to.
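
To illustrate the point, here is a minimal sketch of how a registry could tolerate a class that is missing from the classpath. It is not Oozie's actual code: ErrorRegistry, ErrorInfo and the boolean return value are hypothetical names made up for this example, and it assumes the exception is registered by its class name as a string so that nothing fails at load time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an error registry; NOT Oozie's actual implementation.
class ErrorRegistry {
    enum ErrorType { TRANSIENT, NON_TRANSIENT }

    // Simple holder for the error type and error code of one exception class.
    static final class ErrorInfo {
        final ErrorType type;
        final String code;

        ErrorInfo(ErrorType type, String code) {
            this.type = type;
            this.code = code;
        }
    }

    private final Map<Class<?>, ErrorInfo> errorInfos = new LinkedHashMap<>();

    // Registers the class if it can be loaded; returns false (and registers
    // nothing) when the class is not on the classpath.
    boolean registerError(String className, ErrorType type, String code) {
        try {
            // 'false' means: load the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, ErrorRegistry.class.getClassLoader());
            errorInfos.put(cls, new ErrorInfo(type, code));
            return true;
        } catch (ClassNotFoundException e) {
            // The class does not exist in this environment, so skip it.
            return false;
        }
    }

    int size() {
        return errorInfos.size();
    }
}
```

With something like this, registering "org.apache.hadoop.hdfs.protocol.QuotaExceededException" by name would simply return false on a classpath that does not contain that class, instead of breaking the executor.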

I believe the mapping of the className to the errorInfo is unique, since it 
doesn't make sense to have something like
{code}
registerError(ABC.class.getName(),
              ActionExecutorException.ErrorType.NON_TRANSIENT, "JA004");
registerError(ABC.class.getName(),
              ActionExecutorException.ErrorType.TRANSIENT, "JA007");
{code}
Even if there are multiple implementations of the same class, the class names 
would be different. So there is no need to use a Set of error infos.

I would prefer that we go back to the earlier approach, which was working 
before, even though it requires exceptions to be registered in the correct order.
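
As a rough sketch of why registration order can matter with a single class-to-code mapping: the example below keeps entries in insertion order and returns the first registered class that matches the exception or one of its causes. OrderedErrorLookup and its methods are hypothetical, and the "first match wins" rule is an assumption made for illustration, not necessarily what the earlier Oozie code does. The cause-chain walk reflects the suggestion in the issue description to inspect the cause exception as well.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup helper; NOT Oozie's actual implementation.
class OrderedErrorLookup {
    // LinkedHashMap iterates in insertion order; one entry per class, so a
    // second registration of the same class overwrites the earlier code.
    private final Map<Class<?>, String> codes = new LinkedHashMap<>();

    void register(Class<? extends Throwable> cls, String code) {
        codes.put(cls, code);
    }

    // Walks the exception and its causes; for each one, returns the code of the
    // first registered class it is an instance of. Returns null if none match.
    String lookup(Throwable ex) {
        for (Throwable t = ex; t != null; t = t.getCause()) {
            for (Map.Entry<Class<?>, String> entry : codes.entrySet()) {
                if (entry.getKey().isInstance(t)) {
                    return entry.getValue();
                }
            }
        }
        return null;
    }
}
```

Under this assumption, a specific class such as FileNotFoundException has to be registered before its superclass IOException; otherwise the IOException entry matches first and the more specific code is never returned.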




                
> ActionCheckXCommand does not handle failures properly
> -----------------------------------------------------
>
>                 Key: OOZIE-994
>                 URL: https://issues.apache.org/jira/browse/OOZIE-994
>             Project: Oozie
>          Issue Type: Bug
>          Components: workflow
>    Affects Versions: 3.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Robert Kanter
>            Priority: Critical
>             Fix For: trunk
>
>         Attachments: OOZIE-994.patch, OOZIE-994.patch, OOZIE-994.patch, 
> OOZIE-994.patch, OOZIE-994.patch
>
>
> If the JT restarts or dies and running jobs are lost or the JT is not 
> reachable, Oozie ActionCheckXCommand will never fail the workflow job.
> There seem to be 2 issues here:
> * convertException is not receiving the root cause exception anymore, but 
> always a HadoopAccessorException wrapping the root cause exception. We should 
> modify convertException to inspect the cause exception as well.
> * ActionCheckXCommand does not perform the retry handling logic of 
> ActionStartXCommand.

