Hi,
The Oozie documentation at
https://oozie.apache.org/docs/5.1.0/WorkflowFunctionalSpec.html#a18_User-Retry_for_Workflow_Actions_since_Oozie_3.1
describes the maximum action retries as "User-Retry allows user to give
certain number of reties (must not exceed system max retries)". I assume
the system max retries is the property "oozie.action.retries.max", with a
default value of 3, defined in the document
https://oozie.apache.org/docs/5.1.0/WorkflowFunctionalSpec.html#a18_User-Retry_for_Workflow_Actions_since_Oozie_3.1
I changed that value to 12 on AWS EMR 5.28.1, as shown below:
[hadoop@ip-10-51-51-37 ~]$ oozie admin -version
Oozie server build version:
{"build.version":"5.1.0","vc.url":"https:\/\/git-wip-us.apache.org\/repos\/asf\/oozie.git","vc.revision":"branch-5.1@352b76eb","build.time":"2019.12.14-10:37:29GMT","build.user":"ec2-user"}
[hadoop@ip-10-51-51-37 ~]$ oozie admin -configuration | grep retries
oozie.action.retries.max : 12
oozie.action.ssh.check.retries.max : 3
oozie.service.CallbackService.early.requeue.max.retries : 5
oozie.service.JPAService.retry.max-retries : 10
oozie.zookeeper.max.retries : 10
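For reference, the override behind the "oozie.action.retries.max : 12" line above would look roughly like this in oozie-site.xml. This is a sketch that assumes the usual Hadoop-style property format; only the property name and the value 12 come from the output above.

```xml
<!-- Sketch of the oozie-site.xml override; assumes the standard property format -->
<property>
    <name>oozie.action.retries.max</name>
    <value>12</value>
</property>
```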
We then tested action retries with <action name="SparkAction"
retry-max="10" retry-interval="1">. The action was still retried ONLY
3 times, at 1-minute intervals, and then the Oozie workflow went to the failure
step.
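To make the test setup concrete, the action sits in a workflow shaped roughly like the sketch below. Only the action name and the two retry attributes are taken from our test. The app name, schema versions, Spark settings, jar path, and the kill/end node names are placeholders for illustration.

```xml
<!-- Sketch only: everything except name, retry-max and retry-interval is a placeholder -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="retry-test-wf">
    <start to="SparkAction"/>

    <!-- retry-max: requested number of user retries; retry-interval: minutes between retries -->
    <action name="SparkAction" retry-max="10" retry-interval="1">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <name>placeholder-spark-job</name>
            <class>com.example.PlaceholderJob</class>
            <jar>${nameNode}/path/to/placeholder.jar</jar>
        </spark>
        <ok to="end"/>
        <!-- the "failure step" the workflow reaches once retries are exhausted -->
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>SparkAction failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```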
Any idea why? In one of our business cases, we want to retry more than 3 times
at the interval we define, and only then go to the failure step.
Thanks