[ https://issues.apache.org/jira/browse/OOZIE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rentao Wu updated OOZIE-3624:
-----------------------------
    Description:
When the YARN cluster used by an Oozie scheduled workflow is replaced with a new cluster, future runs of the scheduled workflow break because they depend on the workflow and job.properties files that were deployed on HDFS.

The YARN jobtracker also stops working, due to:

{noformat}
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1622844783178_0004_000002 not found in AMRMTokenSecretManager.
{noformat}

It seems that some tokens are stored in YARN, so when the YARN cluster is terminated and replaced with a new one, the Oozie launcher hits this error. The same invalid-token error also occurs when I configure Oozie to use a remote YARN cluster.

Recreating the YARN cluster is common in the cloud. Is there a way for Oozie to be resilient to the underlying YARN cluster changing?

Also, is it supported for workflow, coordinator, and job.properties files to be deployed on S3 instead of HDFS?

  was:
When the YARN cluster used by an Oozie scheduled workflow is replaced with a new cluster, future runs of the scheduled workflow break because they depend on the workflow and job.properties files that were deployed on HDFS.

The YARN jobtracker also stops working, due to:

{noformat}
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1622844783178_0004_000002 not found in AMRMTokenSecretManager.
{noformat}

It seems that some tokens are stored in YARN, so when the YARN cluster is terminated and replaced with a new one, the Oozie launcher hits this error.

Recreating the YARN cluster is common in the cloud. Is there a way for Oozie to be resilient to the underlying YARN cluster being ephemeral?
Is it supported for workflow, coordinator, and job.properties files to be deployed on S3 instead of HDFS?

> Oozie scheduled workflows fail when yarn/hdfs cluster changes
> -------------------------------------------------------------
>
>                 Key: OOZIE-3624
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3624
>             Project: Oozie
>          Issue Type: Improvement
>          Components: coordinator, workflow
>    Affects Versions: 5.2.0
>            Reporter: Rentao Wu
>            Priority: Major
>
> When the YARN cluster used by an Oozie scheduled workflow is replaced with a
> new cluster, future runs of the scheduled workflow break because they depend
> on the workflow and job.properties files that were deployed on HDFS.
>
> The YARN jobtracker also stops working, due to:
>
> {noformat}
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> appattempt_1622844783178_0004_000002 not found in AMRMTokenSecretManager.
> {noformat}
>
> It seems that some tokens are stored in YARN, so when the YARN cluster is
> terminated and replaced with a new one, the Oozie launcher hits this error.
> The same invalid-token error also occurs when I configure Oozie to use a
> remote YARN cluster.
>
> Recreating the YARN cluster is common in the cloud. Is there a way for Oozie
> to be resilient to the underlying YARN cluster changing?
>
> Also, is it supported for workflow, coordinator, and job.properties files to
> be deployed on S3 instead of HDFS?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)