Forgot to mention the software versions we are testing with:

Oozie: 3.3.2-cdh4.6.0
Hadoop: 2.0.0-cdh4.6.0

On Tue, May 13, 2014 at 1:46 PM, Jialong Wu <[email protected]> wrote:
> Hi all,
>
> We are observing some strange behavior in Oozie when running workflows
> under YARN. Jobs are launched properly from Oozie, but after about 20
> minutes the workflow goes into the SUSPENDED state, with the running
> action in the START_MANUAL state. The only error message I can find is
> in the Action Info dialog box of the Oozie UI:
>
> Status: START_MANUAL
> Error Code: JA009
> Error Message: JA009: Unknown rpc kind RPC_WRITABLE
>
> We ran into this error when we were configuring Oozie to work with YARN,
> and the cause was that Oozie was using the old clients to talk to the
> YARN RM. That was fixed by setting the correct CATALINA_BASE in
> oozie-env.sh. We suspect that Oozie is somehow still using the old client
> to check the status of a running job, but we couldn't figure out which
> configuration is causing this.
>
> Some additional information: the workflow only gets suspended when it
> runs longer than a certain time limit, which we observe to be about 20
> minutes. Any workflow that completes within that limit doesn't have this
> issue.
>
> Has anyone run into this issue before? Any pointers on how to debug it
> are very much appreciated!
>
> Cheers,
> Jialong
