Hello, I’ve been running two Oozie ssh-action workflows, each started by its own Java program. The two workflows have different names, but their steps are identically named. They use different proxy users, run programs in different directories, and access different data. The Bash scripts that start each workflow are scheduled with the “at” command to start simultaneously on a Hadoop cluster. Most of the time both jobs succeed, but often one of them fails (not always the same one); at least one job always runs successfully.
When a failure occurs, it takes one of two forms. Sometimes Oozie tracks the failed job attempt and its status stays stuck in PREP. Other times, Oozie's logging shows no trace of the attempted job, even though the Java program calling Oozie returns the same error as for jobs that get stuck in PREP: “[ERROR] [main] StartOozie - java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs”. The following stack trace is from a job that was tracked by Oozie and stuck in PREP. Oozie is currently using Derby.

Oozie Version = 4.2.0
Hadoop Version = 2.7.3
Derby Version = 10.10.1.1

Job Log:
ERROR StartXCommand:517 - SERVER[XXXX] USER[XXXX] GROUP[-] TOKEN[] APP[ssh-wf-XXX] JOB[0000001-180318120929200-oozie-oozi-W] ACTION[] XException,
org.apache.oozie.command.CommandException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:448)
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:82)
    at org.apache.oozie.command.XCommand.call(XCommand.java:287)
    at org.apache.oozie.DagEngine.start(DagEngine.java:202)
    at org.apache.oozie.DagEngine.submitJob(DagEngine.java:116)
    at org.apache.oozie.servlet.V1JobsServlet.submitWorkflowJob(V1JobsServlet.java:192)
    at org.apache.oozie.servlet.V1JobsServlet.submitJob(V1JobsServlet.java:92)
    at org.apache.oozie.servlet.BaseJobsServlet.doPost(BaseJobsServlet.java:102)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:304)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:617)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:576)
    at org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.OozieXFrameOptionsFilter.doFilter(OozieXFrameOptionsFilter.java:48)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.OozieCSRFFilter.doFilter(OozieCSRFFilter.java:62)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:234)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:610)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:503)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.oozie.executor.jpa.JPAExecutorException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.service.JPAService.executeBatchInsertUpdateDelete(JPAService.java:439)
    at org.apache.oozie.executor.jpa.BatchQueryExecutor.executeBatchInsertUpdateDelete(BatchQueryExecutor.java:132)
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:439)
    ... 37 more
Caused by: <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.persistence.EntityManagerImpl.commit(EntityManagerImpl.java:594)
    at org.apache.oozie.service.JPAService.executeBatchInsertUpdateDelete(JPAService.java:435)
    ... 39 more
Caused by: <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.EntityNotFoundException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.kernel.BrokerImpl.newFlushException(BrokerImpl.java:2347)
    at org.apache.openjpa.kernel.BrokerImpl.flush(BrokerImpl.java:2184)
    at org.apache.openjpa.kernel.BrokerImpl.flushSafe(BrokerImpl.java:2082)
    at org.apache.openjpa.kernel.BrokerImpl.beforeCompletion(BrokerImpl.java:2000)
    at org.apache.openjpa.kernel.LocalManagedRuntime.commit(LocalManagedRuntime.java:81)
    at org.apache.openjpa.kernel.BrokerImpl.commit(BrokerImpl.java:1524)
    at org.apache.openjpa.kernel.DelegatingBroker.commit(DelegatingBroker.java:933)
    at org.apache.openjpa.persistence.EntityManagerImpl.commit(EntityManagerImpl.java:570)
    ... 40 more
Caused by: <openjpa-2.2.2-r422266:1468616 nonfatal store error> org.apache.openjpa.persistence.EntityNotFoundException: The instance of type "class org.apache.oozie.WorkflowActionBean" with oid "0000001-180318120929200-oozie-oozi-W@:start:" no longer exists in the data store. This may mean that you deleted the instance in a separate transaction, but this context still has a cached version.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.kernel.StateManagerImpl.loadFields(StateManagerImpl.java:3109)
    at org.apache.openjpa.kernel.StateManagerImpl.loadField(StateManagerImpl.java:3185)
    at org.apache.openjpa.kernel.StateManagerImpl.fetchStringField(StateManagerImpl.java:2474)
    at org.apache.openjpa.kernel.StateManagerImpl.fetchString(StateManagerImpl.java:2464)
    at org.apache.openjpa.jdbc.meta.strats.StringFieldStrategy.insert(StringFieldStrategy.java:105)
    at org.apache.openjpa.jdbc.meta.FieldMapping.insert(FieldMapping.java:623)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.insert(AbstractUpdateManager.java:239)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.populateRowManager(AbstractUpdateManager.java:166)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.flush(AbstractUpdateManager.java:97)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.flush(AbstractUpdateManager.java:78)
    at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.flush(JDBCStoreManager.java:732)
    at org.apache.openjpa.kernel.DelegatingStoreManager.flush(DelegatingStoreManager.java:131)
    ... 47 more
2018-03-18 12:20:02,465 WARN V1JobsServlet:523 - SERVER[XXXX] USER[XXX] GROUP[-] TOKEN[] APP[ssh-wf-XXX] JOB[0000001-180318120929200-oozie-oozi-W] ACTION[] URL[POST http://XXX.XXX.XXX.XXX:11000/oozie/v2/jobs?action=start] error[E0603], E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
org.apache.oozie.servlet.XServletException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.servlet.V1JobsServlet.submitWorkflowJob(V1JobsServlet.java:197)
    at org.apache.oozie.servlet.V1JobsServlet.submitJob(V1JobsServlet.java:92)
    at org.apache.oozie.servlet.BaseJobsServlet.doPost(BaseJobsServlet.java:102)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:304)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:617)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:576)
    at org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.OozieXFrameOptionsFilter.doFilter(OozieXFrameOptionsFilter.java:48)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.oozie.servlet.OozieCSRFFilter.doFilter(OozieCSRFFilter.java:62)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:234)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:610)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:503)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.oozie.DagEngineException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.DagEngine.start(DagEngine.java:205)
    at org.apache.oozie.DagEngine.submitJob(DagEngine.java:116)
    at org.apache.oozie.servlet.V1JobsServlet.submitWorkflowJob(V1JobsServlet.java:192)
    ... 32 more
Caused by: org.apache.oozie.command.CommandException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:448)
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:82)
    at org.apache.oozie.command.XCommand.call(XCommand.java:287)
    at org.apache.oozie.DagEngine.start(DagEngine.java:202)
    ... 34 more
Caused by: org.apache.oozie.executor.jpa.JPAExecutorException: E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.service.JPAService.executeBatchInsertUpdateDelete(JPAService.java:439)
    at org.apache.oozie.executor.jpa.BatchQueryExecutor.executeBatchInsertUpdateDelete(BatchQueryExecutor.java:132)
    at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:439)
    ... 37 more
Caused by: <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.persistence.EntityManagerImpl.commit(EntityManagerImpl.java:594)
    at org.apache.oozie.service.JPAService.executeBatchInsertUpdateDelete(JPAService.java:435)
    ... 39 more
Caused by: <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.EntityNotFoundException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.kernel.BrokerImpl.newFlushException(BrokerImpl.java:2347)
    at org.apache.openjpa.kernel.BrokerImpl.flush(BrokerImpl.java:2184)
    at org.apache.openjpa.kernel.BrokerImpl.flushSafe(BrokerImpl.java:2082)
    at org.apache.openjpa.kernel.BrokerImpl.beforeCompletion(BrokerImpl.java:2000)
    at org.apache.openjpa.kernel.LocalManagedRuntime.commit(LocalManagedRuntime.java:81)
    at org.apache.openjpa.kernel.BrokerImpl.commit(BrokerImpl.java:1524)
    at org.apache.openjpa.kernel.DelegatingBroker.commit(DelegatingBroker.java:933)
    at org.apache.openjpa.persistence.EntityManagerImpl.commit(EntityManagerImpl.java:570)
    ... 40 more
Caused by: <openjpa-2.2.2-r422266:1468616 nonfatal store error> org.apache.openjpa.persistence.EntityNotFoundException: The instance of type "class org.apache.oozie.WorkflowActionBean" with oid "0000001-180318120929200-oozie-oozi-W@:start:" no longer exists in the data store. This may mean that you deleted the instance in a separate transaction, but this context still has a cached version.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.openjpa.kernel.StateManagerImpl.loadFields(StateManagerImpl.java:3109)
    at org.apache.openjpa.kernel.StateManagerImpl.loadField(StateManagerImpl.java:3185)
    at org.apache.openjpa.kernel.StateManagerImpl.fetchStringField(StateManagerImpl.java:2474)
    at org.apache.openjpa.kernel.StateManagerImpl.fetchString(StateManagerImpl.java:2464)
    at org.apache.openjpa.jdbc.meta.strats.StringFieldStrategy.insert(StringFieldStrategy.java:105)
    at org.apache.openjpa.jdbc.meta.FieldMapping.insert(FieldMapping.java:623)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.insert(AbstractUpdateManager.java:239)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.populateRowManager(AbstractUpdateManager.java:166)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.flush(AbstractUpdateManager.java:97)
    at org.apache.openjpa.jdbc.kernel.AbstractUpdateManager.flush(AbstractUpdateManager.java:78)
    at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.flush(JDBCStoreManager.java:732)
    at org.apache.openjpa.kernel.DelegatingStoreManager.flush(DelegatingStoreManager.java:131)
    ... 47 more

Output Log:
ERROR security.UserGroupInformation: PriviledgedActionException as:XXX via XXX cause:E0603 : E0603 : SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject : org.apache.oozie.WorkflowActionBean@66ff6081
java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1203)
    at com.dsi.calloozie.StartOozie.main(StartOozie.java:54)
Caused by: java.security.PrivilegedActionException: E0603 : E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    ... 1 more
Caused by: E0603 : E0603: SQL error in operation, <openjpa-2.2.2-r422266:1468616 fatal store error> org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back. See the nested exceptions for details on the errors that occurred.
FailedObject: org.apache.oozie.WorkflowActionBean@66ff6081
    at org.apache.oozie.client.OozieClient.handleError(OozieClient.java:542)
    at org.apache.oozie.client.OozieClient$JobSubmit.call(OozieClient.java:625)
    at org.apache.oozie.client.OozieClient$JobSubmit.call(OozieClient.java:595)
    at org.apache.oozie.client.OozieClient$ClientCallable.call(OozieClient.java:514)
    at org.apache.oozie.client.OozieClient.run(OozieClient.java:756)
    at com.dsi.calloozie.StartOozie$1.run(StartOozie.java:57)
    at com.dsi.calloozie.StartOozie$1.run(StartOozie.java:54)
    ... 4 more
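
Since the failures only appear when the two submissions start at the same moment, one possible stopgap (not a fix for the underlying E0603 error) might be to retry a failed submission after a short pause. Below is a minimal Java sketch under that assumption. It is not the actual StartOozie code: the class `RetryingSubmit`, its method names, and the choice to detect the failure by looking for the `E0603` code in the exception messages are all hypothetical, and whether a retry would actually succeed here is untested.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Oozie or of the StartOozie program.
final class RetryingSubmit {

    // Returns true if any exception in the cause chain mentions the given code.
    static boolean mentionsErrorCode(Throwable t, String code) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg != null && msg.contains(code)) {
                return true;
            }
        }
        return false;
    }

    // Calls submit up to maxAttempts times, pausing between attempts.
    // Only failures that mention E0603 are retried; anything else is rethrown at once.
    static <T> T submitWithRetry(Callable<T> submit, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return submit.call();
            } catch (Exception e) {
                if (!mentionsErrorCode(e, "E0603")) {
                    throw e;
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}
```

In a program like StartOozie, the `Callable` would wrap the existing `OozieClient.run(...)` call that is made inside `doAs`; the attempt count and pause length are arbitrary choices. An attempt that Oozie did track may still be left in PREP, so a retry could leave such an entry behind.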