Hi,

I use oozie 4.0.0 in HA mode. The problem is there is one time that we
restart mysql, it just takes about 30 seconds, then I found both oozie
server log has exception like:
2014-11-12 06:39:34,603  WARN JPAService:542 - USER[-] GROUP[-] TOKEN[-]
APP[-] JOB[-] ACTION[-] JPAExecutor [WorkflowActionsGetPendingJPAExecutor]
ended with an active transaction, rolling back
2014-11-12 06:39:34,604  WARN RecoveryService$RecoveryRunnable:542 -
USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Exception while reading
pending actions from storage
org.apache.oozie.executor.jpa.JPAExecutorException: E0603: SQL error in
operation, <openjpa-2.2.2-r422266:1468616 fatal general error>
org.apache.openjpa.persistence.PersistenceException: Communications link
failure

The last packet successfully received from the server was 14,288
milliseconds ago.  The last packet sent successfully to the server was 0
milliseconds ago.
        at org.apache.oozie.service.JPAService.execute(JPAService.java:218)
        at
org.apache.oozie.service.RecoveryService$RecoveryRunnable.runWFRecovery(RecoveryService.java:326)
        at
org.apache.oozie.service.RecoveryService$RecoveryRunnable.run(RecoveryService.java:134)
        at
org.apache.oozie.service.SchedulerService$2.run(SchedulerService.java:174)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: <openjpa-2.2.2-r422266:1468616 fatal general error>
org.apache.openjpa.persistence.PersistenceException: Communications link
failure

The last packet successfully received from the server was 14,288
milliseconds ago.  The last packet sent successfully to the server was 0
milliseconds ago.
        at
org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4962)
        at
org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4922)
        at
org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:136)
        at
org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:110)
        at
org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:62)
        at
org.apache.openjpa.jdbc.kernel.JDBCStoreManager.connect(JDBCStoreManager.java:968)
        at
org.apache.openjpa.jdbc.kernel.JDBCStoreManager.getConnection(JDBCStoreManager.java:242)
        at
org.apache.openjpa.jdbc.kernel.PreparedSQLStoreQuery$PreparedSQLExecutor.executeQuery(PreparedSQLStoreQuery.java:104)
        at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:1005)
        at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:863)
        at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:794)
        at
org.apache.openjpa.kernel.DelegatingQuery.execute(DelegatingQuery.java:542)
        at
org.apache.openjpa.persistence.QueryImpl.execute(QueryImpl.java:286)
        at
org.apache.openjpa.persistence.QueryImpl.getResultList(QueryImpl.java:302)
        at
org.apache.oozie.executor.jpa.WorkflowActionsGetPendingJPAExecutor.execute(WorkflowActionsGetPendingJPAExecutor.java:60)
        at
org.apache.oozie.executor.jpa.WorkflowActionsGetPendingJPAExecutor.execute(WorkflowActionsGetPendingJPAExecutor.java:30)
        at org.apache.oozie.service.JPAService.execute(JPAService.java:207)
        ... 12 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
Communications link failure

The last packet successfully received from the server was 14,288
milliseconds ago.  The last packet sent successfully to the server was 0
milliseconds ago.
        at sun.reflect.GeneratedConstructorAccessor87.newInstance(Unknown
Source)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
        at
com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1122)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3056)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2942)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3485)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1960)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2114)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2690)
        at
com.mysql.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:5093)
        at
org.apache.commons.dbcp.DelegatingConnection.setAutoCommit(DelegatingConnection.java:371)
        at
org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.setAutoCommit(PoolingDataSource.java:328)
        at
org.apache.openjpa.lib.jdbc.DelegatingConnection.setAutoCommit(DelegatingConnection.java:167)
        at
org.apache.openjpa.lib.jdbc.DelegatingConnection.setAutoCommit(DelegatingConnection.java:167)
        at
org.apache.openjpa.lib.jdbc.ConfiguringConnectionDecorator$ConfiguringConnection.setAutoCommit(ConfiguringConnectionDecorator.java:117)
        at
org.apache.openjpa.lib.jdbc.ConfiguringConnectionDecorator$ConfiguringConnection.<init>(ConfiguringConnectionDecorator.java:111)
        at
org.apache.openjpa.lib.jdbc.ConfiguringConnectionDecorator.decorate(ConfiguringConnectionDecorator.java:93)
        at
org.apache.openjpa.lib.jdbc.DecoratingDataSource.decorate(DecoratingDataSource.java:99)
        at
org.apache.openjpa.lib.jdbc.DecoratingDataSource.getConnection(DecoratingDataSource.java:88)
        at
org.apache.openjpa.jdbc.kernel.JDBCStoreManager.connectInternal(JDBCStoreManager.java:979)
        at
org.apache.openjpa.jdbc.kernel.JDBCStoreManager.connect(JDBCStoreManager.java:964)
        ... 23 more
Caused by: java.io.EOFException: Can not read response from server.
Expected to read 4 bytes, read 0 bytes before connection was unexpectedly
lost.
        at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2503)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2953)
        ... 40 more
2014-11-12 06:39:41,294  INFO
StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-] Acquired
lock for [org.apache.oozie.service.StatusTransitService]
2014-11-12 06:39:41,295  INFO
StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-] Running
coordinator status service from last instance time =  2014-11-12T06:38Z
2014-11-12 06:39:50,756  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:39:56,744  INFO PauseTransitService:539 - USER[-] GROUP[-]
Acquired lock for [org.apache.oozie.service.PauseTransitService]
2014-11-12 06:40:20,757  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:40:50,758  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:41:20,759  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:41:50,759  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:42:20,760  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:42:50,761  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:43:20,762  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:43:50,762  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:44:20,763  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:44:50,764  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:45:20,765  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:45:50,766  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:46:20,766  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:46:50,767  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:47:20,768  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:47:50,769  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:48:20,769  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:48:50,770  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:49:20,771  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:49:50,772  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:50:20,772  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:50:21,455  WARN ParameterVerifier:542 - USER[hdfs] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal
parameters in its XML definition
2014-11-12 06:50:21,557  WARN LiteWorkflowAppService:542 - USER[hdfs]
GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath
[hdfs://cluster1/user/nagios/oozie/wfDistinct/lib] does not exist
2014-11-12 06:50:50,773  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:51:20,775  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:51:50,776  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:52:20,776  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:52:50,777  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:53:20,778  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:53:50,778  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:54:20,779  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:54:50,780  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:55:20,780  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:55:50,781  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:56:20,782  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:56:50,782  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:57:20,783  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:57:50,784  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:58:20,785  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:58:50,786  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:59:20,787  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check
2014-11-12 06:59:50,788  INFO SLACalculatorMemory:539 - USER[-] GROUP[-]
TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA check

Seems like the log just continue to print "INFO SLACalculatorMemory:539 -
USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Running periodic SLA
check" and nothing more.
At this time both oozie servers won't accept any job submission and web
console has no response too. We restart oozie server then it's back to
normal.
Any idea of what may happen? Any suggestion is welcome.

thanks

Reply via email to