An immediate failover seems rather drastic to me. However, I have no anecdotal 
evidence to back up this feeling, nor to justify any other default config 
changes. Maybe Joshua can share what they are using so that we can adjust the 
default values accordingly?

Other thoughts:
• Have you tried this magic trick? 
https://issues.apache.org/jira/browse/AURORA-1211?focusedCommentId=14483533&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14483533
• A counter for failed scheduler backups sounds like a good idea. We could also 
mention this in our monitoring documentation.

Best regards,
Stephan

On 01/10/16 02:34, "Zameer Manji" <zma...@apache.org> wrote:

    Aurora Developers and Users,
    
    I would like to share a failure case I experienced recently. In a modestly
    sized production cluster with high read load, snapshot creation began to
    fail. The logs showed the following:
    
    ````
    W0923 00:23:55.528 [LogStorage-0, LogStorage:473]
    ### Error rolling back transaction.  Cause: java.sql.SQLException: Error accessing PooledConnection. Connection is invalid.
    ### Cause: java.sql.SQLException: Error accessing PooledConnection. Connection is invalid.
    org.apache.ibatis.exceptions.PersistenceException:
    ### Error rolling back transaction.  Cause: java.sql.SQLException: Error accessing PooledConnection. Connection is invalid.
    ### Cause: java.sql.SQLException: Error accessing PooledConnection. Connection is invalid.
        at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
        at org.apache.ibatis.session.defaults.DefaultSqlSession.rollback(DefaultSqlSession.java:216)
        at org.apache.ibatis.session.SqlSessionManager.rollback(SqlSessionManager.java:299)
        at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:116)
        at org.apache.aurora.scheduler.storage.db.DbStorage.lambda$write$0(DbStorage.java:175)
        at org.apache.aurora.scheduler.async.GatingDelayExecutor.closeDuring(GatingDelayExecutor.java:62)
        at org.apache.aurora.scheduler.storage.db.DbStorage.write(DbStorage.java:173)
        at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
        at org.apache.aurora.scheduler.storage.log.LogStorage.doInTransaction(LogStorage.java:521)
        at org.apache.aurora.scheduler.storage.log.LogStorage.write(LogStorage.java:551)
        at org.apache.aurora.scheduler.storage.log.LogStorage.doSnapshot(LogStorage.java:489)
        at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
        at org.apache.aurora.scheduler.storage.log.LogStorage.snapshot(LogStorage.java:565)
        at org.apache.aurora.scheduler.storage.log.LogStorage.lambda$scheduleSnapshots$20(LogStorage.java:468)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.sql.SQLException: Error accessing PooledConnection. Connection is invalid.
        at org.apache.ibatis.datasource.pooled.PooledConnection.checkConnection(PooledConnection.java:254)
        at org.apache.ibatis.datasource.pooled.PooledConnection.invoke(PooledConnection.java:243)
        at com.sun.proxy.$Proxy135.getAutoCommit(Unknown Source)
        at org.apache.ibatis.transaction.jdbc.JdbcTransaction.rollback(JdbcTransaction.java:79)
        at org.apache.ibatis.executor.BaseExecutor.rollback(BaseExecutor.java:249)
        at org.apache.ibatis.executor.CachingExecutor.rollback(CachingExecutor.java:119)
        at org.apache.ibatis.session.defaults.DefaultSqlSession.rollback(DefaultSqlSession.java:213)
        ... 19 common frames omitted
    ````
    
    This failure is silent and can be observed only through the
    `scheduler_log_snapshots` metric: if it is not increasing, snapshots are
    not being created. In this cluster, a snapshot was not taken for about 4
    days.
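
    The silent part is the nasty bit. Until there is a dedicated alert, a
    check along these lines can watch the counter from the outside. This is
    only a sketch: it assumes the scheduler's plain-text /vars endpoint on the
    default HTTP port 8081, and the 70 minute wait assumes a roughly hourly
    snapshot interval; adjust both for your setup.

    ````
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class SnapshotCounterCheck {
      // Reads one counter from the scheduler's plain-text /vars page, where
      // each line looks like "<metric_name> <value>".
      static long readCounter(String schedulerUrl, String name) throws Exception {
        try (BufferedReader in = new BufferedReader(
            new InputStreamReader(new URL(schedulerUrl + "/vars").openStream()))) {
          String line;
          while ((line = in.readLine()) != null) {
            if (line.startsWith(name + " ")) {
              return Long.parseLong(line.substring(name.length() + 1).trim());
            }
          }
        }
        throw new IllegalStateException("metric not found: " + name);
      }

      public static void main(String[] args) throws Exception {
        // Scheduler URL is an assumption; pass your own as the first argument.
        String scheduler = args.length > 0 ? args[0] : "http://localhost:8081";
        long before = readCounter(scheduler, "scheduler_log_snapshots");
        Thread.sleep(70 * 60 * 1000L);  // wait a bit longer than the snapshot interval
        long after = readCounter(scheduler, "scheduler_log_snapshots");
        if (after <= before) {
          System.err.println("WARNING: scheduler_log_snapshots did not increase; "
              + "snapshots may be failing silently.");
        }
      }
    }
    ````
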
    For those unfamiliar with Aurora's replicated log storage system, snapshot
    creation is important because it compacts the replicated log down to a
    single large entry. This matters because log recovery time is proportional
    to the number of entries in the log. Operators can observe how long log
    recovery takes at startup via the `scheduler_log_recover_nanos_total`
    metric.
    
    The largest value observed for `scheduler_log_recover_nanos_total` during
    this period was 8 minutes, meaning recovery from a failover would take at
    least 8 minutes. For reference, a system aiming for 99.99% uptime can only
    sustain about 4 minutes of downtime a month (0.01% of a 30-day month is
    roughly 4.3 minutes).
    
    The root cause can be seen in the exception at the bottom of the stack
    trace above:
    `Caused by: java.sql.SQLException: Error accessing PooledConnection.
    Connection is invalid.`
    This originates from the MyBatis connection pool used to communicate with
    the in-memory SQL store. To create a snapshot, we run a `SCRIPT` query to
    dump the entire database into the replicated log [1].
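
    For illustration only, here is roughly what that dump looks like at the
    JDBC level against a throwaway in-memory H2 database. This is a standalone
    sketch, not the linked Aurora code:

    ````
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class ScriptDumpSketch {
      public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {
          stmt.execute("CREATE TABLE tasks (id INT PRIMARY KEY, name VARCHAR(64))");
          stmt.execute("INSERT INTO tasks VALUES (1, 'hello_world')");

          // H2's SCRIPT statement returns the entire schema and data as rows
          // of SQL text. On a large production store this single statement can
          // run for a long time, holding its pooled connection the whole while.
          try (ResultSet rs = stmt.executeQuery("SCRIPT")) {
            while (rs.next()) {
              System.out.println(rs.getString(1));
            }
          }
        }
      }
    }
    ````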
    
    This exception is thrown because we use a connection pool to communicate
    with the H2 SQL database. By default the connection pool has the following
    properties (they correspond to MyBatis's poolMaximumActiveConnections and
    poolMaximumCheckoutTime settings):
    * A maximum of 10 active connections
    * A maximum checkout time of 20s before a connection may be evicted
    
    Under high read load, there can be many SQL queries waiting for a
    connection. If a single connection is held for more than 20s while others
    are waiting, it will likely be evicted. In this case the `SCRIPT` query was
    taking more than 20s, and the backlog of pending queries caused MyBatis to
    evict the connection running it, which made snapshot creation fail. The
    sketch below reproduces this in isolation.
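
    A minimal standalone sketch that reproduces the eviction, assuming MyBatis
    and the H2 driver on the classpath. It shrinks the pool to one connection
    and the checkout limit to one second so the reclaim happens immediately;
    the cluster hit the same path with the defaults of 10 connections and 20s:

    ````
    import java.sql.Connection;
    import org.apache.ibatis.datasource.pooled.PooledDataSource;

    public class OverdueCheckoutSketch {
      public static void main(String[] args) throws Exception {
        PooledDataSource ds =
            new PooledDataSource("org.h2.Driver", "jdbc:h2:mem:demo", "sa", "");
        ds.setPoolMaximumActiveConnections(1);  // exaggerated; default is 10
        ds.setPoolMaximumCheckoutTime(1_000);   // exaggerated; default is 20s

        Connection slow = ds.getConnection();   // stands in for the SCRIPT dump
        Thread.sleep(1_500);                    // the "query" overruns the checkout limit

        // Another caller arrives while the pool is exhausted. MyBatis reclaims
        // the overdue connection for it and invalidates the first handle.
        Connection other = ds.getConnection();

        // The long-running caller now fails with the same error seen above:
        // "Error accessing PooledConnection. Connection is invalid."
        slow.createStatement().executeQuery("SELECT 1");
      }
    }
    ````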
    
    To fix this issue, operators used the `-db_max_active_connection_count`
    flag to increase the maximum number of active connections for MyBatis to
    100. Once the scheduler was able to serve requests, operators used
    `aurora_admin scheduler_snapshot` to force a snapshot. A scheduler failover
    was then induced, and recovery time dropped to about 40 seconds.
    
    Today this cluster continues running with this flag and value to ensure it
    can continue to serve a high read load.
    
    I would like to raise three questions:
    * Should we add a flag to tune the maximum connection time for MyBatis?
    * Should a snapshot creation failure be fatal?
    * Should we change the default maximum connection time and maximum number
    of active connections?
    
    [1]:
    https://github.com/apache/aurora/blob/rel/0.16.0/src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java#L107-L127
    
    --
    Zameer Manji
    
