Steven Hancz created SENTRY-1965:
------------------------------------
Summary: Sentry server database Destroy connection
exception (Galera cluster)
Key: SENTRY-1965
URL: https://issues.apache.org/jira/browse/SENTRY-1965
Project: Sentry
Issue Type: Bug
Components: Sentry
Affects Versions: 1.5.1
Environment: Sentry 1.5.1
CDH 5.9.1
MySQL 5.6.35
Galera 25.18
Reporter: Steven Hancz
We have implemented an HA solution for the Sentry server database.
Instead of a single MySQL server, we use a Galera cluster that is accessed
through a DNS load-balanced VIP. If one MySQL server stops working, the VIP
detects the failure and sends DB requests to the surviving node. A similar
setup is working for the Hive metastore.
However, we noticed that Sentry, like Spark, uses the BoneCP connection pool
to connect to the database. Some hard-coded configuration options in
bonecp-default-config.xml are causing issues with Sentry:
idleConnectionTestPeriodInMinutes default 240 minutes
idleMaxAgeInMinutes default 60 minutes
With these defaults, BoneCP tests each idle connection every 240 minutes,
but closes an idle connection after 60 minutes (the second parameter). The
connection test therefore never takes place, because the connection is closed
long before the first test is due.
However, in an HA configuration with a VIP, you can set the connection timeout
and how often to test for target availability.
We had the exact same problem with Hive. There, the workaround was to include a
second configuration file for BoneCP called bonecp-config.xml, which was added
to the Hive server jar. The second config file (bonecp-config.xml) contains:
idleConnectionTestPeriodInMinutes 1
idleMaxAgeInMinutes 5
With these settings, every connection is tested every minute and an idle
connection is closed after 5 minutes. Since connections are tested every
minute, they are kept alive.
So the question is: how do we enable a similar setting for Sentry?
With the default BoneCP configuration and a Galera cluster in the back end,
Sentry returns the following error:
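The reasoning about the two settings can be checked with a small sketch. This is a simplified timing model, not BoneCP's actual scheduling code: the class IdleTimingSketch and the method testsBeforeClose are hypothetical names, and the model assumes that tests fire at multiples of the test period and only count if they happen strictly before the idle age limit is reached.

```java
// Simplified timing model of the two settings discussed above.
// NOT BoneCP's real logic; names and rules here are illustrative assumptions.
class IdleTimingSketch {

    /**
     * Counts the keep-alive tests an idle connection would receive before it
     * is closed for exceeding its maximum idle age.
     *
     * Assumption: tests fire at testPeriod, 2 * testPeriod, ... minutes of
     * idle time, and a test only happens if it is strictly before maxAge.
     */
    static long testsBeforeClose(long testPeriodMinutes, long maxAgeMinutes) {
        if (testPeriodMinutes <= 0 || maxAgeMinutes <= 0) {
            throw new IllegalArgumentException("both values must be positive");
        }
        // Number of multiples of testPeriod in the range [1, maxAge - 1].
        return (maxAgeMinutes - 1) / testPeriodMinutes;
    }

    public static void main(String[] args) {
        // Values from bonecp-default-config.xml as quoted in this report.
        System.out.println("defaults (240/60): " + testsBeforeClose(240, 60));
        // Values from the bonecp-config.xml override used for Hive.
        System.out.println("override (1/5):    " + testsBeforeClose(1, 5));
    }
}
```

Under this model the defaults give 0 tests before the connection is closed, while the override gives 4, which matches the behaviour described above.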
Sep 26, 7:30:39.900 AM ERROR com.jolbox.bonecp.ConnectionTesterThread
Destroy connection exception
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
at sun.reflect.GeneratedConstructorAccessor41.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
at com.mysql.jdbc.Util.getInstance(Util.java:387)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:917)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:896)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:885)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4634)
at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4263)
at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1519)
at com.jolbox.bonecp.ConnectionHandle.internalClose(ConnectionHandle.java:396)
at com.jolbox.bonecp.ConnectionTesterThread.closeConnection(ConnectionTesterThread.java:155)
at com.jolbox.bonecp.ConnectionTesterThread.run(ConnectionTesterThread.java:95)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
From our research, it appears that the Sentry server uses the BoneCP class in
more than one location. Changing the parameters in BoneCP for Sentry alone
does not appear to be sufficient: the trace file shows that the parameters are
not changed and the timeouts are still the default BoneCP values. Where else
do we have to change the BoneCP config?
Regards,
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)