Re: [Sequoia] Strange result with recovery

Gérard BUNEL Thu, 01 Feb 2007 02:41:39 -0800

Guillaume Smet wrote:

On 2/1/07, Gérard BUNEL <[EMAIL PROTECTED]> wrote:

Maybe I'm completely stupid, or maybe just really tired.
The test I tried was not correct.
My objective was to test a backend failure, but when I stopped the MySQL services I also put the RecoveryLog in a failed state (yes, both are on the same machine, as I don't have sufficient CPU resources).

:)

But this points to a feature I'm looking for: alert generation on backend failure and on controller failure.

Mmmh, IMHO, it's a real problem that you are able to enable the
backend with a failed recovery log. There's something to be done here.
A crashed recovery log should be detected and should cause a controller
failure, and you shouldn't be able to enable the failed backend without
at least restoring the recovery log.

Do you have any warning in your log file to tell you that the recovery
log is not accessible?

No. In fact I have many exceptions (each corresponding to a client request) with their corresponding stack traces.
The restore then completed without any warning or error, as shown in the log excerpt below:
java.net.SocketException
MESSAGE: java.net.ConnectException: Connection refused

STACKTRACE:

java.net.SocketException: java.net.ConnectException: Connection refused
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:156)
    at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:276)
    at com.mysql.jdbc.Connection.createNewIO(Connection.java:2666)
    at com.mysql.jdbc.Connection.<init>(Connection.java:1531)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:266)
    at org.continuent.sequoia.controller.connection.DriverManager.getConnectionForDriver(DriverManager.java:266)
    at org.continuent.sequoia.controller.connection.DriverManager.getConnection(DriverManager.java:168)
    at org.continuent.sequoia.controller.recoverylog.RecoveryLog.getDatabaseConnection(RecoveryLog.java:318)
    at org.continuent.sequoia.controller.recoverylog.LoggerThread.getUpdatePreparedStatement(LoggerThread.java:133)
    at org.continuent.sequoia.controller.recoverylog.events.LogRequestCompletionEvent.execute(LogRequestCompletionEvent.java:82)
    at org.continuent.sequoia.controller.recoverylog.LoggerThread.run(LoggerThread.java:732)

** END NESTED EXCEPTION **

Last packet sent to the server was 1 ms ago.)
    at org.continuent.sequoia.controller.recoverylog.RecoveryLog.getDatabaseConnection(RecoveryLog.java:336)
    at org.continuent.sequoia.controller.recoverylog.LoggerThread.getUpdatePreparedStatement(LoggerThread.java:133)
    at org.continuent.sequoia.controller.recoverylog.events.LogRequestCompletionEvent.execute(LogRequestCompletionEvent.java:82)
    at org.continuent.sequoia.controller.recoverylog.LoggerThread.run(LoggerThread.java:732)
2007-01-31 16:17:05,086 INFO backup.backupers.NativeCommandExec Command "mysql -h fr-tig.ago.fr --port=3306 -uTEST --password=TEST MATISSE" logged 0 errors and terminated with exitcode 0
2007-01-31 16:17:05,090 INFO controller.RequestManager.MATISSEDB Recovery of backend fr-tig done.

Emmanuel, is it the expected behavior?

How can this be done?
I've seen in log4j.properties that it should be possible to register an Appender for FATAL-level messages in order to send alarms, but is there a list somewhere of the possible FATAL errors from Sequoia? Is a backend or controller failure considered FATAL?

I don't think they are.

What I've seen is that a backend failure triggers an ERROR log:

10:36:06,127 ERROR controller.loadbalancer.RAIDb1 Disabling backend mendu because connections failed inside a transaction.

I have not yet tried with a controller failure.

I agree with you that there should be some
easy way to monitor Sequoia's activity. It's possible to write JMX
services to check that everything is OK in your cluster and call them
from Nagios/your monitoring utility via SNMP.
It could be a cool contribution to Sequoia :).

I may contribute this if I have no other alternative, as I do need to know about any problem in the cluster and to alert the responsible persons.
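
Until a JMX-based monitor exists, one stopgap is to watch the controller log for the ERROR line quoted earlier in this message. The following is only a minimal sketch: `BackendFailureScanner` and `disabledBackend` are hypothetical names, not part of Sequoia, and the pattern is modelled on that single quoted line, so other versions or load balancers may word the message differently.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Sequoia): spots "Disabling backend" ERROR lines. */
class BackendFailureScanner {

    // Assumption: the message always reads "ERROR <logger> Disabling backend <name> ...",
    // as in the one example line from this thread.
    private static final Pattern DISABLED_BACKEND = Pattern.compile(
            "\\bERROR\\s+(\\S+)\\s+Disabling backend (\\S+)");

    /** Returns the backend name if the line reports a disabled backend, else empty. */
    static Optional<String> disabledBackend(String logLine) {
        Matcher m = DISABLED_BACKEND.matcher(logLine);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }
}
```

A wrapper script or small daemon could feed each new log line to `disabledBackend` and raise an alarm whenever a name comes back. This only covers the backend case seen above; it says nothing about controller or recovery log failures.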

You can also find an example of a JMX service in
doc/examples/jmx/SequoiaNotificationListener.java. It could be cool to
develop a more generic service dedicated to monitor a cluster just by
giving it the controller configuration file.

I can't find this class in the 2.10.4 distribution. Maybe it is in 3.x? I need to rely on a stable version. Is this code compatible with version 2.10.4?
Is it possible to register a NotificationListener automatically at Sequoia controller startup, or do I need to create a new process to implement this?
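
For reference, the listener side of such a service needs nothing beyond the standard `javax.management` API. The sketch below is an illustration only: `AlertListener`, `JmxAlertDemo` and the notification type `demo.backend.disabled` are made up, and a local `NotificationBroadcasterSupport` stands in for the controller, since the MBean names and notification types that Sequoia actually exposes are not given in this thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;

/** Collects notification messages; a real listener would send mail or page someone here. */
class AlertListener implements NotificationListener {
    final List<String> alerts = new CopyOnWriteArrayList<>();

    @Override
    public void handleNotification(Notification notification, Object handback) {
        alerts.add(notification.getType() + ": " + notification.getMessage());
    }
}

class JmxAlertDemo {
    /** Attaches the listener to a stand-in broadcaster and emits one made-up event. */
    static List<String> run() {
        // Stand-in for the controller; not a Sequoia class.
        NotificationBroadcasterSupport source = new NotificationBroadcasterSupport();
        AlertListener listener = new AlertListener();
        source.addNotificationListener(listener, null, null); // no filter, no handback

        // Hypothetical notification type and message, for illustration only.
        source.sendNotification(new Notification(
                "demo.backend.disabled", source, 1L, "Backend mendu disabled"));
        return listener.alerts;
    }
}
```

Against a running controller, the same listener would presumably be attached from a separate process through `JMXConnectorFactory.connect(...)` and `MBeanServerConnection.addNotificationListener(...)`. Whether 2.10.4 can load a listener by itself at startup is not established here.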

--
Guillaume Smet
Open Wide

_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia

Gérard BUNEL
Chef de Projet
____________________________________________________________________

Technopôle Brest Iroise
Site du Vernis – CS 23866
29238 Brest Cedex 3
Tél : + 33 2 98 05 43 21
Fax : + 33 2 98 05 20 34
www.altran.com
