Hi,
We are evaluating Sequoia as an HA solution. However, we encounter the
following problems after running a few hundred thousand transactions:
Setup:
Two controllers on two servers, each connected to one PostgreSQL database
backend. Clients use "ordered" load balancing to connect to controller 1,
then controller 2.
Problems:
1. After tens of thousands of transactions, the Appia group is disconnected.
2007-12-19 13:59:54,523 INFO continuent.hedera.gms Member(address=/128.128.3.30:43994, uid=128.128.3.30:43994) failed in Group(gid=s2)
2. After problem 1, controller 1 encountered a write error due to a unique
key constraint violation. Controller 1 was then disabled.
2007-12-19 14:00:32,586 ERROR controller.loadbalancer.RAIDb1 write request 844424930142550 failed:
Backend s2 - BackendWorkerThread for backend 'middleware1' with RAIDb level:1 failed (ERROR: duplicate key violates unique constraint "pk_sale_order")
3. After problem 2, controller 1 is automatically disabled ("show backend
*" shows it as disabled). We expected all clients to connect to controller 2
from this point on. However, clients can still connect to controller 1 and
only get an error when they execute a query:
Error at client:
org.continuent.sequoia.common.exceptions.driver.DriverSQLException: Message of cause: null, SQL State: null, Error Code: 1
org.continuent.sequoia.common.exceptions.driver.DriverSQLException: Message of cause: null
    at org.continuent.sequoia.driver.Connection.statementExecuteQuery(Connection.java:2840)
    at org.continuent.sequoia.driver.Statement.executeQuery(Statement.java:522)
    at org.continuent.sequoia.driver.Statement.executeQuery(Statement.java:495)
    at
    ...
Caused by: org.continuent.sequoia.common.exceptions.driver.protocol.ControllerCoreException
SerializableStackTrace of each cause:
org.continuent.sequoia.common.exceptions.driver.protocol.ControllerCoreException
    at org.continuent.sequoia.controller.requestmanager.distributed.RAIDb1DistributedRequestManager.execRemoteStatementExecuteQuery(RAIDb1DistributedRequestManager.java:170)
    at org.continuent.sequoia.controller.requestmanager.distributed.DistributedRequestManager.statementExecuteQuery(DistributedRequestManager.java:1370)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabase.statementExecuteQuery(VirtualDatabase.java:549)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabaseWorkerThread.statementExecuteQuery(VirtualDatabaseWorkerThread.java:2175)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabaseWorkerThread.run(VirtualDatabaseWorkerThread.java:442)
Error at controller:
2007-12-19 14:33:13,167 WARN controller.RequestManager.sunbeam-s2 An error occured while executing remote select request 281474977312374
org.continuent.sequoia.common.exceptions.NoMoreBackendException
    at org.continuent.sequoia.controller.requestmanager.distributed.RAIDb1DistributedRequestManager.execRemoteStatementExecuteQuery(RAIDb1DistributedRequestManager.java:170)
    at org.continuent.sequoia.controller.requestmanager.distributed.DistributedRequestManager.statementExecuteQuery(DistributedRequestManager.java:1370)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabase.statementExecuteQuery(VirtualDatabase.java:549)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabaseWorkerThread.statementExecuteQuery(VirtualDatabaseWorkerThread.java:2175)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabaseWorkerThread.run(VirtualDatabaseWorkerThread.java:442)
2007-12-19 14:33:13,169 WARN controller.RequestManager.sunbeam-s2 An error occured while executing remote select request 281474977312375
org.continuent.sequoia.common.exceptions.NoMoreBackendException
    at org.continuent.sequoia.controller.requestmanager.distributed.RAIDb1DistributedRequestManager.execRemoteStatementExecuteQuery(RAIDb1DistributedRequestManager.java:170)
    at org.continuent.sequoia.controller.requestmanager.distributed.DistributedRequestManager.statementExecuteQuery(DistributedRequestManager.java:1370)
    at org.continuent.sequoia.controller.virtualdatabase.VirtualDatabase.statementExecuteQuery(VirtualDatabase.java:549)
Problem 1 does not seem to be related to the protocol, as we tried various
Appia protocols. Do you have any suggestions to help us investigate the issue?
Problem 3 is critical, as none of our clients can execute queries even though
controller 2 is alive and running.
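
To make the expectation in problem 3 concrete, here is a minimal Python
sketch of the failover behaviour we expected from "ordered" load balancing:
when a query fails on the first controller, the client moves on to the next
one. This is not Sequoia code and does not use the Sequoia driver;
FakeConnection, QueryError and execute_with_ordered_failover are hypothetical
names made up purely for illustration.

```python
class QueryError(Exception):
    """Raised by a hypothetical connection when a query cannot be executed."""


class FakeConnection:
    """Hypothetical stand-in for a driver connection to one controller."""

    def __init__(self, name, has_enabled_backend):
        self.name = name
        self.has_enabled_backend = has_enabled_backend

    def execute_query(self, sql):
        # Mirrors what we observe: connecting succeeds, but the query fails
        # when the controller has no enabled backend left.
        if not self.has_enabled_backend:
            raise QueryError(f"{self.name}: no more backend")
        return f"{self.name} executed: {sql}"


def execute_with_ordered_failover(connections, sql):
    """Try each controller in the given order; move on when a query fails."""
    errors = []
    for conn in connections:
        try:
            return conn.execute_query(sql)
        except QueryError as exc:
            errors.append(str(exc))
    raise QueryError("all controllers failed: " + "; ".join(errors))


# Controller 1 has lost its backend, controller 2 is healthy.
controller1 = FakeConnection("controller1", has_enabled_backend=False)
controller2 = FakeConnection("controller2", has_enabled_backend=True)
result = execute_with_ordered_failover([controller1, controller2], "SELECT 1")
print(result)  # controller2 executed: SELECT 1
```

In this sketch the query ends up on controller 2. In our setup the client
instead stays on controller 1 and every query returns the error shown above.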
Thanks and Regards
Francis