Hi,

Noel invited me to join the discussion, and I hope nobody objects (or, 
better yet, someone finds something useful) if I cross-post here 
(updated with more current info) regarding clustered H2.  We are also 
trying to use it, and most of what I have seen looks pretty good.

Our requirement is for a relatively small DB with a low rate of 
transactions.  I'd like to have a db clone on each of 3 nodes.  Clients 
would be directed to one node via a virtual IP; the cluster would apply 
all operations received at that IP to the two copies.  If either *copy* 
(not the 'master' at the virtual IP) fails, the cluster continues to 
operate normally; after the node is recovered, CreateCluster is re-run as 
usual, albeit with 3 nodes.  If the master (the node at the virtual IP) 
fails, our code selects a new master from one of the 2 copies.  That node 
takes over the virtual IP and the cluster continues operating normally.  
We detect when the failed node returns and issue a "CreateCluster" to 
reincorporate that node.
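For illustration, here is a minimal sketch of the failover rule just 
described - if the master fails, promote one of the surviving copies.  
The class and method names are invented for this example; none of this is 
H2 code or API:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of our failover rule: when the node holding the
// virtual IP fails, one of the surviving copies is promoted to master
// and takes over the virtual IP. Invented names, not H2 API.
public class Failover {

    /** Return the new master: the first surviving node, if any. */
    public static Optional<String> electMaster(String failedMaster,
                                               List<String> nodes) {
        return nodes.stream()
                    .filter(n -> !n.equals(failedMaster))
                    .findFirst();
    }
}
```

In practice the promoted node would also claim the virtual IP, and the 
recovered node would rejoin later via CreateCluster as described above.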

As far as I can see this imposes only a single requirement on H2 beyond 
the usual: support for 3 nodes in a CreateCluster -serverList.  
I've begun exploring this a bit - I have a working 3-node cluster (nodes 
.55, .53, and .51).  I run the server on .53 and .51, e.g.
java -cp /usr/lib/java/*:. org.h2.tools.Server -tcp -tcpPort 9101 \
    -tcpAllowOthers -baseDir <base> &
Then on my 'first' server (.55) I successfully run CreateCluster (twice) 
to create db copies on both nodes .53 and .51:
java -cp /usr/lib/java/*:. org.h2.tools.CreateCluster \
    -urlSource jdbc:h2:tcp://172.16.35.55:9101/msp \
    -urlTarget jdbc:h2:tcp://172.16.35.51:9101/msp \
    -serverList 172.16.35.55:9101,172.16.35.51:9101,172.16.35.53:9101
The H2 logs show an update to the CLUSTER server list that includes all 
three servers.
/*SQL t:1*/SET CLUSTER '172.16.35.55:9101,172.16.35.51:9101,
172.16.35.53:9101';
Code in SessionRemote loops through the server list, applying any SQL 
operation to each server.  I also see this interesting comment at line 397: 
      // TODO cluster: support more than 2 connections
What would need to be done?
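To make that concrete, here is a minimal sketch of the broadcast pattern 
as I understand it: split the comma-separated CLUSTER server list, apply 
each write to every node, and treat a failure on any one node as grounds 
for removing it from the list.  This is my own simplified model, not 
H2's actual SessionRemote code, and the names are invented:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the broadcast loop in org.h2.engine.SessionRemote:
// every write goes to every node in the CLUSTER server list; a node that
// fails is dropped from the cluster. Invented names, not H2 API.
public class ClusterBroadcast {
    private final List<String> servers;

    public ClusterBroadcast(String serverList) {
        // CLUSTER is stored as a comma-separated host:port list.
        this.servers = new ArrayList<>(Arrays.asList(serverList.split(",")));
    }

    /** Apply one write operation to every node; drop nodes that fail. */
    public List<String> apply(Consumer<String> operation) {
        List<String> failed = new ArrayList<>();
        for (String server : servers) {
            try {
                operation.accept(server);
            } catch (RuntimeException e) {
                failed.add(server);   // node removed from the cluster
            }
        }
        servers.removeAll(failed);
        return failed;
    }

    public List<String> servers() {
        return servers;
    }
}
```

Nothing in this loop is inherently limited to 2 servers, which is why I'm 
curious what else the TODO implies.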

I've:
- run a number of SQL operations from multiple connections and all 3 server 
DB copies are updated as expected.
- failed an H2 server and/or node and successfully issued operations on the 
remaining cluster.
- reincorporated a failed node using "CreateCluster"
- successfully issued operations against the recovered cluster.
- obtained expected behaviors when attempting to insert a duplicate key and 
to update or delete a missing record - basic stuff.
- obtained quite unexpected behavior by issuing 'ifconfig eth0 down' on a 
node, which halts the entire database until the adapter returns.  Ouch.

I also attempted to create a SQL operation failure on a single node, 
hoping to receive an exception and see logs indicating a rollback.  I 
found that I could connect with a URL identifying a single clustered node 
(more below), remove a record on only that node, and then attempt to 
update that record through the cluster.  My takeaway so far is that I am 
able to break consistency, and that the database does not reliably detect 
and roll back an operation on a record that is missing from one node.  
I've not explored this very thoroughly, and I'm unclear on what behavior 
to expect with the current H2 - especially on an unsupported 3-node 
configuration.
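One way a cluster layer could in principle catch this kind of divergence: 
if the same UPDATE or DELETE reports different row counts on different 
nodes (1 row on two nodes, 0 on the node where the record was removed), 
the copies have diverged and the operation should be rolled back 
everywhere.  A minimal sketch of that check - my own illustration, not 
anything H2 currently does:

```java
import java.util.List;

// Hypothetical divergence check: compare the per-node update counts
// returned for one statement. If they differ, the node copies are
// inconsistent and the statement should be rolled back on all nodes.
// Not H2 code.
public class ConsistencyCheck {

    /** True if every node reported the same update count. */
    public static boolean consistent(List<Integer> updateCounts) {
        return updateCounts.stream().distinct().count() <= 1;
    }
}
```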

While looking at logs I noted that under some circumstances - stop and 
restart one H2 server, then connect and disconnect using a normal multihost 
URL - the value of CLUSTER is set empty:    
10-02 16:15:02 jdbc[14]: /*SQL */SET AUTO_RECONNECT TRUE;
10-02 16:15:02 jdbc[14]: /*SQL */SET CLUSTER TRUE;
10-02 16:15:02 jdbc[14]: /*SQL */SET TRACE_LEVEL_SYSTEM_OUT 3;
10-02 16:15:02 jdbc[14]: /*SQL */SET DB_CLOSE_DELAY -1;
10-02 16:15:04 index: SYS_DATA remove ( /* key:119 */ 58, 0, 6, 'SET 
CLUSTER ''172.16.35.55:9101,172.16.35.53:9101,172.16.35.51:9101''')
10-02 16:15:04 pageStore: log - s: 1 table: 0 row: ( /* key:119 */ 58, 0, 
6, 'SET CLUSTER ''172.16.35.55:9101,172.16.35.53:9101,172.16.35.51:9101''')
10-02 16:15:04 index: SYS_DATA add ( /* key:120 */ 58, 0, 6, 'SET CLUSTER 
''''')
10-02 16:15:04 pageStore: log + s: 1 table: 0 row: ( /* key:120 */ 58, 0, 
6, 'SET CLUSTER ''''')
Though this is a cluster, at this point you can connect and issue 
operations against a JDBC URL identifying a single node in the cluster.  
The documented check, SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE 
NAME='CLUSTER';, returns an empty '' from a successful connect until the 
first write.  Not a surprise, perhaps, but not the expected or documented 
behavior.  Following any write operation the expected list of servers is 
restored.
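A client that wants to guard against that window could run the documented 
SELECT and refuse to proceed until the list is non-empty.  A small sketch 
of how I'd interpret the returned value (the JDBC plumbing to execute the 
SELECT is omitted; this helper and its names are my own invention, not 
H2 API):

```java
// Hypothetical helper interpreting the value returned by the documented
//   SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE NAME='CLUSTER';
// An empty string is what the log above shows between reconnect and the
// first write; a non-empty comma-separated list means the node knows its
// cluster peers. Not part of H2's API.
public class ClusterSetting {

    /** True only when the node reports a non-empty cluster server list. */
    public static boolean knowsPeers(String clusterValue) {
        return clusterValue != null && !clusterValue.isEmpty();
    }

    /** Number of servers in the reported list (0 for the empty value). */
    public static int peerCount(String clusterValue) {
        return knowsPeers(clusterValue) ? clusterValue.split(",").length : 0;
    }
}
```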

If I understand the H2 clustering model, its approach is to detect a 
failed server (or node), explicitly remove it from the cluster, and 
continue operating as a "healthy" cluster of one fewer node.  Apart from 
this, operations are always expected to complete on all nodes.  That's 
pretty blunt, but if it is "good enough" for a particular application, 
then it is.  
We really like H2 and I'm guessing that there is a pretty substantial 
market for a lightweight solution with good enough behavior.  If we can 
converge on what is good enough between a few of us, maybe we can make this 
work.  Thoughts? 

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/groups/opt_out.