Hello guys!
I would like to start a discussion about a new resource agent for
galera/pacemaker.

Main features:
* Supports cluster bootstrap
* Supports rebooting any node in the cluster
* Supports rebooting the whole cluster
* To determine which node has the latest DB version, we use the galera GTID 
(Global Transaction ID)
* The node with the latest GTID becomes the galera PC (primary component) in 
case of re-election
* The administrator can manually set a node as PC

GTID:
* get the GTID from mysqld --wsrep-recover or from the SQL query "SHOW STATUS 
LIKE 'wsrep_local_state_uuid'" (this returns the cluster-id part; the 
commit-id is reported by wsrep_last_committed)
* store the GTID as a crm_attribute for the node (crm_attribute --node 
$HOSTNAME --lifetime $LIFETIME --name gtid --update $GTID)
* update the GTID for the given node on every monitor/stop/start action
* the GTID can have 3 formats:
 - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:123 - standard cluster-id:commit-id
 - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:-1 - non-initialized cluster, as in 
the standard 00000000-0000-0000-0000-000000000000:-1
 - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:INF - commit-id manually set to INF, 
which forces the RA to create a new cluster with the master on the given node
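
To make the three formats concrete, here is a minimal Python sketch of how 
they could be parsed. It is only an illustration, not code from the RA; the 
names Gtid and parse_gtid are hypothetical, and mapping INF to a float 
infinity is my own choice so that it compares higher than any commit-id.

```python
import re
from typing import NamedTuple

# cluster-id is a UUID; commit-id is a number, -1, or the literal INF
UUID_RE = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
GTID_RE = re.compile(rf"^({UUID_RE}):(-1|\d+|INF)$")


class Gtid(NamedTuple):  # hypothetical helper type
    uuid: str
    seqno: float  # -1 = non-initialized, inf = forced by the administrator


def parse_gtid(text: str) -> Gtid:
    """Split 'cluster-id:commit-id' into its two parts."""
    match = GTID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a GTID: {text!r}")
    uuid, raw = match.groups()
    # Assumption: INF is represented as infinity so it always sorts highest.
    seqno = float("inf") if raw == "INF" else int(raw)
    return Gtid(uuid.lower(), seqno)
```

With this, parse_gtid("00000000-0000-0000-0000-000000000000:-1").seqno is -1, 
and a string that matches none of the three formats raises ValueError.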

Check if re-election of the PC is needed:
* (the node is located in a partition with quorum OR only 1 node is 
configured in the cluster) AND the galera resource is not running on any node
* the GTID is manually set to INF on the given node
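
As a sketch, the check above can be written as a single boolean expression. 
This assumes the two bullets are alternatives (either one triggers a 
re-election), which is my reading of the list; the function and parameter 
names are hypothetical, and how the RA obtains each input is left out.

```python
def reelection_needed(
    has_quorum: bool,
    configured_nodes: int,
    galera_running_anywhere: bool,
    local_gtid_is_inf: bool,
) -> bool:
    """Return True if a new PC should be elected."""
    # First bullet: quorum (or a single-node cluster) and galera down everywhere.
    cluster_idle = (has_quorum or configured_nodes == 1) and not galera_running_anywhere
    # Second bullet: the administrator forced INF on this node.
    # Assumption: the two bullets are combined with OR.
    return cluster_idle or local_gtid_is_inf
```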

Check if the given node is the PC:
* it has the highest GTID in the cluster; in case more than one node has the 
"highest" GTID, we use CRC32 to choose the proper PC
* its GTID is manually set to INF
* in case the node with the highest GTID does not come back after a cluster 
reboot (for example because of a disk failure), the administrator should set 
the GTID to INF on another node
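
The selection can be sketched in Python as follows. Two things here are 
assumptions, not taken from the RA: the CRC32 is computed over the node name, 
and the node with the larger CRC32 wins the tie. The sketch also assumes all 
nodes report the same cluster-id, so only the commit-ids are compared (with 
INF passed in as float("inf")). The name choose_pc is hypothetical.

```python
import zlib
from typing import Dict


def choose_pc(seqnos: Dict[str, float]) -> str:
    """Pick the PC from a mapping of node name -> commit-id."""
    if not seqnos:
        raise ValueError("no nodes to choose from")
    highest = max(seqnos.values())
    candidates = [node for node, seqno in seqnos.items() if seqno == highest]
    # Assumption: ties are broken by CRC32 of the node name, largest wins.
    # Every node computes the same value, so all nodes agree on the winner.
    return max(candidates, key=lambda node: (zlib.crc32(node.encode("utf-8")), node))
```

A node whose commit-id was set to INF always ends up among the candidates, 
because infinity compares higher than any real commit-id.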

I have an almost-ready RA: http://zynzel.spof.pl/mysql-wss

Tested with vanilla CentOS galera/pacemaker/corosync - OK
Tested with Fuel 4.1 - Fail


Fuel 4.1 will not deploy correctly with this RA, because the RA uses 
crm_attribute to store the GTID, and in the manifests we use 
cs_shadow/cs_commit for every pacemaker resource.
This leads to a cs_commit failure, because the shadow copy and the running 
configuration differ (the running config was changed by the RA):
"Could not commit shadow instance [..] to the CIB: Application of an update 
diff failed"

To solve this we can go 2 ways:
1) don't use cs_commit/cs_shadow in the manifests
2) store the GTID some other way than crm_attribute

IMHO 2) is better (less invasive), and we can store the GTID in corosync CMAP 
(http://www.polarhome.com/service/man/generic.php?qf=corosync-cmapctl), but 
this requires corosync 2.X
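
For illustration, this is roughly what the set/get calls could look like with 
corosync-cmapctl, using its -s (set: key, type, value) and -g (get: key) 
options. The sketch only builds the argument lists and does not run them; the 
key prefix "galera.gtid." and the function names are hypothetical, not 
something the RA defines.

```python
from typing import List

KEY_PREFIX = "galera.gtid."  # hypothetical key namespace, one key per node


def cmap_set_cmd(node: str, gtid: str) -> List[str]:
    """Argument list that stores the GTID as a string value in CMAP."""
    return ["corosync-cmapctl", "-s", KEY_PREFIX + node, "str", gtid]


def cmap_get_cmd(node: str) -> List[str]:
    """Argument list that reads the stored GTID back."""
    return ["corosync-cmapctl", "-g", KEY_PREFIX + node]
```

The lists can be passed to subprocess.run() on a node that has corosync 2.X 
installed.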

