Re: [ovirt-users] sanlock + gluster recovery -- RFE
I am sorry, this missed my attention over the last few days.

On 05/23/2014 08:50 PM, Ted Miller wrote:

Vijay, I am not a member of the developer list, so my comments are at the end.

On 5/23/2014 6:55 AM, Vijay Bellur wrote:

On 05/21/2014 10:22 PM, Federico Simoncelli wrote:

- Original Message -
From: Giuseppe Ragusa giuseppe.rag...@hotmail.com
To: fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Wednesday, May 21, 2014 5:15:30 PM
Subject: sanlock + gluster recovery -- RFE

Hi,

- Original Message -
From: Ted Miller tmiller at hcjb.org
To: users users at ovirt.org
Sent: Tuesday, May 20, 2014 11:31:42 PM
Subject: [ovirt-users] sanlock + gluster recovery -- RFE

As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem. I believe that the following are true (if not, my entire request is probably off base):

* oVirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space.

Although this is possible (at the moment), we are working hard to avoid it. The hardest part here is to ensure that the gluster volume is properly configured. The suggested configuration for a volume to be used with oVirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks: (...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain: https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-brick setups.
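For reference, the two critical options called out above can be applied to an existing volume with `gluster volume set`; a minimal sketch, using a placeholder volume name:

```shell
# Apply the two options the suggested configuration calls out
# ("engVM1" is a placeholder volume name).
gluster volume set engVM1 network.ping-timeout 10
gluster volume set engVM1 cluster.quorum-type auto

# Confirm they appear under "Options Reconfigured":
gluster volume info engVM1
```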
I understand that a 3-brick setup can allow proper quorum formation without resorting to the first-configured-brick-has-more-weight convention used with only 2 bricks and quorum "auto" (which makes one node special, so not properly any-single-fault tolerant).

Correct.

But, since we are on ovirt-users: is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested-working? I mean a configuration where any host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar) then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

We already had a discussion with gluster asking if it was possible to add fencing to the replica 2 quorum/consistency mechanism. The idea is that as soon as you can't replicate a write you have to freeze all I/O until either the connection is re-established or you know that the other host has been killed.

Adding Vijay.

There is a related thread on gluster-devel [1] about better behavior in GlusterFS for prevention of split-brains with sanlock and 2-way replicated gluster volumes. Please feel free to comment on the proposal there.

Thanks,
Vijay

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040751.html

One quick note before my main comment: I see references to quorum being N/2 + 1. Isn't it more accurate to say that quorum is (N + 1)/2 or N/2 + 0.5?

(N + 1)/2 or N/2 + 0.5 is fine when N happens to be odd. For both odd and even cases of N, N/2 + 1 does seem to be the more appropriate representation (assuming integer arithmetic).

Now to my main comment. I see a case that is not being addressed. I have no proof of how often this use-case occurs, but I believe that it does occur.
(It could, theoretically, occur in any situation where multiple bricks are writing to different parts of the same file.)

Use-case: sanlock via fuse client.

Steps to produce originally (not tested for reproducibility, because I was unable to recover the oVirt cluster after the occurrence and had to rebuild from scratch; the time frame was late 2013 or early 2014):

* 2-node oVirt cluster using replicated gluster storage
* oVirt cluster up and running VMs
* remove power from network switch
* restore power to network switch after a few minutes

Result: both copies of the .../dom_md/ids file accused the other of being out of sync.

This case would fall under the ambit of "1. Split-brains due to network partition" (network split-brains) in the proposal on gluster-devel.

Possible solutions: thinking about it on a systems level, the only solution I can see is to route all writes through one gluster brick. That way all the accusations flow from that brick to other bricks, and gluster will find the one file
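The integer-arithmetic point about quorum made above can be sanity-checked quickly; a minimal sketch:

```shell
# Quorum size with integer (floor) division: N/2 + 1 works for odd AND even N.
# (N+1)/2 collapses to exactly half for even N (e.g. 4 bricks -> 2), which two
# disjoint partitioned pairs could both satisfy -- not a real majority.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 2   # prints 2: both bricks needed, so replica 2 cannot tolerate a failure
quorum 3   # prints 2: one brick may fail
quorum 4   # prints 3
```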
Re: [ovirt-users] sanlock + gluster recovery -- RFE
Vijay, I am not a member of the developer list, so my comments are at the end.

On 5/23/2014 6:55 AM, Vijay Bellur wrote:

[...]

There is a related thread on gluster-devel [1] to have a better behavior in GlusterFS for prevention of split-brains with sanlock and 2-way replicated gluster volumes. Please feel free to comment on the proposal there.

Thanks,
Vijay

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040751.html

One quick note before my main comment: I see references to quorum being N/2 + 1. Isn't it more accurate to say that quorum is (N + 1)/2 or N/2 + 0.5?

Now to my main comment. I see a case that is not being addressed. I have no proof of how often this use-case occurs, but I believe that it does occur. (It could, theoretically, occur in any situation where multiple bricks are writing to different parts of the same file.)

Use-case: sanlock via fuse client.
Steps to produce originally (not tested for reproducibility, because I was unable to recover the oVirt cluster after the occurrence and had to rebuild from scratch; the time frame was late 2013 or early 2014):

* 2-node oVirt cluster using replicated gluster storage
* oVirt cluster up and running VMs
* remove power from network switch
* restore power to network switch after a few minutes

Result: both copies of the .../dom_md/ids file accused the other of being out of sync.

Hypothesis of cause (the servers, which are both oVirt nodes and gluster bricks, are called A and B). At the moment when network communication was lost, or just a moment after:

* A had written to its local ids file
* A had started the process to send the write to B
* A had not received write confirmation from B

and

* B had written to its local ids file
* B had started the process to send the write to A
* B had not received write confirmation from A

Thus, each file had a segment that had been written to the local file but had not been confirmed written on the remote file. Each file correctly accused the other file of being out-of-sync. I did read
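The mutual accusation hypothesized above is visible on the bricks themselves as AFR "changelog" extended attributes. A hedged illustration; the brick path and the hex value are placeholders for what one might see, not captured output:

```shell
# Each brick records pending-operation counters for its replica peers in
# trusted.afr.<volume>-client-<N> xattrs. A non-zero counter means "I have
# writes the other brick has not acknowledged". Non-zero counters on BOTH
# bricks, each pointing at the other, is exactly the mutual accusation
# (split-brain) described above.
getfattr -d -m trusted.afr -e hex /export/brick1/UUID/dom_md/ids
# illustrative output on brick A:
#   trusted.afr.engVM1-client-1=0x000000020000000000000000   (A accuses B)
```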
Re: [ovirt-users] sanlock + gluster recovery -- RFE
- Original Message -
From: Ted Miller tmil...@hcjb.org
To: users users@ovirt.org
Sent: Tuesday, May 20, 2014 11:31:42 PM
Subject: [ovirt-users] sanlock + gluster recovery -- RFE

As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem. I believe that the following are true (if not, my entire request is probably off base):

* oVirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space.

Although this is possible (at the moment) we are working hard to avoid it. The hardest part here is to ensure that the gluster volume is properly configured. The suggested configuration for a volume to be used with oVirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks: (...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain: https://bugzilla.redhat.com/show_bug.cgi?id=1066996

How did I get into this mess? ...

What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out).

The real solution is to avoid the split-brain altogether. At the moment it seems that using the suggested configurations and the bug fix we shouldn't hit a split-brain.

1. automagic recovery
2. recovery subcommand
3. script
4. commands

I think that the commands to resolve a split-brain should be documented.
I just started a page here: http://www.ovirt.org/Gluster_Storage_Domain_Reference

Could you add your documentation there?

Thanks!
--
Federico

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
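For what it's worth, the manual split-brain fix commonly circulated at the time went roughly as follows. This is a hedged sketch, not official documentation: the volume name, brick path, and gfid are placeholders, and both copies of the file should be backed up before touching anything:

```shell
# 1. Identify files in split-brain.
gluster volume heal engVM1 info split-brain

# 2. On the brick holding the copy you decide to DISCARD, remove the file
#    and its gfid hard link under .glusterfs (the first two bytes of the
#    gfid form the two directory levels; gfid shown is a placeholder).
rm /export/brick1/UUID/dom_md/ids
rm /export/brick1/.glusterfs/ab/cd/abcdef01-placeholder-gfid

# 3. Trigger self-heal so the surviving copy is replicated back.
gluster volume heal engVM1 full
```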
[ovirt-users] sanlock + gluster recovery -- RFE
Hi,

- Original Message -
From: Ted Miller tmiller at hcjb.org
To: users users at ovirt.org
Sent: Tuesday, May 20, 2014 11:31:42 PM
Subject: [ovirt-users] sanlock + gluster recovery -- RFE

[...]

The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain: https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-brick setups. I understand that a 3-brick setup can allow proper quorum formation without resorting to the first-configured-brick-has-more-weight convention used with only 2 bricks and quorum "auto" (which makes one node special, so not properly any-single-fault tolerant).

But, since we are on ovirt-users: is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested-working?
I mean a configuration where any host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar) then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

Furthermore: is such a suggested configuration possible in a self-hosted-engine scenario?

Regards,
Giuseppe

How did I get into this mess? ...

What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out).

The real solution is to avoid the split-brain altogether. At the moment it seems that using the suggested configurations and the bug fix we shouldn't hit a split-brain.

1. automagic recovery
2. recovery subcommand
3. script
4. commands

I think that the commands to resolve a split-brain should be documented. I just started a page here: http://www.ovirt.org/Gluster_Storage_Domain_Reference

Could you add your documentation there?

Thanks!
--
Federico
Re: [ovirt-users] sanlock + gluster recovery -- RFE
- Original Message -
From: Giuseppe Ragusa giuseppe.rag...@hotmail.com
To: fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Wednesday, May 21, 2014 5:15:30 PM
Subject: sanlock + gluster recovery -- RFE

Hi,

[...]

It seems that the aforementioned bug is peculiar to 3-brick setups. I understand that a 3-brick setup can allow proper quorum formation without resorting to the first-configured-brick-has-more-weight convention used with only 2 bricks and quorum "auto" (which makes one node special, so not properly any-single-fault tolerant).

Correct.
But, since we are on ovirt-users: is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested-working? I mean a configuration where any host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar) then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

We already had a discussion with gluster asking if it was possible to add fencing to the replica 2 quorum/consistency mechanism. The idea is that as soon as you can't replicate a write you have to freeze all I/O until either the connection is re-established or you know that the other host has been killed.

Adding Vijay.

--
Federico
Re: [ovirt-users] sanlock + gluster recovery -- RFE
On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:

Hi,

[...]

It seems that the aforementioned bug is peculiar to 3-brick setups. I understand that a 3-brick setup can allow proper quorum formation without resorting to the first-configured-brick-has-more-weight convention used with only 2 bricks and quorum "auto" (which makes one node special, so not properly any-single-fault tolerant).

But, since we are on ovirt-users: is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested-working?
I mean a configuration where any host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar) then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

Furthermore: is such a suggested configuration possible in a self-hosted-engine scenario?

Regards,
Giuseppe

How did I get into this mess? ...

What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out).

The real solution is to avoid the split-brain altogether. At the moment it seems that using the suggested configurations and the bug fix we shouldn't hit a split-brain.

1. automagic recovery
2. recovery subcommand
3. script
4. commands

I think that the commands to resolve a split-brain should be documented. I just started a page here: http://www.ovirt.org/Gluster_Storage_Domain_Reference

I suggest you add these lines to the Gluster configuration, as I have seen this come up multiple times on the User list:

storage.owner-uid: 36
storage.owner-gid: 36

Ted Miller
Elkhart, IN, USA
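Ted's suggestion maps to two `gluster volume set` commands (vdsm runs as uid/gid 36, so gluster should present the files with that ownership); a sketch using the volume name from this thread:

```shell
# oVirt's vdsm accesses storage as vdsm:kvm (36:36); have gluster report
# that ownership on the volume root.
gluster volume set engVM1 storage.owner-uid 36
gluster volume set engVM1 storage.owner-gid 36
```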
[ovirt-users] sanlock + gluster recovery -- RFE
Itamar, I am addressing this to you because one of your assignments seems to be to coordinate other oVirt contributors when dealing with issues that are raised on the ovirt-users email list.

As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem. I believe that the following are true (if not, my entire request is probably off base):

* ovirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space
  o gluster is aware of the problem, and is working on a different way of replicating data, which will reduce these problems.
* most (maybe all) of the sanlock locks have a short duration, measured in seconds
* there are only a couple of things that a user can safely do from the command line when a file is in split-brain
  o delete the file
  o rename (mv) the file

_How did I get into this mess?_

* had 3 hosts running ovirt 3.3, each hosted VMs, gluster replica 3 storage, engine was external to cluster
* upgraded 3 hosts from ovirt 3.3 to 3.4
* hosted-engine deploy
* used new gluster volume (accessed via nfs) for storage
* storage was accessed using localhost:engVM1 link (localhost was probably a poor choice)
* created new engine on VM (did not transfer any data from old engine)
* added 3 hosts to new engine via web-gui
* ran above setup for 3 days
* shut entire system down before I left on vacation (holiday)
* came back from vacation, powered on hosts
* found that iptables did not have rules for gluster access (a continuing problem if host installation is allowed to set up the firewall)
* added rules for gluster; glusterfs now up and running
* added storage manually
* tried hosted-engine --vm-start
* VM did not start; logs show sanlock errors
* gluster volume heal engVM1 full
* gluster volume heal engVM1 info split-brain showed 6 files
in split-brain, all 5 prefixed by /rhev/data-center/mnt/localhost\:_engVM1

* UUID/dom_md/ids
* UUID/images/UUID/UUID (VM hard disk)
* UUID/images/UUID/UUID.lease
* UUID/ha_agent/hosted-engine.lockspace
* UUID/ha_agent/hosted-engine.metadata

I copied each of the above files off of each of the three bricks to a safe place (15 files copied). I renamed the 5 files on /rhev/, then copied the 5 files from one of the bricks to /rhev/. The files can now be read OK (e.g. cat ids), but sanlock.log shows error sets like these:

2014-05-20 03:23:39-0400 36199 [2843]: s3358 lockspace 5ebb3b40-a394-405b-bbac-4c0e21ccd659:1:/rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids:0
2014-05-20 03:23:39-0400 36199 [18873]: open error -5 /rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids
2014-05-20 03:23:39-0400 36199 [18873]: s3358 open_disk /rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids error -5
2014-05-20 03:23:40-0400 36200 [2843]: s3358 add_lockspace fail result -19

I am now stuck.

What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out).

1. automagic recovery

* When a host is not able to access sanlock, it writes a small problem text file into the shared storage
  o the host-ID as part of the name (so only one host ever accesses that file)
  o a status number for the error causing problems
  o time stamp
  o time stamp when last sanlock lease will expire
  o if sanlock is able to access the file, the problem file is deleted
* when time passes for its last sanlock lease to be expired, the highest-number host does a survey
  o did all other hosts create problem files?
  o do all problem files show same (or compatible) error codes related to file access problems?
  o are all hosts communicating by network?
  o if yes to all above:
    * delete all sanlock storage space
    * initialize sanlock from scratch
    * restart whatever may have given up because of sanlock
    * restart VM if necessary

2. recovery subcommand

* add a hosted-engine --lock-initialize command that would delete sanlock and start over from scratch

3. script

* publish a script (in ovirt packages or available on the web) which, when run, does all (or most) of the recovery process needed.

4. commands

* publish on the web a recipe for dealing with files that commonly go split-brain
  o ids
  o *.lease
  o *.lockspace

Any chance of any help on any of the above levels?

Ted Miller
Elkhart, IN, USA
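The per-host "problem file" in option 1 could look something like the sketch below. Every name, field, and value here is invented for illustration (the file name, the status code taken from the sanlock.log excerpt above, and the lease horizon are all assumptions, not an existing oVirt mechanism):

```shell
# Sketch of the proposed per-host problem file (all names illustrative).
# Only host $HOST_ID ever writes this file, so no cross-host locking is needed.
HOST_ID=2
STATUS=-19                      # e.g. the add_lockspace failure seen in sanlock.log
NOW=$(date +%s)
LEASE_END=$(( NOW + 80 ))       # illustrative expiry time of the last sanlock lease

cat > "problem.host${HOST_ID}" <<EOF
host_id=${HOST_ID}
status=${STATUS}
written=${NOW}
last_lease_expires=${LEASE_END}
EOF

# If sanlock later becomes able to access storage, the host would delete
# its own problem file; the surveying host treats "all peers wrote problem
# files with compatible status codes" as the trigger for recovery.
```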