Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Digimer
On 19/05/17 05:30 PM, Ken Gaillot wrote:
> On 05/19/2017 03:47 PM, Andrew Kerber wrote:
>> What I am trying to say here is when I get one of the virtual machines
>> in a bad state, I can still log in and reboot it with the reboot
>> command. But I need my fencing resource to handle that reboot.
>>
>> On Fri, May 19, 2017 at 1:32 PM, Andrew Kerber wrote:
>>
>> Thanks for the answer, but that's not the problem.  I don't have
>> access to the console; it's a security issue.  I only have access
>> within the virtual machines, so I want to send the reboot command
>> within the virtual machine, not to the console. Typically our
>> hangups are such that the reboot command works, and the machine
>> hangs while starting back up, and I get an admin to go hit the console.
> 
> What you're asking for is an "ssh" fence agent. While such can be found,
> they are not considered reliable fence agents.
> 
> Your *typical* problem may be solvable with running "reboot" inside the
> VM, but there are situations in which that won't work (kernel panic,
> loss of network connectivity in the VM, crippling load, etc.). Only
> access to the hypervisor can provide a reliable fence mechanism for the VM.
> 
> If you're lucky, whoever is providing your VM can also provide you an
> API to use to request a hard reboot of the VM at the hypervisor level.
> Then, you can see if there is a fence agent already written for that
> API, or modify an existing one to handle it.
> 
> If you can't even get API access to the hypervisor, then you're not
> going to get full HA. You could search for an ssh fence agent, but be
> aware that's a partial solution at best, and you won't be able to
> recover from certain failure scenarios.

Ken is correct. Fencing must work no matter what state the victim is in.
You can see why by running 'echo c > /proc/sysrq-trigger' on a node to
cause a kernel panic: a fence method that needs the victim to respond
cannot fence it, and your cluster will hang.
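
A rough sketch of that test, assuming a disposable test node to crash and
a second node to watch from. Using 'crm_mon -1' (a one-shot status print)
to observe the cluster is my assumption, not something stated above. By
default the script only prints what it would do:

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN is changed to 0.
# Crashing a node is destructive; use a disposable test node, as root.
DRY_RUN=1

# Step 1 (on the victim): force an immediate kernel panic.
crash_node() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "echo c > /proc/sysrq-trigger"
    else
        echo c > /proc/sysrq-trigger
    fi
}

# Step 2 (on a surviving node): print the cluster status once.
# Assumption: Pacemaker's crm_mon is installed on the surviving node.
show_cluster_status() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "crm_mon -1"
    else
        crm_mon -1
    fi
}

crash_node
show_cluster_status
```

With fencing that works at the hypervisor level, the crashed node should
get fenced and the cluster should recover. With a method that needs the
victim to answer, expect the cluster to stay blocked.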

You need to talk to your security team to get access to the hypervisor.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Ken Gaillot
On 05/19/2017 03:47 PM, Andrew Kerber wrote:
> What I am trying to say here is when I get one of the virtual machines
> in a bad state, I can still log in and reboot it with the reboot
> command. But I need my fencing resource to handle that reboot.
> 
> On Fri, May 19, 2017 at 1:32 PM, Andrew Kerber wrote:
> 
> Thanks for the answer, but that's not the problem.  I don't have
> access to the console; it's a security issue.  I only have access
> within the virtual machines, so I want to send the reboot command
> within the virtual machine, not to the console. Typically our
> hangups are such that the reboot command works, and the machine
> hangs while starting back up, and I get an admin to go hit the console.

What you're asking for is an "ssh" fence agent. While such can be found,
they are not considered reliable fence agents.

Your *typical* problem may be solvable with running "reboot" inside the
VM, but there are situations in which that won't work (kernel panic,
loss of network connectivity in the VM, crippling load, etc.). Only
access to the hypervisor can provide a reliable fence mechanism for the VM.

If you're lucky, whoever is providing your VM can also provide you an
API to use to request a hard reboot of the VM at the hypervisor level.
Then, you can see if there is a fence agent already written for that
API, or modify an existing one to handle it.

If you can't even get API access to the hypervisor, then you're not
going to get full HA. You could search for an ssh fence agent, but be
aware that's a partial solution at best, and you won't be able to
recover from certain failure scenarios.


> On Fri, May 19, 2017 at 12:39 PM, Digimer wrote:
> 
> On 19/05/17 12:59 PM, Andrew Kerber wrote:
> > I have been setting up a cluster on virtual machines with some shared
> > resources.  The only fencing tool I have found designed for that
> > configuration is fence_virsh, but I have not been able to figure out
> > from the documentation how to get fence_virsh to issue the reboot
> > command.  Does anyone have a good explanation of how to configure
> > fence_virsh to issue a reboot command?  I understand it's not perfect,
> > because in some hard lockup situations only hitting a power button will
> > work, but for this configuration that's not really an option.
> >
> > --
> > Andrew W. Kerber
> 
> fence_virsh -a  -l root -p
> -n  -o status
> 
> That should show the status. To reboot, change 'status' to 'reboot'.
> 
> If this doesn't work, make sure you can ssh from the nodes to the
> hypervisor as the root user.
> 
> --
> Digimer
> Papers and Projects: https://alteeve.com/w/
> "I am, somehow, less interested in the weight and convolutions of
> Einstein’s brain than in the near certainty that people of equal talent
> have lived and died in cotton fields and sweatshops." - Stephen Jay Gould



Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Andrew Kerber
Thanks for the answer, but that's not the problem.  I don't have access to
the console; it's a security issue.  I only have access within the virtual
machines, so I want to send the reboot command within the virtual machine,
not to the console. Typically our hangups are such that the reboot command
works, and the machine hangs while starting back up, and I get an admin to go
hit the console.

On Fri, May 19, 2017 at 12:39 PM, Digimer wrote:

> On 19/05/17 12:59 PM, Andrew Kerber wrote:
> > I have been setting up a cluster on virtual machines with some shared
> > resources.  The only fencing tool I have found designed for that
> > configuration is fence_virsh, but I have not been able to figure out
> > from the documentation how to get fence_virsh to issue the reboot
> > command.  Does anyone have a good explanation of how to configure
> > fence_virsh to issue a reboot command?  I understand it's not perfect,
> > because in some hard lockup situations only hitting a power button will
> > work, but for this configuration that's not really an option.
> >
> > --
> > Andrew W. Kerber
>
> fence_virsh -a  -l root -p 
> -n  -o status
>
> That should show the status. To reboot, change 'status' to 'reboot'.
>
> If this doesn't work, make sure you can ssh from the nodes to the
> hypervisor as the root user.
>
> --
> Digimer
> Papers and Projects: https://alteeve.com/w/
> "I am, somehow, less interested in the weight and convolutions of
> Einstein’s brain than in the near certainty that people of equal talent
> have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
>



-- 
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'


Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Digimer
On 19/05/17 12:59 PM, Andrew Kerber wrote:
> I have been setting up a cluster on virtual machines with some shared
> resources.  The only fencing tool I have found designed for that
> configuration is fence_virsh, but I have not been able to figure out
> from the documentation how to get fence_virsh to issue the reboot
> command.  Does anyone have a good explanation of how to configure
> fence_virsh to issue a reboot command?  I understand it's not perfect,
> because in some hard lockup situations only hitting a power button will
> work, but for this configuration that's not really an option.
> 
> -- 
> Andrew W. Kerber

fence_virsh -a  -l root -p 
-n  -o status

That should show the status. To reboot, change 'status' to 'reboot'.

If this doesn't work, make sure you can ssh from the nodes to the
hypervisor as the root user.
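
The bracketed placeholders in the command above seem to have been lost in
the archive copy. Here is a minimal sketch of the same two calls with
made-up values: the address 192.0.2.10, the password, and the VM name
node1 are all hypothetical. -a is the hypervisor address, -l the ssh
login, -p its password, -n the VM's name on the hypervisor, and -o the
action. It prints the commands by default rather than running them:

```shell
#!/bin/sh
# Hypothetical values; replace with your own hypervisor and VM details.
HYPERVISOR="192.0.2.10"   # assumed hypervisor address (-a)
FENCE_USER="root"         # ssh login on the hypervisor (-l)
FENCE_PASS="secret"       # assumed password for that login (-p)
VM_NAME="node1"           # assumed VM name as known to the hypervisor (-n)
DRY_RUN=1                 # 1 = only print the command, 0 = execute it

# Build the fence_virsh call for the given action (status, reboot, ...),
# then print it or run it depending on DRY_RUN.
fence_vm() {
    action="$1"
    set -- fence_virsh -a "$HYPERVISOR" -l "$FENCE_USER" -p "$FENCE_PASS" \
        -n "$VM_NAME" -o "$action"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

fence_vm status   # check that the agent can see the VM
fence_vm reboot   # ask the hypervisor to reboot the VM
```

Set DRY_RUN=0 to run them for real. Try 'status' first; if that fails,
'reboot' will not work either.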

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould



[ClusterLabs] question about fence-virsh

2017-05-19 Thread Andrew Kerber
I have been setting up a cluster on virtual machines with some shared
resources.  The only fencing tool I have found designed for that
configuration is fence_virsh, but I have not been able to figure out from
the documentation how to get fence_virsh to issue the reboot command.  Does
anyone have a good explanation of how to configure fence_virsh to issue a
reboot command?  I understand it's not perfect, because in some hard lockup
situations only hitting a power button will work, but for this
configuration that's not really an option.

-- 
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'