On 8/10/2010 12:20 PM, David Noriega wrote:
Another question. Is it possible to use centos/redhat's clustering
software?
The main issues, IMHO, are that Lustre today uses the physical hostname/IP for all MDS, OSS, MGS, etc., whereas cluster software uses a VIP, so some work needs to be done to make a VIP work for Lustre.
My 2c.

In the manual it mentions using that for metadata
failover (since having more than one metadata server online isn't
possible right now), so why not use that for all of Lustre? But since
the information is missing, can someone fill in the blanks on setting
up metadata failover?

David

On Tue, Aug 10, 2010 at 11:11 AM, Kevin Van Maren
<[email protected]>  wrote:
Depends on the HA package you are using.  Heartbeat comes with a script that
supports IPMI.

The important thing is that stonith NOT succeed if you don't _know_ that the
node is off.
So it is absolutely not a 1-line script.

Kevin
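
A rough sketch of the point above, for illustration only. This is not the script that ships with Heartbeat, and the function names, arguments, and retry values are assumptions made up for this example. It sends a power-off request through ipmitool and reports success only after the BMC itself confirms that the chassis is off. Any error, timeout, or unexpected output is treated as "unknown", which counts as failure.

```python
import subprocess
import time
from typing import Callable, Sequence

# Hypothetical type alias: a function that runs ipmitool and returns its stdout.
Runner = Callable[[Sequence[str]], str]


def run_ipmitool(args: Sequence[str]) -> str:
    """Run ipmitool with the given arguments and return its standard output."""
    result = subprocess.run(
        ["ipmitool", *args],
        capture_output=True,
        text=True,
        timeout=30,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout


def fence_node(
    bmc_host: str,
    user: str,
    password_file: str,
    runner: Runner = run_ipmitool,
    attempts: int = 5,
    delay: float = 2.0,
) -> bool:
    """Return True only if the BMC confirms that the node is powered off."""
    base = ["-I", "lanplus", "-H", bmc_host, "-U", user, "-f", password_file]

    try:
        runner(base + ["chassis", "power", "off"])
    except (subprocess.SubprocessError, OSError):
        # The request may or may not have been delivered. Do not guess:
        # the status poll below is the only thing that decides the result.
        pass

    for _ in range(attempts):
        try:
            status = runner(base + ["chassis", "power", "status"])
        except (subprocess.SubprocessError, OSError):
            status = ""  # BMC unreachable: state unknown, not "off"
        if "chassis power is off" in status.lower():
            return True
        time.sleep(delay)

    # The node was never confirmed off, so fencing must be reported as failed.
    return False
```

A wrapper around this would exit 0 only when fence_node() returns True, so that the HA software never takes over an OST or MDT on the strength of an unconfirmed power-off.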


David Noriega wrote:
I think I'll go the IPMI route. So reading on STONITH, it's just a
script, so all I would need is a script to run IPMI that tells the
server to power off, right?

Also, while reading through the Lustre manual, it seems some things are
being deleted from the wiki:
http://wiki.lustre.org/index.php?title=Clu_Manager no longer exists,
and I noticed this too when I found the Lustre quick guide is no longer
available.

Thanks
David

On Tue, Aug 10, 2010 at 10:57 AM, Kevin Van Maren
<[email protected]>  wrote:

David Noriega wrote:

Could you describe this resource fencing in more detail? As regards
STONITH, the PDU already has the grubby hands of IT plugged
into it, and I doubt they would be happy if I unplugged them.  What about
the network management port or ILOM?


Resource fencing is needed to ensure that a node does not take over a
resource (i.e., an OST) while the other node is still accessing it (as
could happen if the node only partly crashes, where it is not responding
to the HA package but still writing to the disk).

STONITH is a pretty common way to ensure the other node is dead and can
no longer access the resource.  If you can't use your switched PDU, then
using the ILOM for IPMI-based power control works.  The other common way
to do resource fencing is to use SCSI reserve commands (if supported by
the hardware and the HA package) to ensure exclusive access.

Kevin


On Mon, Aug 9, 2010 at 1:08 PM, Kevin Van Maren
<[email protected]>  wrote:


On Aug 9, 2010, at 11:45 AM, David Noriega<[email protected]>  wrote:



My understanding of setting up fail-over is that you need some control
over the power, so that a script can turn off a machine by cutting its
power. Is this correct?


It is the recommended configuration because it is simple to understand
and implement.

But the only _hard_ requirement is that both nodes can access the
storage.




Is there a way to do fail-over without having
access to the PDU (power strips)?


If you have IPMI support, that can be used for power control, instead
of a switched PDU.  Depending on the storage, you may be able to do
resource fencing of the disks instead of STONITH.  Or you can run
fast-and-loose, without any way to ensure the dead node is really "dead"
and not accessing storage (at your risk).  While Lustre has MMP, it is
really more to protect against a mount typo than to guarantee resource
fencing.




Thanks
David

--
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss










