We noticed some odd behavior recently.  I have a customer with a small Scale 
(with Archive on top) configuration that we recently upgraded to a dual-node 
configuration.  We are using CES, and we set up a very small three-NSD 
shared-root filesystem (gpfssr).  We also set up tiebreaker disks and figured 
it would be fine to use the gpfssr NSDs for that purpose.
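
For reference, the relevant bits of the setup looked roughly like this (the 
path and NSD names below are placeholders, not the customer's actual values):

    # point CES at the small shared-root filesystem
    mmchconfig cesSharedRoot=/gpfs/gpfssr
    # use the same gpfssr NSDs as tiebreaker disks
    mmchconfig tiebreakerDisks="srnsd1;srnsd2;srnsd3"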


When we tried to perform some basic failover testing, both nodes came down.  It 
appears from the logs that when we initiated the node failure (via the 
mmshutdown command... not great, I know), the shared-root filesystem was 
unmounted and remounted.  When that happened, the cluster lost access to the 
tiebreaker disks, concluded it had lost quorum, and the other node came down 
as well.
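
If anyone wants to see it for themselves, the test was essentially this 
(node names here are hypothetical):

    mmshutdown -N node1      # simulate the node failure
    mmgetstate -a            # from node2: watch the node/quorum states
    mmlsmount gpfssr -L      # shared-root unmounts, and the tiebreakers go with it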


We got around this by moving the tiebreaker disks to NSDs in our other, 
regular GPFS filesystem.  After that, failover worked as expected.  As far as 
I can tell, this behavior isn't documented anywhere.  I wanted to know if 
anybody else has run into this and whether it is expected behavior.  All is 
well now and operating as we want, so I don't think we'll pursue a support 
case.
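
The fix itself was just repointing the tiebreakers at NSDs in the regular 
filesystem, something like the following (NSD names again placeholders; 
depending on your Scale level, you may need the daemon down to change this):

    mmlsnsd                     # find the NSDs belonging to the regular filesystem
    mmchconfig tiebreakerDisks="gpfs1nsd;gpfs2nsd;gpfs3nsd"
    mmlsconfig tiebreakerDisks  # confirm the new setting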


Regards,

SHAUN ANDERSON
STORAGE ARCHITECT
O 208.577.2112
M 214.263.7014

