Hi Andrew, all,

I'm continuing my experiments with lustre on stacked drbd, and I see the following problem.

I have one drbd resource (ms-drbd-testfs-mdt0000) stacked on top of another (ms-drbd-testfs-mdt0000-left), with the following constraints between them:

colocation drbd-testfs-mdt0000-with-drbd-testfs-mdt0000-left inf: ms-drbd-testfs-mdt0000 ms-drbd-testfs-mdt0000-left:Master
order drbd-testfs-mdt0000-after-drbd-testfs-mdt0000-left inf: ms-drbd-testfs-mdt0000-left:promote ms-drbd-testfs-mdt0000:start

Then I have a filesystem mounted on top of ms-drbd-testfs-mdt0000 (the testfs-mdt0000 resource):

colocation testfs-mdt0000-with-drbd-testfs-mdt0000 inf: testfs-mdt0000 ms-drbd-testfs-mdt0000:Master
order testfs-mdt0000-after-drbd-testfs-mdt0000 inf: ms-drbd-testfs-mdt0000:promote testfs-mdt0000:start

When I trigger an event which causes many resources to stop (including these three), the LogActions output looks like this:

LogActions: Stop drbd-local#011(lustre01-left)
LogActions: Stop drbd-stacked#011(Started lustre02-left)
LogActions: Stop drbd-testfs-local#011(Started lustre03-left)
LogActions: Stop drbd-testfs-stacked#011(Started lustre04-left)
LogActions: Stop lustre#011(Started lustre04-left)
LogActions: Stop mgs#011(Started lustre01-left)
LogActions: Stop testfs#011(Started lustre03-left)
LogActions: Stop testfs-mdt0000#011(Started lustre01-left)
LogActions: Stop testfs-ost0000#011(Started lustre01-left)
LogActions: Stop testfs-ost0001#011(Started lustre02-left)
LogActions: Stop testfs-ost0002#011(Started lustre03-left)
LogActions: Stop testfs-ost0003#011(Started lustre04-left)
LogActions: Stop drbd-mgs:0#011(Master lustre01-left)
LogActions: Stop drbd-mgs:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-mdt0000:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt0000-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt0000-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost0000:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost0000-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost0000-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost0001:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:1#011(Slave lustre01-left)
LogActions: Stop drbd-testfs-ost0002:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:1#011(Slave lustre04-left)
LogActions: Stop drbd-testfs-ost0003:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:1#011(Slave lustre03-left)

For some reason demote is not run on either of the mdt drbd resources (should it be?), so the drbd RA prints a warning about that.

What I see then is that the cluster tries to stop ms-drbd-testfs-mdt0000-left before ms-drbd-testfs-mdt0000. Moreover, the testfs-mdt0000 filesystem resource is not stopped before drbd-testfs-mdt0000 is stopped.

I have advisory ordering constraints between the mdt and ost filesystem resources, so all osts are stopped before the mdt. Thus the mdt stop is delayed a bit. Maybe this influences what happens.

I'm pretty sure I have correct constraints for at least these three resources, so this looks like a bug, because the mandatory ordering is not preserved. I can produce a report for this.

Best,
Vladislav

_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
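
P.S. To make the expectation concrete, here is a minimal sketch in Python (not Pacemaker code) of the teardown order I would expect from the two mandatory orderings quoted in this mail. It rests on two assumptions: that a mandatory ordering is applied symmetrically, so "A:promote then B:start" implies "B:stop then A:demote", and that a master/slave resource is demoted before it is stopped. The names INVERSE, ORDERS, MASTER_SLAVE and expected_teardown are made up for this illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed pairing of each action with the action that undoes it.
INVERSE = {"start": "stop", "promote": "demote"}

# The two mandatory orderings from this mail: (first, action, then, action).
ORDERS = [
    ("ms-drbd-testfs-mdt0000-left", "promote", "ms-drbd-testfs-mdt0000", "start"),
    ("ms-drbd-testfs-mdt0000", "promote", "testfs-mdt0000", "start"),
]

# Resources that have a Master role and therefore a demote step.
MASTER_SLAVE = {"ms-drbd-testfs-mdt0000-left", "ms-drbd-testfs-mdt0000"}


def expected_teardown(orders, master_slave):
    """Return a teardown sequence consistent with the inverted orderings."""
    graph = {}  # (resource, action) -> set of steps that must happen first
    for rsc in master_slave:
        # Assumption: demote always precedes stop for a master/slave resource.
        graph.setdefault((rsc, "stop"), set()).add((rsc, "demote"))
    for first, first_action, then, then_action in orders:
        # Inverted ordering: 'then' is undone before 'first' is undone.
        before = (then, INVERSE[then_action])
        after = (first, INVERSE[first_action])
        graph.setdefault(before, set())
        graph.setdefault(after, set()).add(before)
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for rsc, action in expected_teardown(ORDERS, MASTER_SLAVE):
        print(f"{action:7} {rsc}")
```

Under these assumptions the dependencies form a single chain, so only one sequence is possible: testfs-mdt0000 is stopped first, then ms-drbd-testfs-mdt0000 is demoted and stopped, and only then is ms-drbd-testfs-mdt0000-left demoted and stopped. That is the order I expected to see here.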
