On 2018-01-14 12:48 PM, Digimer wrote:
> On 2018-01-14 12:29 PM, Digimer wrote:
>> I recently changed the host name of a cluster. It may or may not be
>> related, but after I noticed that I can cleanly start gfs2 when the node
>> boots. However, if the node is withdrawn and then I try to rejoin it
>> without a reboot, it hangs with this in syslog;
>>
A bit more info...

I tried reformatting the GFS2 partition, and the problem remained. Then I
tried zeroing out the LV, and it was writing at about 4 KB/sec
(dd if=/dev/zero of=/dev/vg/lv bs=1M oflag=dsync, run for about two
minutes). When I saw how slow it was going, I decided to delete the LV
(original was 40 GiB) and create a new LV (20 GiB). Then I formatted the
new LV as GFS2, and the problem remained.

After this, I could no longer mount the GFS2 partition at all. Even on a
fresh boot of both nodes, the mount hung immediately.

At this point, I was losing my maintenance window, so I migrated the
hosted servers to other Anvil! systems and then rebuilt the cluster. After
the rebuild, the GFS2 partition mounted properly.

A frightening experience, to be sure. Thankfully I had enough spare
capacity on other systems to avoid an interruption. That said, I would
love some input on why this might have happened.

cheers,

digimer

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
