On Thu, 02 Feb 2017 18:28:22 +0100, "Olaf Weiser" said:
> but the /var/mmfs DIR is obviously damaged/empty .. what ever.. that's why you
> see a message like this..
> have you reinstalled that node / any backup/restore thing ?
The internal RAID controller died a horrid death and basically took all the OS partitions with it. So the node was just limping along: the mmfsd process was still coping because it wasn't doing any I/O to the OS partitions, but 'ssh bad-node mmshutdown' wouldn't work because that requires accessing stuff in /var. At that point it starts getting tempting to just use ipmitool from another node to power the comatose one down, but that often causes a cascade of other issues while things are stuck waiting for timeouts.

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
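For anyone who ends up doing the ipmitool dance, a minimal sketch of the out-of-band power-off follows. The BMC hostname and username here are placeholders, not anything from the thread, and the command is only printed rather than executed, since a hard power-off is destructive:

```shell
#!/bin/sh
# Hypothetical example: force power off a wedged node via its BMC.
# BMC_HOST and BMC_USER are made-up placeholders (site-specific values).
BMC_HOST=bad-node-ipmi   # the dead node's BMC / iLO / iDRAC address
BMC_USER=admin           # BMC credential

# -I lanplus selects the IPMI v2.0 RMCP+ transport; -E reads the password
# from the IPMI_PASSWORD environment variable instead of the command line.
# "chassis power off" is an immediate hard power-off (no OS involvement,
# like pulling the plug). "chassis power soft" would request an ACPI soft
# shutdown instead, but that needs a working OS, which is exactly what
# this node no longer has.
CMD="ipmitool -I lanplus -H $BMC_HOST -U $BMC_USER -E chassis power off"

# Print rather than run, since powering a node off is destructive.
echo "$CMD"
```

The hard power-off removes the half-dead node, but as noted above the rest of the cluster may still sit through its lease/disk timeouts before recovering.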
