Hi Marc,
I've seen systemd be overly helpful (read: not at all helpful) when it
observes state changing outside of its control. I hit a bug with GPFS
(the real issue may have been in systemd, but the fix went into GPFS)
where systemd would unmount GPFS filesystems a split second after they
were mounted. The fs would mount, but systemd decided the /dev/$fs
device wasn't "ready", so it helpfully unmounted the filesystem. I don't
know much about systemd (I avoid it), but based on that experience I
could certainly see a case where systemd actively kills the sdrserv
process shortly after the mm* commands start it, if systemd doesn't
expect it to be running.
I'd be curious to see the contents of /var/adm/ras/mmsdrserv.log from
the manager nodes, to see whether sdrserv is indeed starting but getting
harpooned by systemd.
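One way to check the "starts, then gets killed" theory without reading logs is to poll for the process and record how long each PID survives. Below is a rough Python sketch of that idea, not anything shipped with GPFS. The function names, the default pattern "mmsdrserv", and the 30 second / 0.5 second timings are my own assumptions; the only external tool it calls is pgrep -f from procps. It can only show that a PID vanished quickly, not who killed it.

```python
import subprocess
import time


def sample_pids(pattern):
    """Return the set of PIDs whose command line matches pattern (pgrep -f)."""
    result = subprocess.run(["pgrep", "-f", pattern],
                            capture_output=True, text=True)
    # pgrep exits non-zero with empty output when nothing matches,
    # which simply yields an empty set here.
    return {int(tok) for tok in result.stdout.split()}


def lifetimes(samples):
    """Map each PID to (first_seen, last_seen) over (timestamp, pids) samples."""
    seen = {}
    for ts, pids in samples:
        for pid in pids:
            first, _ = seen.get(pid, (ts, ts))
            seen[pid] = (first, ts)
    return seen


def short_lived(samples, min_seconds):
    """PIDs gone by the final sample that were observed for < min_seconds."""
    samples = list(samples)
    if not samples:
        return {}
    final_pids = samples[-1][1]
    return {pid: span for pid, span in lifetimes(samples).items()
            if pid not in final_pids and span[1] - span[0] < min_seconds}


def watch(pattern="mmsdrserv", duration=30.0, interval=0.5):
    """Poll for matching processes; hypothetical defaults, tune as needed."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), sample_pids(pattern)))
        time.sleep(interval)
    return samples
```

The idea would be to start watch() in one terminal, run an mm* command such as mmlsconfig in another, and then pass the samples to short_lived(samples, 5). A non-empty result would suggest something is reaping the process soon after it starts.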
-Aaron
On 7/28/16 4:16 PM, Marc A Kaplan wrote:
Allow me to restate and demonstrate:
Even if systemd or explicit kill signals destroy any or all running
mmccr* and mmsdr* processes, simply running mmlsconfig will fire up new
mmccr* and mmsdr* processes.
For example:
## I used kill -9 to kill all mmccr, mmsdr, lxtrace, ... processes
[root@n2 gpfs-git]# ps auwx | grep mm
root 9891 0.0 0.0 112640 980 pts/1 S+ 12:57 0:00 grep --color=auto mm
[root@n2 gpfs-git]# mmlsconfig
Configuration data for cluster madagascar.frozen:
-------------------------------------------------
clusterName madagascar.frozen
...
worker1Threads 1022
adminMode central
File systems in cluster madagascar.frozen:
------------------------------------------
/dev/mak
/dev/x1
/dev/yy
/dev/zz
## mmlsconfig "needs" ccr and sdrserv, so if it doesn't see them, it restarts them!
[root@n2 gpfs-git]# ps auwx | grep mm
root 9929 0.0 0.0 114376 1696 pts/1 S 12:58 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15
root 10110 0.0 0.0 20536 128 ? Ss 12:58 0:00 /usr/lpp/mmfs/bin/lxtrace-3.10.0-123.el7.x86_64 on /tmp/mmfs/lxtrac
root 10125 0.0 0.0 493264 11064 ? Ssl 12:58 0:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 1
root 10358 0.0 0.0 1700488 17636 ? Sl 12:58 0:00 python /usr/lpp/mmfs/bin/mmsysmon.py
root 10440 0.0 0.0 114376 804 pts/1 S 12:59 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15
root 10442 0.0 0.0 112640 976 pts/1 S+ 12:59 0:00 grep --color=auto mm
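For anyone who wants to repeat the before/after comparison in the ps listings above from a script, here is a minimal Python sketch. It assumes the 11-column layout that ps auwx prints and the /usr/lpp/mmfs/bin/ install prefix seen in the listing. The function name gpfs_processes is made up for illustration. Matching on the install prefix rather than on "mm" also keeps the grep line itself out of the result.

```python
GPFS_BIN = "/usr/lpp/mmfs/bin/"  # install prefix as seen in the ps output


def gpfs_processes(ps_output):
    """Return (pid, command) pairs for lines of `ps auwx` output that
    reference a binary or script under GPFS_BIN."""
    procs = []
    for line in ps_output.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unparseable
        command = fields[10]
        if GPFS_BIN in command:
            procs.append((int(fields[1]), command))
    return procs
```

To use it, capture the listing with subprocess.run(["ps", "auwx"], capture_output=True, text=True).stdout and pass it to gpfs_processes() once after the kill and once after running mmlsconfig. An empty list followed by a list containing mmccrmonitor and mmsdrserv would match the demonstration.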
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776