Hi,

I have tried executing the monit process as

monit -vvI

And I get the following messages:

Cannot read proc file '/proc/33989/attr/current' -- Invalid argument
Cannot read proc file '/proc/41/attr/current' -- Invalid argument
Cannot read proc file '/proc/42/attr/current' -- Invalid argument
Cannot read proc file '/proc/43/attr/current' -- Invalid argument
Cannot read proc file '/proc/44/attr/current' -- Invalid argument
Cannot read proc file '/proc/45/attr/current' -- Invalid argument
Cannot read proc file '/proc/47/attr/current' -- Invalid argument
Cannot read proc file '/proc/5/attr/current' -- Invalid argument
Cannot read proc file '/proc/5029/attr/current' -- Invalid argument
Cannot read proc file '/proc/5032/attr/current' -- Invalid argument
Cannot read proc file '/proc/5034/attr/current' -- Invalid argument
Cannot read proc file '/proc/5036/attr/current' -- Invalid argument
Cannot read proc file '/proc/5037/attr/current' -- Invalid argument
Cannot read proc file '/proc/5038/attr/current' -- Invalid argument
Cannot read proc file '/proc/5039/attr/current' -- Invalid argument
Cannot read proc file '/proc/5041/attr/current' -- Invalid argument
Cannot read proc file '/proc/5094/attr/current' -- Invalid argument
Cannot read proc file '/proc/5104/attr/current' -- Invalid argument
Cannot read proc file '/proc/5105/attr/current' -- Invalid argument
Cannot read proc file '/proc/5180/attr/current' -- Invalid argument
Cannot read proc file '/proc/5181/attr/current' -- Invalid argument
Cannot read proc file '/proc/5185/attr/current' -- Invalid argument
Cannot read proc file '/proc/5192/attr/current' -- Invalid argument
Cannot read proc file '/proc/6/attr/current' -- Invalid argument
Cannot read proc file '/proc/60/attr/current' -- Invalid argument
Cannot read proc file '/proc/6032/attr/current' -- Invalid argument
Cannot read proc file '/proc/6035/attr/current' -- Invalid argument
Cannot read proc file '/proc/6043/attr/current' -- Invalid argument
Cannot read proc file '/proc/6046/attr/current' -- Invalid argument
Cannot read proc file '/proc/6057/attr/current' -- Invalid argument
Cannot read proc file '/proc/6059/attr/current' -- Invalid argument
Cannot read proc file '/proc/6069/attr/current' -- Invalid argument
Cannot read proc file '/proc/6082/attr/current' -- Invalid argument
Cannot read proc file '/proc/6931/attr/current' -- Invalid argument
Cannot read proc file '/proc/7/attr/current' -- Invalid argument
Cannot read proc file '/proc/7113/attr/current' -- Invalid argument
Cannot read proc file '/proc/7114/attr/current' -- Invalid argument
Cannot read proc file '/proc/7116/attr/current' -- Invalid argument
Cannot read proc file '/proc/7118/attr/current' -- Invalid argument
Cannot read proc file '/proc/7127/attr/current' -- Invalid argument
Cannot read proc file '/proc/7196/attr/current' -- Invalid argument
Cannot read proc file '/proc/7441/attr/current' -- Invalid argument
Cannot read proc file '/proc/7450/attr/current' -- Invalid argument
Cannot read proc file '/proc/7451/attr/current' -- Invalid argument
Cannot read proc file '/proc/8/attr/current' -- Invalid argument
Cannot read proc file '/proc/8450/attr/current' -- Invalid argument
Cannot read proc file '/proc/8776/attr/current' -- Invalid argument
Cannot read proc file '/proc/9/attr/current' -- Invalid argument
Cannot read proc file '/proc/91/attr/current' -- Invalid argument
'check_cephfs' stop on user request
Monit daemon with PID 33978 awakened

And of course, monit gets stuck in stop pending:

Filesystem 'check_cephfs'
  status                        *OK - stop pending*
  monitoring status             Monitored
  monitoring mode               active
  on reboot                     start
  filesystem type               ceph
  filesystem flags              rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
  permission                    755
  uid                           27
  gid                           27
  block size                    4 MB
  space total                   224 MB (of which 0.0% is reserved for root user)
  space free for non superuser  60 MB [26.8%]
  space free total              60 MB [26.8%]
  inodes total                  165
  inodes free                   -1 [-0.6%]
  data collected                Fri, 08 Mar 2019 21:59:26

System 'vdicnode04'
  status                        OK
  monitoring status             Monitored
  monitoring mode               active
  on reboot                     start
  load average                  [0.21] [0.25] [0.58]
  cpu                           0.5%us 0.7%sy 0.0%wa
  memory usage                  706.1 MB [38.8%]
  swap usage                    264 kB [0.0%]
  uptime                        27m
  boot time                     Fri, 08 Mar 2019 21:39:18
  data collected                Fri, 08 Mar 2019 21:59:26

I don't know whether those "Cannot read proc file" messages could be the
cause of the eternal "stop pending".

Thanks a lot in advance to everybody,
Óscar

On Fri, 8 Mar 2019 at 12:43, Oscar Segarra (<[email protected]>) wrote:

> Hi Paul,
>
> The problem is not starting or stopping ceph server modules. I have the
> problem on the client side, where I want to be able to power off my client
> machine even when the cephfs servers are not available.
>
> Thanks a lot
>
> On Fri, 8 Mar 2019 at 1:49, Paul Theodoropoulos <[email protected]>
> wrote:
>
>> I've zero experience with ceph, however -
>>
>> What about just incorporating ceph's status-checking facilities as the
>> trigger, instead of watching the mount? For example:
>>
>> monit monitor:
>>
>> check program ceph-status with path /usr/local/bin/ceph-status.sh
>>     start program = "/bin/systemctl start ceph.target"
>>     stop program = "/bin/systemctl stop ceph\*.service ceph\*.target"
>>     if status != 0 then restart
>>
>> ceph-status.sh:
>>
>> #!/bin/bash
>> ceph status >/dev/null 2>&1
>>
>> As I said, no experience with ceph, just had a quick look at some of the
>> documentation - I could be completely wrong about the feasibility of this...
>>
>>
>> On 3/7/19 15:46, Oscar Segarra wrote:
>>
>> Hi Martin,
>>
>> Thanks a lot for your quick response.
>>
>> I have been making some tests, but it looks like your approach does not
>> work at all:
>>
>> This is my simple configuration:
>>
>> cat << EOT > /etc/monit.d/monit_vdicube
>> check filesystem check_cephfs with path /mnt/vdicube_ceph_fs
>>     start program = "/bin/mount -t ceph -o name=admin -o secret=AQDenzBcEyQ8BBAABQjoGn3DTnKN2v5hZm7gMw== 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ /mnt/vdicube_ceph_fs"
>>     stop program = "/bin/umount -f -l /mnt/vdicube_ceph_fs"
>>     IF CHANGED FSFLAGS THEN start
>> EOT
>>
>> In this case, when the ceph monitor servers (192.168.100.104:6789,
>> 192.168.100.105:6789,192.168.100.106:6789) are available, everything works
>> fine. Start, stop and restart work great.
>>
>> Nevertheless, if I lose connectivity with the ceph servers (I manually stop
>> them), the monit service doesn't find out and continues showing "OK" when,
>> of course, none of the internal data can be accessed. This may be normal
>> because the mount entry is still there:
>>
>> [root@vdicnode04 mnt]# mount | grep ceph
>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on
>> /mnt/vdicube_ceph_fs type ceph
>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
>>
>> In this scenario, if I execute the stop command manually as root from the
>> command line:
>>
>> /bin/umount -f -l /mnt/vdicube_ceph_fs
>>
>> it unmounts the FS immediately. However, if I stop it using the monit CLI:
>>
>> [root@vdicnode04 /]# monit stop check_cephfs
>> [root@vdicnode04 /]# monit status
>> Monit 5.25.1 uptime: 4m
>>
>> Filesystem 'check_cephfs'
>>   status                        OK - stop pending
>>   monitoring status             Monitored
>>   monitoring mode               active
>>   on reboot                     start
>>   filesystem type               ceph
>>   filesystem flags              rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
>>   permission                    755
>>   uid                           27
>>   gid                           27
>>   block size                    4 MB
>>   space total                   228 MB (of which 0.0% is reserved for root user)
>>   space free for non superuser  64 MB [28.1%]
>>   space free total              64 MB [28.1%]
>>   inodes total                  165
>>   inodes free                   -1 [-0.6%]
>>   data collected                Fri, 08 Mar 2019 00:35:28
>>
>> System 'vdicnode04'
>>   status                        OK
>>   monitoring status             Monitored
>>   monitoring mode               active
>>   on reboot                     start
>>   load average                  [0.02] [0.20] [0.21]
>>   cpu                           1.2%us 1.0%sy 0.0%wa
>>   memory usage                  514.8 MB [28.3%]
>>   swap usage                    0 B [0.0%]
>>   uptime                        59m
>>   boot time                     Thu, 07 Mar 2019 23:40:21
>>   data collected                Fri, 08 Mar 2019 00:35:28
>>
>> [root@vdicnode04 /]#
>>
>> It gets stuck in the "stop pending" status.
>>
>> In the logs I can see the following:
>>
>> [CET Mar 8 00:39:55] info : 'check_cephfs' stop on user request
>> [CET Mar 8 00:39:55] info : Monit daemon with PID 121791 awakened
>>
>> Of course, the mount is still there until I manually execute the umount
>> command:
>>
>> [root@vdicnode04 /]# mount | grep ceph
>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on
>> /mnt/vdicube_ceph_fs type ceph
>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
>> [root@vdicnode04 /]# umount -f -l /mnt/vdicube_ceph_fs
>> [root@vdicnode04 /]# mount | grep ceph
>> [root@vdicnode04 /]#
>>
>> Even in this situation, the monit status is still "stop pending":
>>
>> [root@vdicnode04 /]# monit status
>> Monit 5.25.1 uptime: 4m
>>
>> Filesystem 'check_cephfs'
>>   status                        OK - stop pending
>>   monitoring status             Monitored
>>
>> Any help will be welcome!
>>
>> Óscar.
>>
>>
>> On Thu, 7 Mar 2019 at 22:06, [email protected] (<[email protected]>) wrote:
>>
>>> Hi,
>>>
>>> we didn't test with ceph, but you can try a generic configuration, for
>>> example:
>>>
>>> check filesystem myfs with path /mydata
>>>     start program = ... #note: set the start command (mount)
>>>     stop program = ... #note: set the stop command (umount)
>>>
>>> It is possible that monit won't be able to collect I/O statistics ... in
>>> that case we can implement support for ceph.
>>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> > On 7 Mar 2019, at 15:55, Oscar Segarra <[email protected]> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I'd like to mount a cephfs filesystem when it is available (just
>>> > checking the ceph metadata server tcp port).
>>> >
>>> > And, when powering off the server, I'd like to force the umount of the
>>> > previous cephfs volume if it is already mounted. This is because, if the
>>> > ceph metadata server is not available, the server loops infinitely trying
>>> > to umount the cephfs mount point.
>>> >
>>> > Can these two use cases be implemented with monit?
>>> >
>>> > Thanks a lot in advance
>>> > Óscar
>>> > --
>>> > To unsubscribe:
>>> > https://lists.nongnu.org/mailman/listinfo/monit-general
>>
>>
>>
>> --
>> Paul Theodoropoulos
>> www.anastrophe.com
>
>
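
The original request in this thread (act on whether the Ceph servers answer
on their TCP port, rather than on whether the mount entry exists) could
perhaps be approached with a small reachability probe that monit runs as a
"check program". What follows is only a sketch and has not been tested
against Ceph: the script name ceph-mon-reachable.sh, the function names and
the CONNECT_TIMEOUT variable are made up for illustration, and the monit
stanza in the header comment is an untested assumption. It does not explain
or fix the "stop pending" state reported above.

```shell
#!/bin/bash
# Hypothetical probe: exit 0 if at least one HOST:PORT endpoint accepts a
# TCP connection, 1 otherwise. Endpoints are passed as arguments.
#
# Possible monit usage (assumption, untested):
#   check program ceph_mons with path "/usr/local/bin/ceph-mon-reachable.sh 192.168.100.104:6789 192.168.100.105:6789 192.168.100.106:6789"
#       if status != 0 then exec "/bin/umount -f -l /mnt/vdicube_ceph_fs"

CONNECT_TIMEOUT="${CONNECT_TIMEOUT:-2}"   # seconds allowed per attempt

# port_open HOST PORT: succeeds if a TCP connection can be opened in time.
# Uses bash's /dev/tcp redirection, wrapped in coreutils "timeout" so an
# unreachable host cannot block the check indefinitely.
port_open() {
    timeout "$CONNECT_TIMEOUT" \
        bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# any_monitor_reachable HOST:PORT...: succeeds as soon as one endpoint answers.
any_monitor_reachable() {
    local endpoint
    for endpoint in "$@"; do
        # ${endpoint%:*} is the host part, ${endpoint##*:} the port part.
        if port_open "${endpoint%:*}" "${endpoint##*:}"; then
            return 0
        fi
    done
    return 1
}

# Only probe when endpoints were given on the command line, so the file can
# also be sourced to reuse the functions.
if [ "$#" -gt 0 ]; then
    any_monitor_reachable "$@"
    exit $?
fi
```

The probe only tells monit that no monitor port answers; whether a forced
lazy umount is the right reaction, and whether it avoids the hang at
poweroff, would still have to be verified on the client.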
