Hi,

Any clue about how to fix this issue?

Thanks a lot

On Fri, 8 Mar 2019 at 22:08, Oscar Segarra (<[email protected]>)
wrote:

> Hi,
>
> I have tried executing the monit process as
>
> monit -vvI
>
> And I get the following messages:
>
> Cannot read proc file '/proc/33989/attr/current' -- Invalid argument
> Cannot read proc file '/proc/41/attr/current' -- Invalid argument
> Cannot read proc file '/proc/42/attr/current' -- Invalid argument
> Cannot read proc file '/proc/43/attr/current' -- Invalid argument
> Cannot read proc file '/proc/44/attr/current' -- Invalid argument
> Cannot read proc file '/proc/45/attr/current' -- Invalid argument
> Cannot read proc file '/proc/47/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5029/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5032/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5034/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5036/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5037/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5038/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5039/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5041/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5094/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5104/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5105/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5180/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5181/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5185/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5192/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6/attr/current' -- Invalid argument
> Cannot read proc file '/proc/60/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6032/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6035/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6043/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6046/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6057/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6059/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6069/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6082/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6931/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7113/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7114/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7116/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7118/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7127/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7196/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7441/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7450/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7451/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8450/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8776/attr/current' -- Invalid argument
> Cannot read proc file '/proc/9/attr/current' -- Invalid argument
> Cannot read proc file '/proc/91/attr/current' -- Invalid argument
> 'check_cephfs' stop on user request
> Monit daemon with PID 33978 awakened
>
> And of course, monit gets stuck in stop pending:
>
> Filesystem 'check_cephfs'
>   status                       *OK - stop pending*
>   monitoring status            Monitored
>   monitoring mode              active
>   on reboot                    start
>   filesystem type              ceph
>   filesystem flags
>  rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
>   permission                   755
>   uid                          27
>   gid                          27
>   block size                   4 MB
>   space total                  224 MB (of which 0.0% is reserved for root
> user)
>   space free for non superuser 60 MB [26.8%]
>   space free total             60 MB [26.8%]
>   inodes total                 165
>   inodes free                  -1 [-0.6%]
>   data collected               Fri, 08 Mar 2019 21:59:26
>
> System 'vdicnode04'
>   status                       OK
>   monitoring status            Monitored
>   monitoring mode              active
>   on reboot                    start
>   load average                 [0.21] [0.25] [0.58]
>   cpu                          0.5%us 0.7%sy 0.0%wa
>   memory usage                 706.1 MB [38.8%]
>   swap usage                   264 kB [0.0%]
>   uptime                       27m
>   boot time                    Fri, 08 Mar 2019 21:39:18
>   data collected               Fri, 08 Mar 2019 21:59:26
>
> I don't know whether those "Cannot read proc file" errors are the cause of
> the eternal "stop pending".
>
> Thanks a lot in advance to everybody,
> Óscar
>
> On Fri, 8 Mar 2019 at 12:43, Oscar Segarra (<[email protected]>)
> wrote:
>
>> Hi Paul,
>>
>> The problem is not starting or stopping the ceph server modules. The
>> problem is on the client side, where I want to be able to power off my
>> client machine even when the cephfs servers are not available.
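>> For reference, one way to keep the umount from blocking the poweroff is to
>> wrap it in a timeout with a forced, lazy fallback. This is only a sketch
>> (the script name is hypothetical and the 10-second limit is an arbitrary
>> choice; the mount point is the one used later in this thread):

```shell
#!/bin/bash
# force-umount-cephfs.sh (hypothetical name)
# Try a clean umount first; if the ceph monitors are dead and the umount
# hangs, `timeout` kills it after 10 seconds and we fall back to a
# forced, lazy umount so the shutdown can proceed.
MNT=/mnt/vdicube_ceph_fs   # mount point assumed from this thread
timeout 10 umount "$MNT" 2>/dev/null || umount -f -l "$MNT"
```

>> Pointing monit's stop program at a script like this, instead of at the
>> bare umount, should give the same non-blocking behaviour on shutdown.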
>>
>> Thanks a lot
>>
>> On Fri, 8 Mar 2019, 1:49, Paul Theodoropoulos <[email protected]>
>> wrote:
>>
>>> I've zero experience with ceph, however -
>>>
>>> What about just incorporating ceph's own status-checking facilities as
>>> the trigger, instead of watching the mount? For example:
>>>
>>> monit monitor:
>>>
>>> check program ceph-status with path /usr/local/bin/ceph-status.sh
>>> start program = "/bin/systemctl start ceph.target"
>>> stop  program = "/bin/systemctl stop ceph\*.service ceph\*.target"
>>> if status != 0 then restart
>>>
>>> ceph-status.sh:
>>>
>>> #!/bin/bash
>>> ceph status >/dev/null 2>&1
>>>
>>> As I said, I have no experience with ceph, just had a quick look at some
>>> of the documentation, so I could be completely wrong about the feasibility
>>> of this...
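>>> One caveat with a script like that: `ceph status` can block for a long
>>> time when the monitors are unreachable, which would make the monit check
>>> itself hang. A defensive variant (just a sketch; the 10-second limit is
>>> an arbitrary choice) wraps it in `timeout`:

```shell
#!/bin/bash
# ceph-status.sh -- exit 0 if the cluster answers, non-zero otherwise.
# `timeout` kills `ceph status` after 10 seconds (exit code 124), so
# the monit check always returns promptly even with dead monitors.
timeout 10 ceph status >/dev/null 2>&1
```

>>> With that, the `if status != 0 then restart` rule fires both when the
>>> cluster reports an error and when it stops answering at all.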
>>>
>>>
>>> On 3/7/19 15:46, Oscar Segarra wrote:
>>>
>>> Hi Martin,
>>>
>>> Thanks a lot for your quick response.
>>>
>>> I have been running some tests, but it looks like your approach does not
>>> work at all:
>>>
>>> This is my simple configuration:
>>>
>>> cat << EOT > /etc/monit.d/monit_vdicube
>>> check filesystem check_cephfs with path /mnt/vdicube_ceph_fs
>>>     start program  = "/bin/mount -t ceph -o name=admin -o
>>> secret=AQDenzBcEyQ8BBAABQjoGn3DTnKN2v5hZm7gMw== 192.168.100.104:6789,
>>> 192.168.100.105:6789,192.168.100.106:6789:/ /mnt/vdicube_ceph_fs"
>>>     stop program   = "/bin/umount -f -l /mnt/vdicube_ceph_fs"
>>>     IF CHANGED FSFLAGS THEN start
>>> EOT
>>>
>>> In this case, when the ceph monitor servers (192.168.100.104:6789,
>>> 192.168.100.105:6789, 192.168.100.106:6789) are available, everything
>>> works fine: start, stop and restart all work great.
>>>
>>> Nevertheless, if I lose connectivity with the ceph servers (I stop them
>>> manually), the monit service doesn't notice and keeps showing "OK", even
>>> though none of the data inside can actually be accessed. This may be
>>> expected, because the mount entry is still there:
>>>
>>> [root@vdicnode04 mnt]# mount | grep ceph
>>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on
>>> /mnt/vdicube_ceph_fs type ceph
>>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
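>>> Since the kernel keeps the mount entry even after the monitors go away,
>>> `check filesystem` alone will keep reporting OK. A separate `check
>>> program` that actually touches the mount point, with a timeout, could
>>> detect the hang. A sketch (the script name and the 5-second limit are
>>> assumptions):

```shell
#!/bin/bash
# cephfs-alive.sh (hypothetical) -- exit 0 if the mount point answers,
# non-zero if stat blocks on dead monitors (timeout exits 124) or the
# path is missing.
timeout 5 stat -t /mnt/vdicube_ceph_fs >/dev/null 2>&1
```

>>> and in the monit configuration something like:
>>>
>>> check program cephfs_alive with path /usr/local/bin/cephfs-alive.sh
>>>     if status != 0 then exec "/bin/umount -f -l /mnt/vdicube_ceph_fs"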
>>>
>>> In this scenario, if I execute the stop command manually as root from the
>>> command line:
>>>
>>> /bin/umount -f -l /mnt/vdicube_ceph_fs
>>>
>>> it unmounts the FS immediately. However, if I stop it using the monit CLI:
>>>
>>> [root@vdicnode04 /]# monit stop check_cephfs
>>> [root@vdicnode04 /]# monit status
>>> Monit 5.25.1 uptime: 4m
>>>
>>> Filesystem 'check_cephfs'
>>>   status                       OK - stop pending
>>>   monitoring status            Monitored
>>>   monitoring mode              active
>>>   on reboot                    start
>>>   filesystem type              ceph
>>>   filesystem flags
>>>  rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
>>>   permission                   755
>>>   uid                          27
>>>   gid                          27
>>>   block size                   4 MB
>>>   space total                  228 MB (of which 0.0% is reserved for
>>> root user)
>>>   space free for non superuser 64 MB [28.1%]
>>>   space free total             64 MB [28.1%]
>>>   inodes total                 165
>>>   inodes free                  -1 [-0.6%]
>>>   data collected               Fri, 08 Mar 2019 00:35:28
>>>
>>> System 'vdicnode04'
>>>   status                       OK
>>>   monitoring status            Monitored
>>>   monitoring mode              active
>>>   on reboot                    start
>>>   load average                 [0.02] [0.20] [0.21]
>>>   cpu                          1.2%us 1.0%sy 0.0%wa
>>>   memory usage                 514.8 MB [28.3%]
>>>   swap usage                   0 B [0.0%]
>>>   uptime                       59m
>>>   boot time                    Thu, 07 Mar 2019 23:40:21
>>>   data collected               Fri, 08 Mar 2019 00:35:28
>>>
>>> [root@vdicnode04 /]#
>>>
>>> it gets stuck in the "stop pending" status.
>>>
>>> In the logs I can see the following:
>>>
>>> [CET Mar  8 00:39:55] info     : 'check_cephfs' stop on user request
>>> [CET Mar  8 00:39:55] info     : Monit daemon with PID 121791 awakened
>>>
>>> Of course, the mount is still there until I manually execute the umount
>>> command:
>>>
>>> [root@vdicnode04 /]# mount | grep ceph
>>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on
>>> /mnt/vdicube_ceph_fs type ceph
>>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
>>> [root@vdicnode04 /]# umount -f -l /mnt/vdicube_ceph_fs
>>> [root@vdicnode04 /]# mount | grep ceph
>>> [root@vdicnode04 /]#
>>>
>>> Even in this situation, the monit status is still "stop pending":
>>>
>>> [root@vdicnode04 /]# monit status
>>> Monit 5.25.1 uptime: 4m
>>>
>>> Filesystem 'check_cephfs'
>>>   status                       OK - stop pending
>>>   monitoring status            Monitored
>>>
>>> Any help will be welcome!
>>>
>>> Óscar.
>>>
>>>
>>> On Thu, 7 Mar 2019 at 22:06, [email protected] (<
>>> [email protected]>) wrote:
>>>
>>>> Hi,
>>>>
>>>> we didn't test with ceph, but you can try a generic configuration, for
>>>> example:
>>>>
>>>>         check filesystem myfs with path /mydata
>>>>                 start program = ...    #note: set the start command (mount)
>>>>                 stop program = ...     #note: set the stop command (umount)
>>>>
>>>> It is possible that monit won't be able to collect I/O statistics ...
>>>> in that case we can implement support for ceph.
>>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>> > On 7 Mar 2019, at 15:55, Oscar Segarra <[email protected]>
>>>> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > I'd like to mount a cephfs filesystem when it is available (just by
>>>> checking the ceph metadata server's TCP port).
>>>> >
>>>> > And, on powering off the server, I'd like to force-umount the
>>>> previously mounted cephfs volume if it is still mounted. This is because
>>>> if the ceph metadata server is not available, the server loops infinitely
>>>> trying to umount the cephfs mount point.
>>>> >
>>>> > Can these two use cases be implemented with monit?
>>>> >
>>>> > Thanks a lot in advance
>>>> > Óscar
>>>> > --
>>>> > To unsubscribe:
>>>> > https://lists.nongnu.org/mailman/listinfo/monit-general
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Paul Theodoropoulos
>>> www.anastrophe.com
>>>
>>
>>
