This seems like a lot of work, especially when I have to monitor more than 
2500 servers. :P 
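
For reference, the probe script suggested further down the thread could look
roughly like the sketch below. It is only an illustration under several
assumptions: it uses a read-only stat instead of the write test that was
proposed, the metric names nfs_mount_accessible and
nfs_mount_probe_duration_seconds are made up for this example, and the output
file has to land in whatever directory node_exporter's
--collector.textfile.directory flag points at. The stat runs in a child
process so that a hung mount blocks the child rather than the probe itself.

```python
#!/usr/bin/env python3
"""Sketch: probe NFS mounts with a timeout, write textfile-collector metrics."""
import os
import subprocess
import sys
import tempfile
import time


def probe_mount(path, timeout=5.0):
    """Return (accessible, seconds) for one mount point."""
    start = time.monotonic()
    proc = subprocess.Popen(
        ["stat", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        ok = proc.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Do not wait after kill(): a child stuck in uninterruptible sleep
        # on a hard mount may not exit until the server comes back.
        proc.kill()
        ok = False
    return ok, time.monotonic() - start


def _escape(value):
    """Escape a label value for the Prometheus text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(results):
    """Format {mount: (accessible, seconds)} as Prometheus text metrics."""
    # Metric names are hypothetical, chosen for this example only.
    lines = [
        "# HELP nfs_mount_accessible 1 if stat returned before the timeout.",
        "# TYPE nfs_mount_accessible gauge",
    ]
    for mount, (ok, _) in sorted(results.items()):
        lines.append(
            'nfs_mount_accessible{mountpoint="%s"} %d' % (_escape(mount), ok)
        )
    lines += [
        "# HELP nfs_mount_probe_duration_seconds Time spent on the stat probe.",
        "# TYPE nfs_mount_probe_duration_seconds gauge",
    ]
    for mount, (_, seconds) in sorted(results.items()):
        lines.append(
            'nfs_mount_probe_duration_seconds{mountpoint="%s"} %.3f'
            % (_escape(mount), seconds)
        )
    return "\n".join(lines) + "\n"


def write_atomically(target, text):
    """Write to a temp file and rename it, so a scrape never sees half a file."""
    directory = os.path.dirname(target) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp, 0o644)  # mkstemp creates the file with mode 0600
    os.replace(tmp, target)


def main(output, mounts):
    """Probe every mount and write the results to the .prom file."""
    results = {mount: probe_mount(mount) for mount in mounts}
    write_atomically(output, render_metrics(results))


if __name__ == "__main__":
    if len(sys.argv) >= 3:
        main(sys.argv[1], sys.argv[2:])
    else:
        print("usage: nfs_probe.py OUTPUT.prom MOUNT [MOUNT ...]")
```

Run from cron, this would be called with the output file first and the mounts
after it, for example (all paths hypothetical):
nfs_probe.py /var/lib/node_exporter/textfile/nfs_probe.prom /mnt/nfs1 /mnt/nfs2
An alert on nfs_mount_accessible == 0 would then cover rarely used mounts as
well, since the probe does not depend on existing IO. It still has to be
deployed to every server, which is the cost being discussed here.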

On Tuesday, March 3, 2020 at 10:49:57 PM UTC+5:30, sayf eddine Hammemi 
wrote:
>
> If node_exporter logs errors when the NFS share hangs, then you can use 
> mtail, for example, to scrape the node_exporter log files and export NFS 
> errors. That would be better than using a hand-made script.
>
> On Tue, Mar 3, 2020, 18:12 Ben Kochie <[email protected]> wrote:
>
>> We added some mitigation for filesystem hangs. The node_exporter will 
>> notice a stuck filesystem and stop attempting to gather metrics from it 
>> until it gets un-stuck. Although, I don't think we have any metrics for 
>> when that happens, only log errors.
>>
>> On Tue, Mar 3, 2020 at 6:03 PM Serkan Çoban <[email protected]> wrote:
>>
>>> If I remember correctly, node exporter will hang too when an NFS share
>>> hangs. Maybe you can test it...
>>>
>>> On Tue, Mar 3, 2020 at 6:26 PM Yagyansh S. Kumar
>>> <[email protected]> wrote:
>>> >
>>> > I also thought about doing the same, but I am keeping that as a last 
>>> resort because that would require me to push the script to all my 2500+ 
>>> servers.
>>> >
>>> > On Tuesday, March 3, 2020 at 8:46:27 PM UTC+5:30, Murali Krishna 
>>> Kanagala wrote:
>>> >>
>>> >> I would write a small shell script that tries to write to the NFS 
>>> >> mount path and writes the status to a file that can be read by the 
>>> >> textfile collector. Then schedule that shell script with cron. I think 
>>> >> this is the easiest solution.
>>> >>
>>> >> On Tue, Mar 3, 2020, 9:12 AM Yagyansh S. Kumar <[email protected]> 
>>> wrote:
>>> >>>
>>> >>> I have already enabled the nfs and nfsd collectors. So far I haven't 
>>> >>> found anything that accurately tells me about an NFS hang.
>>> >>> Correct me if I am wrong, but I don't think disk IO is a good 
>>> >>> indicator of an NFS hang, as there may be times when no activity is 
>>> >>> happening on the NFS, but that does not mean that the NFS is hung. 
>>> >>> (E.g. I have 25 NFS mounts on one of my servers, and some of them are 
>>> >>> rarely used, so we won't find any substantial IO on those mounts, but 
>>> >>> I need to know whether they are accessible or not.) Still, thanks for 
>>> >>> the suggestion, I will try it out.
>>> >>>
>>> >>>
>>> >>> On Tuesday, March 3, 2020 at 8:35:03 PM UTC+5:30, Murali Krishna 
>>> Kanagala wrote:
>>> >>>>
>>> >>>> Try enabling the nfs options in the node exporter config. It will 
>>> spit out some metrics about the nfs status.
>>> >>>>
>>> >>>> Also look at the disk IO metrics from node exporter; if you see no 
>>> >>>> activity, that indicates the NFS is not doing anything.
>>> >>>>
>>> >>>> On Tue, Mar 3, 2020, 7:10 AM Yagyansh S. Kumar <
>>> [email protected]> wrote:
>>> >>>>>
>>> >>>>> I want to check whether the NFS is hung (i.e. whether it is 
>>> >>>>> accessible from the server or not, and if so, what response time it 
>>> >>>>> is getting). I know that with the mountstats and nfs collectors we 
>>> >>>>> have a lot of metrics for NFS, but I haven't found any that reliably 
>>> >>>>> tells me every time the NFS hangs.
>>> >>>>> Thanks in advance.
>>> >>>>>
>>> >>>>> --
>>> >>>>> You received this message because you are subscribed to the Google 
>>> Groups "Prometheus Users" group.
>>> >>>>> To unsubscribe from this group and stop receiving emails from it, 
>>> send an email to [email protected].
>>> >>>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/prometheus-users/06929518-d3b5-4c2f-9490-b08cc664d26b%40googlegroups.com
>>> .

