We added some mitigation for filesystem hangs. The node_exporter will
notice a stuck filesystem and stop attempting to gather metrics from it
until it gets unstuck. However, I don't think we have any metrics for
when that happens, only log errors.
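
For the cron plus textfile collector approach suggested further down the
thread, a rough sketch in Python might look like the following. Everything
named here is an assumption for illustration, not something node_exporter
ships: the metric names nfs_mount_accessible and
nfs_mount_probe_duration_seconds, the helper names, and the 5 second
timeout. The stat runs in a child process so that a hung mount blocks the
child rather than the script itself.

```python
import os
import subprocess
import sys
import tempfile
import time


def probe_mount(path, timeout_s=5.0):
    """Stat `path` in a child process; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        ok = child.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a child stuck on a hard NFS mount may not die
        # until the mount recovers, so we do not wait for it here.
        child.kill()
        ok = False
    return ok, time.monotonic() - start


def _escape(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(results):
    """Render {mountpoint: (ok, seconds)} as textfile collector input."""
    lines = [
        "# HELP nfs_mount_accessible 1 if a stat finished within the timeout.",
        "# TYPE nfs_mount_accessible gauge",
    ]
    for mount, (ok, _) in sorted(results.items()):
        lines.append(
            'nfs_mount_accessible{mountpoint="%s"} %d' % (_escape(mount), int(ok))
        )
    lines += [
        "# HELP nfs_mount_probe_duration_seconds Time spent on the stat probe.",
        "# TYPE nfs_mount_probe_duration_seconds gauge",
    ]
    for mount, (_, seconds) in sorted(results.items()):
        lines.append(
            'nfs_mount_probe_duration_seconds{mountpoint="%s"} %.3f'
            % (_escape(mount), seconds)
        )
    return "\n".join(lines) + "\n"


def write_atomically(out_path, text):
    """Write via a temp file and rename so a scrape never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(out_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; the exporter must read it
    os.replace(tmp_path, out_path)


def write_nfs_metrics(mounts, out_path, timeout_s=5.0):
    """Probe every mount and publish the results to `out_path`."""
    results = {mount: probe_mount(mount, timeout_s) for mount in mounts}
    write_atomically(out_path, render_metrics(results))
```

A cron job could then call something like
write_nfs_metrics(["/mnt/a", "/mnt/b"], "/path/to/textfile/dir/nfs.prom"),
where the paths are placeholders and the output directory is whatever
--collector.textfile.directory points at (the file name must end in .prom).
The probe duration doubles as a crude response-time measurement.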

On Tue, Mar 3, 2020 at 6:03 PM Serkan Çoban <[email protected]> wrote:

> If I remember correctly, node exporter will hang too when an NFS share
> hangs. Maybe you can test it...
>
> On Tue, Mar 3, 2020 at 6:26 PM Yagyansh S. Kumar
> <[email protected]> wrote:
> >
> > I also thought about doing the same, but I am keeping that as a last
> > resort because that would require me to push the script to all my 2500+
> > servers.
> >
> > On Tuesday, March 3, 2020 at 8:46:27 PM UTC+5:30, Murali Krishna
> Kanagala wrote:
> >>
> >> I would write a small shell script that tries to write to the NFS
> >> mount path and writes the status to a file that can be read by the
> >> textfile collector, and then schedule that script with cron. I think
> >> this is the easiest solution.
> >>
> >> On Tue, Mar 3, 2020, 9:12 AM Yagyansh S. Kumar <[email protected]>
> wrote:
> >>>
> >>> I have already enabled the nfs and nfsd collectors. So far I haven't
> >>> found anything that can accurately tell me about an NFS hang.
> >>> Correct me if I am wrong, but I don't think disk IO is a good
> >>> indicator of an NFS hang, as there may be times when no activity is
> >>> happening on the NFS, but that does not mean the NFS is hung. (E.g. I
> >>> have 25 NFS mounts on one of my servers, and some of them are rarely
> >>> used, so we won't find any substantial IO on those mounts, but I need
> >>> to know whether they are accessible or not.) Still, thanks for the
> >>> suggestion, I will try it out.
> >>>
> >>>
> >>> On Tuesday, March 3, 2020 at 8:35:03 PM UTC+5:30, Murali Krishna
> Kanagala wrote:
> >>>>
> >>>> Try enabling the NFS options in the node exporter config. It will
> >>>> spit out some metrics about the NFS status.
> >>>>
> >>>> Also look at the disk IO metrics from node exporter. If you see no
> >>>> activity, that indicates the NFS is not doing anything.
> >>>>
> >>>> On Tue, Mar 3, 2020, 7:10 AM Yagyansh S. Kumar <[email protected]>
> wrote:
> >>>>>
> >>>>> I want to check whether the NFS is hung (i.e. whether it is
> >>>>> accessible from the server or not, and if yes, what response time
> >>>>> it is getting). I know the mountstats and nfs collectors give us a
> >>>>> lot of NFS metrics, but I haven't found any that reliably tells me
> >>>>> every time the NFS hangs.
> >>>>> Thanks in advance.
> >>>>>
> >>>>> --
> >>>>> You received this message because you are subscribed to the Google
> Groups "Prometheus Users" group.
> >>>>> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> >>>>> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/06929518-d3b5-4c2f-9490-b08cc664d26b%40googlegroups.com
> .

