Just FYI, there are a few other hosts in this cluster where node_exporter is running fine without any issues. We started the process with systemctl; here is the service file:
# cat /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter

[Service]
User=prometheus
ExecStart=/usr/local/bin/node_exporter --collector.filesystem --collector.netdev --collector.cpu --collector.diskstats --collector.mdadm --collector.loadavg --collector.time --collector.uname --collector.logind --collector.textfile.directory=/var/lib/node_exporter/textfile_collector --collector.systemd

[Install]
WantedBy=default.target

Also here is the stack trace:

[root@mesosagent13 ~]# cat /proc/53547/stack
[<ffffffffb8c9e04b>] do_exit+0x6bb/0xa40
[<ffffffffb8c9e44f>] do_group_exit+0x3f/0xa0
[<ffffffffb8caf24e>] get_signal_to_deliver+0x1ce/0x5e0
[<ffffffffb8c2b527>] do_signal+0x57/0x6f0
[<ffffffffb8c2bc32>] do_notify_resume+0x72/0xc0
[<ffffffffb9375124>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
[root@mesosagent13 ~]#

On Tue, Jul 21, 2020 at 12:48 PM Christian Hoffmann <[email protected]> wrote:

> Hi,
>
> On 7/21/20 9:34 PM, Lakshman Savadamuthu wrote:
> > Thanks for the reply Christian.
> > Looks like the node_exporter is in defunct state, i can't even stop the
> > process now.
> >
> > Here is the version:
> >
> > [root@mesosagent13 ~]# /usr/local/bin/node_exporter --version
> > node_exporter, version 0.17.0 (branch: master, revision: 36e3b2a923e551830b583ecd43c8f9a9726576cf)
>
> Meanwhile, the latest version is 1.0.1, so updating might be worth a try
> (although I don't know of any fixes specific to your issue).
>
> > [root@mesosagent13 ~]# ps -aef | grep node_exporter
> > root      8600 61971  0 12:31 pts/0  00:00:00 grep --color=auto node_exporter
> > prometh+ 53547     1 20 Jun22 ?      6-02:57:16 [node_exporter] <defunct>
> > [root@mesosagent13 ~]#
> >
> > Tried killing the process also using pkill -f option, that also didnt
> > help.
>
> Hrm, this usually sounds like the process invoking node_exporter has not
> recognized the exit properly yet. Is this from the start using systemd?
> Can you share the unit file?
>
> Or is this from a manual start? Could it be that you had backgrounded
> the process using "&" or using Ctrl+Z? If so, try foregrounding it (fg)
> so that the shell can properly handle the exit.
>
> You can try to look at what this process was doing lastly by running
> cat /proc/53547/stack
>
> But I suspect that it will not lead to anything useful.
>
> I think this may just be a dead process table entry. If nothing helps,
> you could reboot. In any case, this shouldn't prevent you from running
> further tests (e.g. it should not block the listening port or anything).
>
> Kind regards,
> Christian
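
For anyone following along, the "defunct" state and the parent PID discussed above can also be read directly from /proc. The following is only a minimal sketch, assuming a Linux host where /proc/<pid>/stat and /proc/<pid>/task/<tid>/stat follow the layout described in proc(5). The names parse_stat, read_proc_state and thread_states are made up for this illustration and are not part of node_exporter or any library. Listing the per-thread states is included because a defunct main thread can coexist with other threads that have not finished exiting; whether that is what happened here is not established by the output in this thread.

```python
import os
from typing import List, NamedTuple


class ProcState(NamedTuple):
    pid: int
    comm: str
    state: str  # single letter from proc(5), e.g. "R", "S", "D", "Z"
    ppid: int


def parse_stat(line: str) -> ProcState:
    """Parse the first four fields of a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    return ProcState(pid=pid, comm=comm, state=rest[0], ppid=int(rest[1]))


def read_proc_state(pid: int) -> ProcState:
    """Read and parse /proc/<pid>/stat (hypothetical helper)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat(f.read())


def is_zombie(p: ProcState) -> bool:
    """State 'Z' is what ps displays as <defunct>."""
    return p.state == "Z"


def thread_states(pid: int) -> List[ProcState]:
    """Return the state of every thread listed under /proc/<pid>/task."""
    states = []
    for tid in sorted(os.listdir(f"/proc/{pid}/task"), key=int):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            states.append(parse_stat(f.read()))
    return states
```

On the affected host, read_proc_state(53547) would be expected to report state "Z" with ppid 1, in line with the ps output quoted above; thread_states(53547) would then show whether any other thread of that process is still in a non-zombie state.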

