Re: [prometheus-users] Replicas more than 2 corrupting the wal directory

Stuart Clark Tue, 22 Dec 2020 00:36:52 -0800

On 22/12/2020 04:52, Mohan Nagandlla wrote:

Hi team, I am running a Prometheus instance with more than 1 replica. With 1 replica there is no WAL corruption in the data directory. For the sake of zero-downtime updates I set the replica count to 2. The instance is up, but WAL corruption is now happening. The logs are below:

level=error ts=2020-12-22T04:36:00.860Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/node-exporter/0 target=http://x.x.x.x:9100/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.862Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.881Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.898Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.970Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/node-exporter/0 target=http://x.x.x.x:9100/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"

I am getting many more log lines like these. With one replica there are no errors, but with more than 1 replica I get the errors above.
Is there another way to get zero-downtime updates with Prometheus, and why am I getting these errors? With 1 replica there are no errors in the data directory; this only happens with more than 1 replica.

Prometheus must not share a data directory with another running instance, as you will see data corruption. Each Prometheus instance must have a unique data directory. Additionally, NFS isn't supported, so you should use a local hard drive/EBS volume.
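
As a rough sketch only, assuming the replicas are run on Kubernetes (the original message does not say how they are deployed), one way to give each instance its own data directory is a StatefulSet with a volume claim template, so that every replica gets a separate volume instead of sharing one NFS mount. The resource names, the storage class `local-ssd`, the volume size, and the headless Service are all hypothetical and would need adapting:

```yaml
# Hypothetical sketch: names, storage class and size are assumptions,
# not taken from the original poster's setup.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  serviceName: prometheus          # assumes a headless Service with this name exists
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus   # pin a specific version in practice
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
          volumeMounts:
            - name: data
              mountPath: /prometheus
  # One PersistentVolumeClaim is created per replica, so no two
  # instances ever write to the same WAL directory.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-ssd   # hypothetical; should be local/block storage, not NFS
        resources:
          requests:
            storage: 50Gi
```

With a layout like this, each replica scrapes the same targets independently and keeps its own copy of the data, so one can be restarted while the other keeps running.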


