Infrastructure information:

Helm chart: "stable/prometheus"
version: 10.0.1
appVersion: 2.15.2
AWS EKS with Kubernetes 1.14

*Problem:*
Query results from the Prometheus server are not consistent on the Grafana 
side when we run multiple Prometheus server replicas for HA. So I want to 
achieve consistent storage access together with high availability.

*Scenario-1*
ReplicaCount=1
statefulSet=false
Created PersistentVolume Count = 1
The application works as expected, but it is not HA. 
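
For reference, a minimal sketch of the values I pass for this scenario. The 
server.* key names below are what I believe the stable/prometheus chart 
uses; treat the exact names and sizes as assumptions rather than my literal 
file:

    server:
      replicaCount: 1        # single Prometheus server pod
      statefulSet:
        enabled: false       # plain Deployment
      persistentVolume:
        enabled: true        # one PVC/PV backing the TSDB
        size: 8Gi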

*Scenario-2*
ReplicaCount=3
statefulSet=false
Created PersistentVolume Count=1
In this scenario, only one pod is running and the other pods are throwing a 
TSDB lock error from the Prometheus server. 
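
Only the replica count changes from Scenario-1. My understanding (an 
assumption, not something I have verified in the chart source) is that all 
three Deployment replicas try to mount the same EBS-backed ReadWriteOnce 
volume and open the same TSDB directory, so only one pod can acquire the 
data-directory lock and the others fail:

    server:
      replicaCount: 3        # all 3 pods point at the same PVC
      statefulSet:
        enabled: false
      persistentVolume:
        enabled: true        # ReadWriteOnce: only one pod can hold the
                             # TSDB directory lock; the rest error out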

*Scenario-2.1*

ReplicaCount=3
statefulSet=false
Created PersistentVolume Count=1
with "--storage.tsdb.no-lockfile"

It does create 3 pods, but all of them except one throw Go runtime errors 
from the Prometheus server.
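
I passed the flag through what I believe is the chart's extraFlags list 
(again, the key names here are an assumption):

    server:
      replicaCount: 3
      statefulSet:
        enabled: false
      extraFlags:
        - web.enable-lifecycle
        - storage.tsdb.no-lockfile   # disables the TSDB lock file
      persistentVolume:
        enabled: true

My guess is that removing the lock file only hides the guard: several 
Prometheus processes still write to the same TSDB directory, which would 
explain the errors.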

*Scenario-3*
ReplicaCount=3
statefulSet=true
Created PersistentVolume Count=3

So in this scenario we have 3 replica pods, each with its own persistent 
volume through the StatefulSet, which is the configuration recommended by 
the community. But this configuration still does not give consistent 
metrics in Grafana, because queries are not pinned to a single replica and 
each replica's data differs slightly.
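
A sketch of the values for this scenario (same assumed key names); the 
StatefulSet gives each pod its own PVC via volumeClaimTemplates:

    server:
      replicaCount: 3
      statefulSet:
        enabled: true          # one PVC per pod -> 3 separate PVs
      persistentVolume:
        enabled: true
        size: 8Gi

Since every replica scrapes the targets independently, the three TSDBs are 
never byte-identical, so consecutive Grafana queries that land on different 
replicas can return slightly different series.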

*Question: How can I achieve HA of the Prometheus server, together with a 
planned HPA or VPA, while keeping persistent storage on Kubernetes?*

*Resolution: I am thinking about handling this problem with the 
below-mentioned approaches:*

   1. Use Deployment replicas for the pods with a single persistent volume, 
   but this is not working, or maybe my configuration is incorrect for this 
   setup.
   2. Expose the Prometheus server behind an Application Load Balancer with 
   sticky sessions and use the ALB URL in Grafana to get consistent query 
   results; but I still believe it would behave differently once requests 
   land on different pods (see the Ingress sketch after this list).
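
For option 2, a rough sketch of the Ingress I have in mind, assuming the 
aws-alb-ingress-controller and its target-group stickiness annotation; the 
resource name, service name, and attribute values are assumptions, not a 
tested manifest:

    apiVersion: extensions/v1beta1        # Kubernetes 1.14-era Ingress API
    kind: Ingress
    metadata:
      name: prometheus-server             # hypothetical name
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internal
        alb.ingress.kubernetes.io/target-type: ip
        # assumed annotation for ALB cookie-based stickiness
        alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true,stickiness.lb_cookie.duration_seconds=3600
    spec:
      rules:
        - http:
            paths:
              - path: /*
                backend:
                  serviceName: prometheus-server   # the chart's server Service
                  servicePort: 80

Even with stickiness, a failover or cookie expiry would move Grafana to a 
replica whose data differs slightly, so I expect the inconsistency to 
reappear.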


Has anyone faced this issue before? If yes, how did you achieve HA together 
with persistent storage on Kubernetes?
If my configuration is incorrect for this requirement, can you provide a 
recommended approach and configuration to handle it?

Please feel free to ask if anything is missing from my explanation of the 
current implementation. Thanks in advance. 
 
