Oof, just noticed that the images do not load in some email clients 😬

The proposal can also be seen at this Pull Request
<https://github.com/prometheus-operator/prometheus-operator/pull/5497>,
with the images :)

On Fri, Apr 14, 2023 at 10:00, Arthur Silva Sens <
arthursens2...@gmail.com> wrote:

> Hi everybody, I'm Arthur from the Prometheus-Operator team.
>
> We've recently added support for running Prometheus in Agent mode with
> Prometheus-Operator, and we've started to brainstorm new deployment patterns
> that could be explored with the Agent, e.g. as DaemonSets or sidecars.
>
> At this point in time, I'm drafting how things could look if
> Prometheus Agent is run as a Pod sidecar, and would love to hear the opinion
> of the community about it. I'm particularly interested to know whether there
> is an appetite from the community for such a deployment pattern and whether
> you see new failure modes with that approach.
>
> Here is the proposal:
>
> Agent Deployment Pattern: Sidecar Injection
>
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#summary>
> Summary
>
> With Prometheus-Operator finally supporting running Prometheus in Agent
> mode, we can start thinking about different deployment patterns that can be
> explored with this minimal container. This document continues the work
> started by the Prometheus Agent design document
> <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md>,
> focusing on how Prometheus-Operator could deploy
> PrometheusAgents as sidecars running alongside the pods that a user wants to
> monitor.
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#background>
> Background
>
> At the time this document was written, Prometheus-Operator could deploy
> Prometheus in Agent mode, but only using a pattern similar to the original
> implementation of Prometheus Server: StatefulSets. The original
> design document
> <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md>
> for Prometheus Agent already mentions that different deployment patterns
> are desired; however, for the sake of speeding up the initial
> implementation, it was decided to re-use the existing logic and start with
> the Agent running as StatefulSets.
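>
> As a quick illustration of that existing pattern, a minimal PrometheusAgent
> CR today looks roughly like the sketch below (the selector and remote-write
> URL are just example values):
> apiVersion: monitoring.coreos.com/v1alpha1
> kind: PrometheusAgent
> metadata:
>   name: example
>   namespace: monitoring
> spec:
>   replicas: 1
>   serviceMonitorSelector: {}      # select every ServiceMonitor in scope
>   remoteWrite:
>   - url: https://example.com/api/v1/write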
>
> Also for the sake of speeding up implementation, this document won't cover
> several new deployment patterns, but only one: Sidecar Injection.
>
> Looking at the traditional deployment model, we have a single Prometheus
> (or an HA setup) per cluster or namespace, responsible for scraping all
> containers under its scope. Prometheus-Operator relies on ServiceMonitor,
> PodMonitor, and Probe CRs to configure Prometheus, which will
> eventually use Kubernetes service discovery to find endpoints that need to
> be scraped.
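>
> For context, a minimal ServiceMonitor in this traditional pattern looks
> roughly like the following sketch (names, labels, and the port are just
> examples):
> apiVersion: monitoring.coreos.com/v1
> kind: ServiceMonitor
> metadata:
>   name: example-app
>   namespace: monitoring
> spec:
>   selector:
>     matchLabels:
>       app: example-app            # match Services carrying this label
>   endpoints:
>   - port: http-metrics            # name of the Service port to scrape
>     path: /metrics
>     interval: 60s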
>
> Depending on the cluster's scale and how often Prometheus hits the
> Kubernetes API, Prometheus service discovery can increase the load on the
> API significantly and affect the overall functionality of the cluster.
>
> Another problem is that one or more containers can be updated to a
> problematic version that causes a Cardinality Spike
> <https://grafana.com/blog/2022/02/15/what-are-cardinality-spikes-and-why-do-they-matter/>.
> Depending on the magnitude of the spike, it is possible that a single
> container could single-handedly crash the monitoring system of the whole
> cluster.
>
> [image: Traditional Deployment Pattern]
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/traditional-deployment-pattern.png>
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#proposal>
> Proposal
>
> This document proposes a new deployment model where Prometheus-Operator
> injects a Prometheus Agent (plus the Prometheus config reloader) as sidecar
> containers into pods that need to be scraped. With a sidecar, we tackle both
> problems mentioned above:
>
>
>    - The service-discovery load on the Kubernetes API disappears, since
>    service discovery is no longer needed. Prometheus will scrape containers
>    from the same pod through their shared network interface, and the scrape
>    configuration can be declared via pod annotations.
>    - A sudden cardinality spike will not affect the whole monitoring
>    system. In the worst case, it will fail a single pod.
>
> A common pattern used with Prometheus's Kubernetes service discovery is
> the usage of annotations to declaratively tell Prometheus which endpoints
> need to be scraped
> <https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/>.
> From a code search on GitHub
> <https://github.com/search?q=prometheus.io%2Fscrape%3A+%22true%22&type=code>
>  for prometheus.io/scrape: "true", we can tell that this approach already
> has good adoption. To avoid conflicting with the already commonly used
> annotations, we can start with our own, using a very similar approach.
> apiVersion: v1
> kind: Pod
> metadata:
>   name: example
>   annotations:
>     prometheus.operator.io/scrape: "true"
>     prometheus.operator.io/path: "/metrics"
>     prometheus.operator.io/port: "8080"
>     prometheus.operator.io/scrape-interval: "60s"
> spec:
> ...
>
> The existing PrometheusAgent CRD would be extended with a new field called
> mode, which can be one of two values (for now): [statefulset, sidecar],
> with statefulset as the default. If mode is set to sidecar,
> Prometheus-Operator won't deploy any Prometheus agents initially. Instead,
> it will watch for Pod updates and inject the Prometheus Agent as a sidecar
> into pods that carry the pre-determined annotations.
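>
> Purely for illustration, an injected Pod might end up looking roughly like
> the sketch below; the container names, images, and arguments are assumptions
> of this sketch, not part of the proposal:
> apiVersion: v1
> kind: Pod
> metadata:
>   name: example
>   annotations:
>     prometheus.operator.io/scrape: "true"
> spec:
>   containers:
>   - name: app                      # the user's original container
>     image: example.com/app:latest
>   - name: prometheus-agent         # injected by the operator
>     image: quay.io/prometheus/prometheus:latest
>     args: ["--enable-feature=agent"]
>   - name: config-reloader          # injected alongside the agent
>     image: quay.io/prometheus-operator/prometheus-config-reloader:latest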
>
> In addition to specifying the deployment model, the Agent CR will be the
> source of truth for the remote-write configuration, such as the URL and
> authentication. A change to the remote-write configuration would still
> require a hot reload of potentially millions of agent sidecar containers,
> but by keeping the remote-write configuration out of pod annotations we at
> least avoid requiring that the Pod manifests be updated as well.
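>
> As a sketch of what that source of truth could hold, a PrometheusAgent CR
> with authenticated remote write might look like this (the Secret name and
> keys are only examples):
> apiVersion: monitoring.coreos.com/v1alpha1
> kind: PrometheusAgent
> metadata:
>   name: agent-example
>   namespace: monitoring
> spec:
>   mode: sidecar
>   remoteWrite:
>   - url: https://example.com/api/v1/write
>     basicAuth:
>       username:
>         name: remote-write-credentials   # example Secret holding credentials
>         key: username
>       password:
>         name: remote-write-credentials
>         key: password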
>
> If different sets of pods require different remote-write configurations,
> then multiple PrometheusAgent CRs are needed. This means that the pod also
> needs to specify which Agent CR will inject the sidecar:
> apiVersion: v1
> kind: Pod
> metadata:
>   name: example
>   annotations:
>     prometheus.operator.io/scrape: "true"
>     prometheus.operator.io/path: "/metrics"
>     prometheus.operator.io/port: "8080"
>     prometheus.operator.io/scrape-interval: "60s"
>     prometheus.operator.io/agent-selector: "monitoring/agent-example"
> spec:
> ...
> ---
> apiVersion: monitoring.coreos.com/v1alpha1
> kind: PrometheusAgent
> metadata:
>   name: agent-example
>   namespace: monitoring
> spec:
>   mode: sidecar
>   remoteWrite:
>   - url: https://example.com
>
> With a visualization:
>
> [image: Sidecar Deployment Pattern]
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/sidecar-deployment-pattern.png>
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#what-to-do-with-servicemonitor-podmonitor-and-probe-selectors>
> What to do with ServiceMonitor, PodMonitor, and Probe selectors?
>
> With the sidecar approach, our goal is to scale Prometheus horizontally
> while avoiding impact on the Kubernetes API. It wouldn't make sense for a
> sidecar to also scrape metrics from other pods.
>
> If mode is set to sidecar, a validating webhook would forbid
> PrometheusAgent CRs from being created or updated with the following
> fields (see the sketch after this list):
>
>    - serviceMonitorSelector
>    - serviceMonitorNamespaceSelector
>    - podMonitorSelector
>    - podMonitorNamespaceSelector
>    - probeSelector
>    - probeNamespaceSelector
>
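> For example, the webhook would reject a manifest like the sketch below,
> because it combines mode: sidecar with a serviceMonitorSelector (names and
> labels are only examples):
> apiVersion: monitoring.coreos.com/v1alpha1
> kind: PrometheusAgent
> metadata:
>   name: invalid-agent
>   namespace: monitoring
> spec:
>   mode: sidecar
>   serviceMonitorSelector:          # rejected: selectors are not allowed in sidecar mode
>     matchLabels:
>       team: example
>   remoteWrite:
>   - url: https://example.com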
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#caveats>
> Caveats
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#config-hot-reload>
> Config Hot Reload
>
> There will now be two ways to change the Prometheus configuration: 1) by
> changing annotations on the pod and 2) by changing the remote-write field
> in the PrometheusAgent CR. The first will only trigger a hot reload for the
> pod involved, but the latter has the potential to trigger millions of hot
> reloads, depending on the scale of the cluster.
>
> While there is no research yet on the config-reloader's efficiency, this
> particular container might become problematic in very large environments.
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#wal-not-optimized-for-small-environments>
> WAL not optimized for small environments
>
> The Prometheus Write-Ahead Log (WAL) is stored as a sequence of numbered
> segment files of 128MiB each by default. This means that, by default, at
> least 128MiB is needed to run the Prometheus Agent, even if we ignore every
> other part of Prometheus. With a sidecar, we're optimizing for horizontal
> scale, and 128MiB might be much more than necessary to store metrics from a
> single Pod.
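>
> The 128MiB figure is only the default segment size. If I'm not mistaken,
> Prometheus exposes a flag to shrink it (--storage.agent.wal-segment-size in
> agent mode; the exact flag name should be verified), so a hypothetical
> injected sidecar could be tuned roughly like this:
>   - name: prometheus-agent
>     image: quay.io/prometheus/prometheus:latest
>     args:
>     - --enable-feature=agent
>     - --storage.agent.wal-segment-size=32MB   # assumption: smaller segments for a single-pod workload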
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#lack-of-high-availability-setup>
> Lack of High-Availability setup
>
> Given that Prometheus is not optimized for very small environments,
> injecting 2 sidecars per Pod sounds like a big waste of resources. However,
> with only 1 sidecar, an HA Prometheus won't be an option.
>
> That said, having an HA Prometheus seems more critical in the traditional
> deployment pattern than in the sidecar approach. That's because when
> Prometheus fails in the former we lose the monitoring stack for the whole
> cluster, while in the latter we just lose metrics from a single pod.
>
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#references>
> References
>
>    - [1] https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md
>    - [2] https://opentelemetry.io/docs/collector/scaling/
>    - [3] https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/
>    - [4] https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/
>
