Re: [prometheus-developers] Introduce the concept of scrape Priority for Targets

Frederic Branczyk Thu, 30 Jul 2020 01:37:17 -0700

That's only effective in limiting the number of targets. The point here is
to selectively scrape those with a higher priority, based on backpressure
of the system as a whole.


On Wed, 22 Jul 2020 at 17:00, Julien Pivotto <roidelapl...@prometheus.io>
wrote:

> On 22 Jul 16:47, Frederic Branczyk wrote:
> > In practice even that can still be problematic. You only know that
> > Prometheus has a problem when everything fails, the point is to keep
> > things alive well enough for more critical components.
> >
> > On Wed, 22 Jul 2020 at 16:38, Julien Pivotto <roidelapl...@prometheus.io>
> > wrote:
> >
> > > On 22 Jul 16:36, Frederic Branczyk wrote:
> > > > It's unclear how that helps, can you help me understand?
> > >
> > > > - job_name: highprio
> > > >   relabel_configs:
> > > >   - target_label: job
> > > >     replacement: pods
> > > >   # Keep only the high-priority targets.
> > > >   - source_labels: [__meta_pod_priority]
> > > >     regex: high
> > > >     action: keep
>
> highprio job will always be scraped.
>
> > > > - job_name: lowprio
> > > >   relabel_configs:
> > > >   - target_label: job
> > > >     replacement: pods
> > > >   # Drop the high-priority targets, keep everything else.
> > > >   - source_labels: [__meta_pod_priority]
> > > >     regex: high
> > > >     action: drop
> > > >   # Only this job is subject to a target limit.
> > > >   target_limit: 1000
> > >
> > > >
> > > > > On Wed, 22 Jul 2020 at 16:34, Julien Pivotto <roidelapl...@prometheus.io>
> > > > > wrote:
> > > >
> > > > > On 22 Jul 16:32, Frederic Branczyk wrote:
> > > > > > > Can you explain what you mean by two jobs? Do you mean two scrape
> > > > > > > configs?
> > > > >
> > > > > Yes.
> > > > >
> > > > > >
> > > > > > > On Wed, 22 Jul 2020 at 11:40, Julien Pivotto <roidelapl...@prometheus.io>
> > > > > > > wrote:
> > > > > >
> > > > > > > On 22 Jul 02:35, Lili Cosic wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Wednesday, 22 July 2020 11:23:00 UTC+2, Brian Brazil wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, 22 Jul 2020 at 10:18, Julien Pivotto <roidel...@prometheus.io> wrote:
> > > > > > > > >
> > > > > > > > >> On 22 Jul 02:14, Lili Cosic wrote:
> > > > > > > > > >> > I only just saw in the docs that I am supposed to start any
> > > > > > > > > >> > discussions here before opening an issue, sorry about that! :)
> > > > > > > > > >> >
> > > > > > > > > >> > Currently there is no way for one target to have a higher scrape
> > > > > > > > > >> > priority than another. Even if you set target limits and sample
> > > > > > > > > >> > limits, you can still overestimate what your setup can handle, and
> > > > > > > > > >> > in that case you would rather keep scraping the higher-priority
> > > > > > > > > >> > targets than have the entire Prometheus fail. This would need to be
> > > > > > > > > >> > based on the inability to ingest into the TSDB at the current scrape
> > > > > > > > > >> > rate: once that is hit, the priority class would take effect and
> > > > > > > > > >> > only the highest-priority targets would be scraped, at the expense
> > > > > > > > > >> > of lower-priority ones. Another option, which might be simpler,
> > > > > > > > > >> > would be a global limit on how much Prometheus can handle, based on
> > > > > > > > > >> > perf testing.
> > > > > > > > > >> >
> > > > > > > > > >> > This would be treated as a last resort, and there would definitely
> > > > > > > > > >> > need to be a high-severity alert to inform the admin that something
> > > > > > > > > >> > went terribly wrong. But because we would still be able to ingest,
> > > > > > > > > >> > for example, Prometheus metrics if they are in a higher priority
> > > > > > > > > >> > class, alerting would still be possible.
> > > > > > > > >>
> > > > > > > > >> Hi,
> > > > > > > > >>
> > > > > > > > > >> I think that limiting the number of targets you scrape is already a
> > > > > > > > > >> last resort. I don't think we would need a second line of defense.
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > > > I agree with Julien here. If you've gotten to this point you're
> > > > > > > > > > already seriously overloaded, and prioritising individual targets is
> > > > > > > > > > just rearranging the deckchairs at that point.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >>
> > > > > > > > > >> You can achieve this priority by setting up 2 jobs, one which is
> > > > > > > > > >> limited and one which is not, and using relabeling to decide which
> > > > > > > > > >> target goes into which job.
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > > > Or more generally, one Prometheus for the important targets and
> > > > > > > > > > another for the less important and riskier targets.
> > > > > > > > >
> > > > > > > >
> > > > > > > > > I get your point completely, Brian, and agree to some degree, but
> > > > > > > > > people are still going to set up a multi-tenant Prometheus, which
> > > > > > > > > then causes the problems I mentioned above. Even within the riskier
> > > > > > > > > targets, some will be more important to users than others. I think
> > > > > > > > > we should still strive to make a single shared Prometheus as safe as
> > > > > > > > > possible. If the answer is not the priority class I suggested, I am
> > > > > > > > > open to other ideas!
> > > > > > >
> > > > > > > Then 2 jobs are the answer, one unlimited and one limited.
> > > > > > >
> > > > > > > > The target_limit is already a pretty advanced use case.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Brian
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >>
> > > > > > > > >> >
> > > > > > > > > >> > We could model this on something like PriorityClass
> > > > > > > > > >> > <https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass>
> > > > > > > > > >> > from Kubernetes, but I am open to other suggestions.
> > > > > > > > >>
> > > > > > > > >> That could be used in relabeling as I said.
> > > > > > > > >>
> > > > > > > > >> >
> > > > > > > > > >> > I am open to other suggestions, or maybe there is something like
> > > > > > > > > >> > this but I missed it. The main purpose is to ensure there are
> > > > > > > > > >> > protection mechanisms in place, so any ideas and suggestions
> > > > > > > > > >> > welcome!
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >> regards,
> > > > > > > > >>
> > > > > > > > >> > Thanks and kind regards,
> > > > > > > > >> > Lili
> > > > > > > > >> >
> > > > > > > > >> > --
> > > > > > > > > >> > You received this message because you are subscribed to the
> > > > > > > > > >> > Google Groups "Prometheus Developers" group.
> > > > > > > > > >> > To unsubscribe from this group and stop receiving emails from it,
> > > > > > > > > >> > send an email to prometheus-developers+unsubscr...@googlegroups.com.
> > > > > > > > > >> > To view this discussion on the web visit
> > > > > > > > > >> > https://groups.google.com/d/msgid/prometheus-developers/30df615e-5420-4bdf-9cb7-2790ef19d520o%40googlegroups.com.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> --
> > > > > > > > >> Julien Pivotto
> > > > > > > > >> @roidelapluie
> > > > > > > > >>
> > > > > > > > >> --
> > > > > > > > >> You received this message because you are subscribed to
> the
> > > Google
> > > > > > > Groups
> > > > > > > > >> "Prometheus Developers" group.
> > > > > > > > >> To unsubscribe from this group and stop receiving emails
> from
> > > it,
> > > > > > > send an
> > > > > > > > >> email to
> prometheus-developers+unsubscr...@googlegroups.com
> > > > > > > <javascript:>
> > > > > > > > >> .
> > > > > > > > >> To view this discussion on the web visit
> > > > > > > > >>
> > > > > > >
> > > > >
> > >
> https://groups.google.com/d/msgid/prometheus-developers/20200722091759.GA140540%40oxygen
> > > > > > > > >> .
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Brian Brazil
> > > > > > > > > www.robustperception.io
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > You received this message because you are subscribed to the
> > > Google
> > > > > > > Groups "Prometheus Developers" group.
> > > > > > > > To unsubscribe from this group and stop receiving emails
> from it,
> > > > > send
> > > > > > > an email to prometheus-developers+unsubscr...@googlegroups.com
> .
> > > > > > > > To view this discussion on the web visit
> > > > > > >
> > > > >
> > >
> https://groups.google.com/d/msgid/prometheus-developers/b0b9e5f7-239a-4cc7-9108-9e6e015a30d6o%40googlegroups.com
> > > > > > > .
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Julien Pivotto
> > > > > > > @roidelapluie
> > > > > > >
> > > > > > > --
> > > > > > > You received this message because you are subscribed to the
> Google
> > > > > Groups
> > > > > > > "Prometheus Developers" group.
> > > > > > > To unsubscribe from this group and stop receiving emails from
> it,
> > > send
> > > > > an
> > > > > > > email to prometheus-developers+unsubscr...@googlegroups.com.
> > > > > > > To view this discussion on the web visit
> > > > > > >
> > > > >
> > >
> https://groups.google.com/d/msgid/prometheus-developers/20200722094024.GA175281%40oxygen
> > > > > > > .
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > You received this message because you are subscribed to the
> Google
> > > > > Groups "Prometheus Developers" group.
> > > > > > To unsubscribe from this group and stop receiving emails from it,
> > > send
> > > > > an email to prometheus-developers+unsubscr...@googlegroups.com.
> > > > > > To view this discussion on the web visit
> > > > >
> > >
> https://groups.google.com/d/msgid/prometheus-developers/CAOs1Umx-uFZFPoeOMA-ev4oN5QoRUyODiCWnSZML3hessHkmBQ%40mail.gmail.com
> > > > > .
> > > > >
> > > > > --
> > > > > Julien Pivotto
> > > > > @roidelapluie
> > > > >
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > Groups "Prometheus Developers" group.
> > > > To unsubscribe from this group and stop receiving emails from it,
> send
> > > an email to prometheus-developers+unsubscr...@googlegroups.com.
> > > > To view this discussion on the web visit
> > >
> https://groups.google.com/d/msgid/prometheus-developers/CAOs1UmzgPKCrpmsDb4v3CrN9Oe%2Bmaka8bosCDuodmjmd-RAyLw%40mail.gmail.com
> > > .
> > >
> > > --
> > > Julien Pivotto
> > > @roidelapluie
> > >
> >
> > --
> > You received this message because you are subscribed to the Google
> Groups "Prometheus Developers" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> an email to prometheus-developers+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/CAOs1UmyxR%3DQ%2B6_emwh12CVwkwemU%2B-tzenvgP1WQ%2BCHnw67UUQ%40mail.gmail.com
> .
>
> --
> Julien Pivotto
> @roidelapluie
>
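
For reference, a minimal sketch of the two-job setup discussed in this
thread, written out as a fuller scrape configuration. It assumes Kubernetes
pod service discovery and a pod label named "priority" (exposed as
__meta_kubernetes_pod_label_priority); neither is stated in the thread,
where __meta_pod_priority is used as shorthand. The limit of 1000 is taken
from the quoted example.

```yaml
scrape_configs:
  # High-priority pods: no target_limit, so this job is never cut off.
  - job_name: highprio
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Assumption: pods carry a "priority" label; keep only priority=high.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: keep
      # Give both jobs the same "job" label value, as in the thread.
      - target_label: job
        replacement: pods

  # Everything else: subject to a target limit.
  - job_name: lowprio
    target_limit: 1000
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Drop the high-priority pods already covered by the job above.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: drop
      - target_label: job
        replacement: pods
```

If the lowprio job ends up with more than 1000 targets after relabeling,
Prometheus marks the targets of that job as failed without scraping them,
while highprio is unaffected. The limit is static, though: it caps target
counts and does not react to backpressure in the system as a whole.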

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/CAOs1UmwjYgxU9ABkATe04febF_010n3%3DKVoEm8J_5XGnf0je%2Bg%40mail.gmail.com.
