[prometheus-users] Re: Replacing a "/" using regex

2020-02-28 Thread Brian Candler
- source_labels: [tube] target_label: tube regex: (.*)/(.*) replacement: $1:$2 However you'll have to repeat this rule N times if you want to be able to replace up to N slashes. -- You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To
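To see why the rule has to be repeated, here is a rough model of it in Python. It assumes the relabel regex is matched against the whole label value (relabel regexes are anchored) and that the first group is greedy, so a single pass only rewrites the last slash. The function names are made up for illustration; this is not Prometheus code.

```python
import re

# Relabel regexes are anchored, so (.*)/(.*) has to match the whole value.
# The first group is greedy, so one pass rewrites only the LAST "/".
RULE = re.compile(r"(.*)/(.*)")


def apply_rule(value: str) -> str:
    """One pass of the rule: replacement $1:$2, or unchanged if no match."""
    m = RULE.fullmatch(value)
    if m is None:
        return value
    return f"{m.group(1)}:{m.group(2)}"


def apply_rule_n_times(value: str, n: int) -> str:
    """Model N copies of the same rule listed one after another."""
    for _ in range(n):
        value = apply_rule(value)
    return value


one_pass = apply_rule("a/b/c")                # only the last slash changes
two_passes = apply_rule_n_times("a/b/c", 2)   # both slashes replaced
```

A value with more slashes than there are copies of the rule keeps its leading slashes, which is the limitation described above.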

[prometheus-users] Re: Prometheus memory consumption keeps increasing

2020-02-27 Thread Brian Candler
7 million timeseries is a very large number. Is it what you're expecting? If not, then you should address this first. Maybe you have an exporter which has a very high cardinality label which shouldn't be there. You can get some clues from the prometheus GUI under Status > Runtime & Build

Re: [prometheus-users] Prometheus Datadog integration

2020-02-29 Thread Brian Candler
Maybe you've misunderstood scraping and federation. Prometheus scrapes are "pulls". When a prometheus scrape job runs, it makes an outbound HTTP connection to a remote server (an "exporter") and reads back metrics from it. So in your example here: - job_name: 'prom'

[prometheus-users] Re: Textfile Collector reading only 1 prom file

2020-02-29 Thread Brian Candler
I can replicate your problem here, when creating 1.prom and 2.prom. However, if I concatenate the two files into one, it works; I also note that the metrics with the same metric name are grouped together under the same heading (even though they weren't adjacent in the source). # HELP

[prometheus-users] Re: What is the best exporter to monitor logs pattern

2020-03-04 Thread Brian Candler
mtail is an alternative to grok_exporter. There is also promtail, the frontend component of loki, which can expose prometheus metrics: https://github.com/grafana/loki/blob/master/docs/clients/promtail/stages/metrics.md Outside of the prometheus ecosystem, there are applications which are

[prometheus-users] Re: Querying VIA Prometheus API

2020-03-04 Thread Brian Candler
It's a bad idea to query all metrics, as it will touch all timeseries and create a huge amount of work in I/O and memory usage. But you can, and on a small test system it will probably be OK: the query is {__name__=~".+"} On a production system: you always want to ensure you're not touching
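For anyone doing this from a script, the selector needs URL-encoding before it goes into the query string. A minimal sketch using only the standard library; the base URL is an assumption (a prometheus server on 127.0.0.1:9090) and the code only builds the URL, it does not send the request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url: str, promql: str) -> str:
    """Return an instant-query URL (/api/v1/query) for the given expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


# Assumed server address; fine on a small test system, as noted above.
url = build_query_url("http://127.0.0.1:9090", '{__name__=~".+"}')

# Decoding the query string again gives back the original selector.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```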

[prometheus-users] Re: Exposing some custom metrics to Prometheus

2020-03-04 Thread Brian Candler
Yes: setting (e.g.) --collector.textfile.directory=/var/lib/node_exporter, and then creating files like /var/lib/node_exporter/foo.prom containing metrics in standard prometheus exposition format, will do what you want. There are also metrics exposed which give you the timestamp of each file,
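A sketch of the writing side, assuming your job can run a bit of Python. It writes to a temporary file in the same directory and renames it, so the collector never sees a half-written file, and it makes the file world-readable because node_exporter often runs as a different user. The metric names are invented for the example.

```python
import os
import tempfile


def write_prom_file(directory: str, name: str, metrics: dict) -> str:
    """Write 'metric value' lines to <directory>/<name>.prom via temp file + rename."""
    lines = [f"{metric} {value}\n" for metric, value in sorted(metrics.items())]
    final_path = os.path.join(directory, f"{name}.prom")
    # Only *.prom files are read, so the .tmp file is ignored while it is written.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.writelines(lines)
    os.chmod(tmp_path, 0o644)          # mkstemp creates the file as 0600
    os.replace(tmp_path, final_path)   # atomic when both are on one filesystem
    return final_path
```

Usage would be something like write_prom_file("/var/lib/node_exporter", "foo", {"demo_jobs_total": 3}) at the end of the job.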

[prometheus-users] Re: keep metric using metric_relabel_configs not working

2020-03-02 Thread Brian Candler
On Monday, 2 March 2020 06:45:56 UTC, Ankita Khot wrote: > > In Prometheus UI within the dropdown that says 'insert metric at cursor' > it shows the previous scraped metrics as well. How to remove those? I would > like it to show only the metrics which i am keeping through >

Re: [prometheus-users] Prometheus Datadog integration

2020-03-02 Thread Brian Candler
You can't get prometheus to connect to a remote system to "push" metrics [except using the remote_write functionality, which is a specific protocol the remote system needs to support]. However, other servers can scrape or "pull" metrics from prometheus, by scraping the /federate endpoint. That

[prometheus-users] Re: Giving your own MIB file before generating the snmp.yml

2020-03-03 Thread Brian Candler
On Tuesday, 3 March 2020 11:30:11 UTC, Yagyansh S. Kumar wrote: > > I understood that it is reading the MIB files from the mibs directory. I > even copied my MIB file in that directory and set the environment variable, > but it seems that in the generator.yml file some modules are predefined >

[prometheus-users] Re: Prometheus memory consumption keeps increasing

2020-02-27 Thread Brian Candler
If you have a high churn rate on pods/containers then you probably don't want to be generating time series for each one. If you can identify the troublesome ones, e.g. if it's only pods with specific labels that are churning, then you can use relabelling to avoid scraping them.

Re: [prometheus-users] Task metrics

2020-02-27 Thread Brian Candler
Would statsd_exporter be sufficient? (e.g. at the end of each job add 1 to counters for jobs run/success/failed as appropriate; add the time spent to a cumulative total time etc)
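Roughly what the end-of-job step could look like, as an untested sketch. It assumes statsd_exporter is listening for StatsD datagrams over UDP on 127.0.0.1:9125 (check the address your instance actually uses), and the counter names are invented.

```python
import socket


def format_counter(name: str, amount: int = 1) -> bytes:
    """Build one StatsD counter line: 'name:amount|c'."""
    return f"{name}:{amount}|c".encode("ascii")


def send_counter(addr: tuple, name: str, amount: int = 1) -> None:
    """Send a single counter increment as a UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, amount), addr)


# At the end of a job (hypothetical names, assumed address):
#   send_counter(("127.0.0.1", 9125), "jobs_run")
#   send_counter(("127.0.0.1", 9125), "jobs_failed")
#   send_counter(("127.0.0.1", 9125), "jobs_seconds_total", 42)
```

Because the exporter keeps the running totals, the job itself can exit straight away.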

[prometheus-users] Re: keep metric using metric_relabel_configs not working

2020-02-28 Thread Brian Candler
> Is there a way to filter the metrics at the source, so that while using the curl command for metric endpoint we will be able to see only the required metrics. That would be something you configure in whatever exporter you are using. For example, node_exporter has flags to turn various

[prometheus-users] Re: Prometheus blackbox-exporter ICMP ping results for all targets are showing up

2020-02-28 Thread Brian Candler
There are two different metrics you are looking at. (1) "up" says whether the scrape was successful - i.e. prometheus was able to communicate with blackbox_exporter and read the response (2) "probe_success" is one of the metrics returned from blackbox_exporter, saying whether it was able to

[prometheus-users] Re: Textfile Collector reading only 1 prom file

2020-02-28 Thread Brian Candler
Look at the stderr output from node_exporter. My guess is that one of the metrics is in an invalid format; if so, textfile_collector will report and abandon the rest of the file (maybe the rest of the directory - I haven't tested this). Another possibility is permissions on the files. You may

[prometheus-users] Re: Prometheus query with conditional operation

2020-02-27 Thread Brian Candler
It sounds like you need a histogram: https://prometheus.io/docs/concepts/metric_types/#histogram https://prometheus.io/docs/practices/histograms/ What this means is you generate separate buckets for different latencies (maybe: <0.1s, <0.2s, <0.5s, <1s, <2s, other) and increment the counts for
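A toy model of the bucket counting, using the example boundaries above. One assumption beyond the text: Prometheus histogram buckets are cumulative, so an observation is counted in every bucket whose upper bound is at or above it, and the "other" bucket becomes +Inf. The latencies are invented.

```python
import math

# Upper bounds in seconds; the last bucket catches everything else.
BOUNDS = [0.1, 0.2, 0.5, 1.0, 2.0, math.inf]


def observe(counts: list, latency: float) -> None:
    """Increment every bucket whose upper bound is >= the observed latency."""
    for i, bound in enumerate(BOUNDS):
        if latency <= bound:
            counts[i] += 1


counts = [0] * len(BOUNDS)
for latency in (0.05, 0.3, 0.3, 1.5, 7.0):
    observe(counts, latency)

# Fraction of requests that completed within 0.5s.
within_half_second = counts[BOUNDS.index(0.5)] / counts[-1]
```

With cumulative buckets, "what fraction finished within X" is just one bucket count divided by the total.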

[prometheus-users] Re: Prometheus targets

2020-02-27 Thread Brian Candler
On Thursday, 27 February 2020 18:40:38 UTC, adi garg wrote: > > Thanks a lot, Brian, that clears a lot of things for me. Just one more > doubt is there a way to look at the current scraped metrics as I am not > sure but I think that Prometheus stores metrics in 2 hours chunk in the >

[prometheus-users] Re: keep metric using metric_relabel_configs not working

2020-02-28 Thread Brian Candler
If a pre-existing metric just vanishes, the most recent data point hangs around for a while. https://www.robustperception.io/staleness-and-promql

[prometheus-users] Re: How to relabel instance names

2020-03-05 Thread Brian Candler
On Thursday, 5 March 2020 18:30:54 UTC, Danny de Waard wrote: > > If i reads the article correct i should be able to use this relabel: > - source_labels: [__address__] > regex: '(.+)/(.+)' # name/address > target_label: __address__ > replacement: '${2}' > and fill in

[prometheus-users] Re: Getting hostnames along with IPs in Alerts.

2020-03-05 Thread Brian Candler
On Thursday, 5 March 2020 19:23:51 UTC, Yagyansh S. Kumar wrote: > > Found my stupid mistake. It works now. Thanks! > I have another query though. This maybe just a ridiculous question, but is > there any way I can get the hostnames for the targets I configured as ICMP > ping targets for

[prometheus-users] Re: Snmp exporter - value preprocessing

2020-03-06 Thread Brian Candler
It's usual to keep the raw data as returned by the exporter. If the two manufacturers are using the *same* OID from the same standard MIB, but using different units, then one of them is doing it wrong. If the two manufacturers are returning different OIDs in different vendor-specific MIBs,

Re: [prometheus-users] disable local storage with Prometheus 2

2020-02-25 Thread Brian Candler
If you want a completely dumb scraper that doesn't store anything locally, VictoriaMetrics just released vmagent which does exactly that: https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md

Re: [prometheus-users] Prometheus Datadog integration

2020-03-03 Thread Brian Candler
On Tuesday, 3 March 2020 08:52:15 UTC, adi garg wrote: > > One more doubt -> /metrics endpoint only shows metrics of Prometheus > server or the metrics of the whole system on which Prometheus is running. > The /metrics endpoint, on the prometheus server itself, shows only internal metrics about

[prometheus-users] Re: samples_per_second

2020-03-03 Thread Brian Candler
It is controlled by the prometheus server: it's whatever scrape_interval you set on the job which scrapes node_exporter. You can do a query to find out: e.g. count_over_time(up[5m]) will show the number of scrapes or scrape attempts in the last 5 minutes.

Re: [prometheus-users] Prometheus Datadog integration

2020-03-03 Thread Brian Candler
Note: with curl, the -g flag disables curl's own "globbing" of special characters, so this simplifies to: curl -g '127.0.0.1:9090/federate?match[]={job="node"}'

[prometheus-users] Re: How to aggregate data from different machines?

2020-03-03 Thread Brian Candler
On Tuesday, 3 March 2020 09:49:08 UTC, luv - wrote: > > I use Prometheus to scrape the machine. There are about 500 service > machines, and each machine generates about 3500 sample data. > 500 machines each generating 3500 different metrics is 1,750,000 timeseries. > When I use the

[prometheus-users] Re: Giving your own MIB file before generating the snmp.yml

2020-03-03 Thread Brian Candler
The generator does not "create its own set of MIB files" - it only reads in MIB files. You can set the environment variable MIBDIRS to tell it what directories you want it to look in:

Re: [prometheus-users] Re: Prometheus memory consumption keeps increasing

2020-02-27 Thread Brian Candler
On Thursday, 27 February 2020 16:21:15 UTC, Shadi Abdelfatah wrote: > > What metric can I use to get the values listed on /status under "Highest > Cardinality Labels " ? I want to setup an alarm on this but I'm not sure > how to get that value > See

[prometheus-users] Re: Prometheus targets

2020-02-27 Thread Brian Candler
On Thursday, 27 February 2020 17:53:41 UTC, adi garg wrote: > > I was able to select my hs2 metrics there for querying, but I couldn't see > them on localhost:9090/metrics. Is there any reason for it. > Yes. /metrics on the prometheus server only exposes metrics about the operation of the

[prometheus-users] Re: Prometheus federation vs remote_read

2020-02-18 Thread Brian Candler
OK - that's an unusual requirement, since you can build your scrape config from a script. I don't think your solution 3 will work. Remote_read sends queries to the remote_engine when handling promQL queries, but I don't think it ingests into the local TSDB.

[prometheus-users] Re: Prometheus federation vs remote_read

2020-02-18 Thread Brian Candler
It depends what you mean by "single endpoint". You can install exporter_exporter: you'll get a single endpoint listening on a single port, which proxies to the various backend exporters. This means you only need to open one firewall port. However you'll still have to scrape it multiple times,

[prometheus-users] Re: Data getting corrpted - missing meta.json

2020-02-19 Thread Brian Candler
On Wednesday, 19 February 2020 03:16:44 UTC, Guru SD wrote: > > We have a clustered Prometheus setup but with a common storage disk. > Please can you be more specific about what you mean by "common storage disk" and how it is configured. Do you have multiple prometheus servers accessing the

[prometheus-users] Re: Facing read: connection reset by peer error

2020-02-19 Thread Brian Candler
Is it http or https? What do the following commands show? curl -v http://:9080/metrics curl -vk https://:9080/metrics If it's https then you'll need "scheme: https". If it's http then you don't need "tls_config".

[prometheus-users] Re: Alertmanager webhook fails with 509 cert error

2020-02-19 Thread Brian Candler
https://prometheus.io/docs/alerting/configuration/#tls_config which goes under https://prometheus.io/docs/alerting/configuration/#http_config which goes under https://prometheus.io/docs/alerting/configuration/#webhook_config Either point ca_file to a copy of the certificate used to sign the

[prometheus-users] Re: Configure scrape job with params: name based on fqdn / md5 / obfuscating function.

2020-02-19 Thread Brian Candler
> Hi, I'm hitting some Netdata exporters and I need to identify the "name" of my prometheus server on the http requests that collects the metrics. The prometheus server can add extra labels as part of the scrape. Some of the service discovery methods have the ability to add labels (e.g.

Re: [prometheus-users] Re: How do you deal with rarely updated values

2020-02-20 Thread Brian Candler
I think a gauge is the right thing here: it represents the amount of time taken to rebuild the last configuration. You can just scrape this periodically (say every minute), and easily see if it goes up or down over time. Of course, most of the time you'll be scraping the same value - but

Re: [prometheus-users] Re: sometimes I just received a resolved email but not firing email

2020-02-16 Thread Brian Candler
On 16/02/2020 09:52, bryan wrote: okay, I checked the alertmanager's log, for example, see below: level=debug ts=2020-02-16T00:41:46.642829251Z caller=dispatch.go:104 component=dispatcher msg="Received alert" alert=Watchdog[e1749c6][active] level=debug ts=2020-02-16T00:41:49.181048151Z

[prometheus-users] Re: sometimes I just received a resolved email but not firing email

2020-02-16 Thread Brian Candler
If I understand correctly, prometheus doesn't send any "resolved" message to alertmanager: it just stops sending alerts. Alertmanager treats the lack of alert as meaning "resolved". Therefore, if you receive the "resolved" message, then this proves that alertmanager must have received the

Re: [prometheus-users] Re: sometimes I just received a resolved email but not firing email

2020-02-16 Thread Brian Candler
Correction: I've just tried this again, and if I shut down the SMTP server, I *do* see failed SMTP attempts from alertmanager generating logs, at increasing retry intervals: Feb 16 12:18:41 prometheus alertmanager[1772]: level=debug ts=2020-02-16T12:18:41.928Z caller=dispatch.go:465

Re: [prometheus-users] Re: sometimes I just received a resolved email but not firing email

2020-02-16 Thread Brian Candler
On 16/02/2020 10:09, bryan wrote: > yes, I'm running an alertmanager cluster, and I have turned on prometheus "debug" level logging, but nothing could be found, for details: > Have you set --log.level=debug on the alertmanager processes as well? I see the following in my (non-clustered) test

Re: [prometheus-users] AWS cross region Node Exporter monitoring issue

2020-02-20 Thread Brian Candler
Grafana can query multiple prometheus data sources: https://www.robustperception.io/switching-between-prometheus-servers-in-grafana-using-data-source-variables That requires the user to select the required data source from a drop-down. Alternatively, you can also put promxy in front of the

[prometheus-users] Re: Sum distinct values

2020-03-11 Thread Brian Candler
You can write your own code which queries the prometheus API and does whatever you like with the data. PromQL can't do what you're asking, because in general it is not sensible for processing timeseries. The *definition* of a timeseries is a collection of values with a unique set of labels.

[prometheus-users] Re: How to hide auth password writing in alertmanager for alerting

2020-03-12 Thread Brian Candler
On Thursday, 12 March 2020 10:14:56 UTC, mohd wrote: > > The one who has direct access to the filesystem of the docker container. > You can't. You can have an encrypted filesystem, but your Ubuntu user will be able to read it as well. You could: - keep the config encrypted in a gpg file -

[prometheus-users] Re: Best method to attach custom collector to Node-exporter

2020-03-12 Thread Brian Candler
On Thursday, 12 March 2020 10:11:54 UTC, Ankit Rohilla wrote: > > I can use the existing node-exporter image as the base image for my > dockerfile. But where will I copy my additional collectors code? > You would copy it into the new docker image that you are building from the base image. Look

[prometheus-users] Re: Best method to attach custom collector to Node-exporter

2020-03-12 Thread Brian Candler
You can build your own docker image with a Dockerfile that uses the existing node_exporter image as a base, and just add the extra files you want. That sounds like what you want. You can also build an image from scratch which extracts specific files from another docker image (e.g. just the

Re: [prometheus-users] HTTP API request for querying multiple metrics

2020-03-12 Thread Brian Candler
On Wednesday, 11 March 2020 22:38:11 UTC, Hakim Kahlouche wrote: > > Thanks for your reply. > > My question is about the PromQL calculation on metrics. > > Let's say I want to query the following: > > curl ' > http://192.168.56.103:9090/api/v1/query?query=go_memstats_gc_cpu_fraction' > curl

[prometheus-users] Re: "All" value in any variable is creating issue while using Alertmanager as a datasource in Grafana.

2020-03-12 Thread Brian Candler
If you are using multi-select in Grafana, you have to write your label match as *cluster=~"$cluster"*

[prometheus-users] Re: How to hide auth password writing in alertmanager for alerting

2020-03-12 Thread Brian Candler
Hide it from whom? Someone who uses the alertmanager web interface? Someone who has direct access to the filesystem of the docker container?

Re: [prometheus-users] HTTP API request for querying multiple metrics

2020-03-12 Thread Brian Candler
The /federate endpoint *is* part of the HTTP API. Just think of it as another query interface. https://prometheus.io/docs/prometheus/latest/federation/#configuring-federation (It just happens also to be how one prometheus server queries another, but it's not limited to that)

[prometheus-users] Re: query AlertManager

2020-03-10 Thread Brian Candler
I'm not sure exactly what you're trying to do with grafana, but I use karma as an alerting dashboard and it does a good job of showing grouped alerts, as well as making a view of multiple alertmanagers in different data centres and being able to push out global silences.

[prometheus-users] Re: Blackbox are not recognizing another modules than not is http_2xx

2020-03-06 Thread Brian Candler
In your relabeling, you also need to copy label "module" to "__param_module" (otherwise it won't get sent to the exporter)

Re: [prometheus-users] Re: Blackbox are not recognizing another modules than not is http_2xx

2020-03-06 Thread Brian Candler
On Friday, 6 March 2020 18:56:08 UTC, Ricardo Estalder wrote: > > - source_labels: [module, __address__] > target_label: __param_target > replacement: $1 > action: replace > Not sure what's going on: there's no need to join labels together, and then later on you

[prometheus-users] Re: Getting hostnames along with IPs in Alerts.

2020-03-05 Thread Brian Candler
On Thursday, 5 March 2020 16:13:33 UTC, Brian Candler wrote: > > The second approach, specifically for your problem of having the hostnames > in alerts, is to put a meaningful name in the "instance" label, rather than > the IP address. This approach is described here: >

[prometheus-users] Re: Getting hostnames along with IPs in Alerts.

2020-03-05 Thread Brian Candler
Yes. There are two basic approaches. The first (and more complex) is to join node_uname_info to pick extra labels to add to your alert. Example (untested): - alert: DiskFull expr: | (node_filesystem_avail_bytes < 1 and node_filesystem_size_bytes > 1) * on

[prometheus-users] Re: How to relabel instance names

2020-03-05 Thread Brian Candler
On Thursday, 5 March 2020 17:20:01 UTC, Danny de Waard wrote: > > I have been reading all across the internet for a way to relabel my > instance name. > Now all my dashboard have long instance names like lsrv1.server.nl:9100 > > I would like to have that reduced to lsrv1 > The blog post at

[prometheus-users] Re: Questions about security authentication in Prometheus

2020-03-07 Thread Brian Candler
If this is non-TLS, you can trace the actual http exchange with tcpdump. tcpdump -i eth0 -nn -s0 -A tcp port 12345 Replace eth0 with external interface (or lo if you are scraping 127.0.0.1), and 12345 with exporter port number. The IP and TCP headers will show as garbage, but you should see

[prometheus-users] Re: Prometheus alerting rules test for counters

2020-03-09 Thread Brian Candler
The great thing about prometheus alerting rules is you can just enter them into the GUI as normal queries. If the graph is blank, there's no alert. If it's non-blank (i.e. there are timeseries visible) then these are the timeseries which would trigger an alert. This makes them easy to debug.

[prometheus-users] Re: Prometheus alerting rules test for counters

2020-03-09 Thread Brian Candler
BTW, I think that rule would be more robust against missing values by using expr: increase(metric_name[15m]) == 0 instead of using "for:". If you use "for:" then the condition must be true for every single evaluation, and a single missed sample may reset the alert.

Re: [prometheus-users] [ALERTMANAGER][ERROR] err="Post : x509: certificate signed by unknown authority"

2020-03-09 Thread Brian Candler
On Monday, 9 March 2020 09:12:41 UTC, BDT wrote: > > level=debug ts=2020-03-09T08:46:12.143Z caller=notify.go:667 > component=dispatcher msg="Notify attempt failed" attempt=1 > integration=slack receiver=slack_general err="Post : x509: > certificate signed by unknown authority" > > > The bit

Re: [prometheus-users] [ALERTMANAGER][ERROR] err="Post : x509: certificate signed by unknown authority"

2020-03-09 Thread Brian Candler
On Monday, 9 March 2020 14:59:45 UTC, BDT wrote: > > The doc of alertmanager: > It is here: https://prometheus.io/docs/alerting/configuration/ At the top level is an item called "receivers" Under this is a list of items of type

Re: [prometheus-users] Re: Prometheus alerting rules test for counters

2020-03-09 Thread Brian Candler
On Monday, 9 March 2020 14:47:28 UTC, Debashish Ghosh wrote: > > Thanks brian .. that answers most of my questions ... Regarding using > [15m] in the increase we have purposely kept it [2m] that runs for 15 > minutes since we are really tracking something continuously to be true all > the time

[prometheus-users] Re: NOT All the Sharded Prometheus scrape targets successfully

2020-03-14 Thread Brian Candler
It's a sharded config. That means each target is only scraped by one of the three nodes (or put another way: each node only scrapes one third of the targets given). All three nodes have the same config with one node: - job_name: 'node_exporter' # metrics_path defaults to '/metrics'

[prometheus-users] Re: Sum distinct values

2020-03-10 Thread Brian Candler
Can you share the raw metrics, i.e. vmware_vm_guest_disk_capacity{vm_name="xxx.com"} and then explain what you're trying to extract from them?

[prometheus-users] Re: IPv6 Support

2020-03-11 Thread Brian Candler
All works over IPv6 for me. With blackbox_exporter, if you give DNS names for targets, you have the option whether to prefer v4 over v6, or only use v4 or only use v6, when probing an endpoint. > Is it possible to deactivate IPv4? You can bind the listener so it only accepts connections on

Re: [prometheus-users] Best method to attach custom collector to Node-exporter

2020-03-15 Thread Brian Candler
Ah sorry, I thought OP was using code which writes files for the textfile-collector.

[prometheus-users] Re: can I dynamically set alert expression and not have predefined or hardcoded rules

2020-04-08 Thread Brian Candler
Like I said: if you want to create rules dynamically, then write out the rules and drop them into a file which prometheus will pick up. But I suspect you ought to gain some familiarity with writing alerting rules by hand first. I have one more question, what type of expressions I can put in

[prometheus-users] Re: How do you name the results of a query?

2020-04-09 Thread Brian Candler
Sure: some PromQL functions return no metric name and/or no labels. That's how those functions operate. For example, sum(metric) doesn't return any labels, because it's summing together all the timeseries with that particular metric. Each individual metric, by definition, has a different set

[prometheus-users] Re: Label Multiple Instances Name to Static Name

2020-04-10 Thread Brian Candler
Use file_sd. It's the same as static_configs (i.e. list of targets+labels), but bundled into a separate file, which makes it easier to edit or generate. Also, prometheus picks up changes automatically without a HUP. It's true though that each group of targets will need to be configured with

[prometheus-users] Re: How to get rid of extra url characters at the end of prometheus targets url

2020-04-10 Thread Brian Candler
%3A is the URL-encoding for a colon (ASCII code hex 3A = decimal 58) So your URL includes "target=10.146.x.x:9116", encoded because colon is a character with special significance in URLs. And in turn, I would guess that your
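You can check the encoding for yourself with the Python standard library. A small sketch; the target string is just the masked value from the URL above.

```python
from urllib.parse import quote, unquote

target = "10.146.x.x:9116"

# safe="" means every reserved character is percent-encoded, including ":".
encoded = quote(target, safe="")
decoded = unquote(encoded)

# The two hex digits after "%" are the character's ASCII code.
colon_code = ord(":")
```

So the %3A in the targets page is the same colon you configured, not an extra character.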

[prometheus-users] Re: Making prometheus configurable based on instance type

2020-04-10 Thread Brian Candler
You'll need to measure it for your usage case, and it changes from release to release, but you can get a first estimate of the lower bound here: https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion

[prometheus-users] Re: Exporter to collect metrics from telegraf and use alertmanager to alert using that metrics.(VMware exporter not working)

2020-04-09 Thread Brian Candler
Sorry, you're on the wrong mailing list. Questions about the TICK stack (Telegraf, Influxdb, Chronograf, Kapacitor) should be directed at: https://community.influxdata.com/

[prometheus-users] Re: relabel for scrapping port issue

2020-04-09 Thread Brian Candler
- source_labels: [__meta_kubernetes_pod_container_port_name] # try to keep the port only if the name is metrics action: keep regex: metrics will drop any target that doesn't have that label with that value. If you want to keep targets that *don't* have that label, then allow empty

[prometheus-users] Re: Pattern(s) for having a counter in short-lived applications (batch jobs)

2020-04-15 Thread Brian Candler
pushgateway doesn't aggregate counters, but statsd_exporter does. This sounds more like what you need.

[prometheus-users] Re: Supervise http/https Form authentication

2020-04-15 Thread Brian Candler
Yes: increment a counter whenever there's a failed login. Scrape that counter from prometheus. You can either instrument your web application directly with one of the many prometheus client libraries out there; or you can use statsd_exporter which will maintain the counters for you (and you

[prometheus-users] Re: Help Required for Prometheus config || multiple Labels Segregation on the basis of multiple targets in single single job name

2020-04-18 Thread Brian Candler
Yes that's correct, but you'll find it easier to use file_sd_configs. The YAML is basically the same: - targets: - ... - ... labels: ...: ... but by putting it in a separate file, you don't need to touch prometheus.yml. Also, prometheus automatically picks up when file_sd files
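If the target list comes out of an inventory or a script, the file can be generated. A minimal sketch, assuming the JSON form of a file_sd file (a list of groups, each with "targets" and optional "labels"); the hostnames and labels are invented.

```python
import json


def render_file_sd(groups: list) -> str:
    """Render target groups as the JSON form of a file_sd file."""
    for group in groups:
        if not isinstance(group.get("targets"), list):
            raise ValueError("each group needs a 'targets' list")
    return json.dumps(groups, indent=2) + "\n"


# Hypothetical hosts and labels, for illustration only.
groups = [
    {"targets": ["host1.example.com:9100", "host2.example.com:9100"],
     "labels": {"env": "prod"}},
    {"targets": ["host3.example.com:9100"],
     "labels": {"env": "test"}},
]
document = render_file_sd(groups)
```

Write the result to a path matched by the job's file_sd_configs; each group carries its own labels, which gives you the per-target segregation you asked about.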

Re: [prometheus-users] Discrepancy in Resolved Alerts.

2020-04-18 Thread Brian Candler
I can see two possible issues here. Firstly, the value of the annotation you see in the resolved message is always the value at the time *before* the alert resolved, not the value which is now below the threshold. Let me simplify your expression to: foo > 85 This is a PromQL filter. In

Re: [prometheus-users] how to troubleshoot alert rule not loading

2020-04-18 Thread Brian Candler
On Saturday, 18 April 2020 01:15:12 UTC+1, Augustin Husson wrote: > > check that the alertRule file is in the same directory than the prometheus > binary. > > If you use relative paths, I believe they are relative to the directory containing the prometheus.yml file (not the directory containing

Re: [prometheus-users] Discrepancy in Resolved Alerts.

2020-04-18 Thread Brian Candler
Because it's the presence of a value which triggers an alert, and the absence of a value which means the end of an alert.
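A toy model of that, taking an expression of the form foo > 85 as the example. This is only an illustration of the filtering idea, not how prometheus is implemented; the series labels and values are invented.

```python
def greater_than(samples: dict, threshold: float) -> dict:
    """Model `foo > threshold`: keep only the series whose value exceeds it."""
    return {labels: value for labels, value in samples.items() if value > threshold}


def firing(result: dict) -> set:
    """Every series still present in the filtered result is a firing alert."""
    return set(result)


# First evaluation: one series is above the threshold, so one alert fires.
first = greater_than({"instance=a": 91.0, "instance=b": 40.0}, 85)

# Later evaluation: the value dropped to 80, so the series is filtered out.
# The result is empty, and there is no "80" left to put in an annotation.
later = greater_than({"instance=a": 80.0, "instance=b": 40.0}, 85)
```

That is why the resolved notification can only carry the last value seen while the alert was still firing.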

[prometheus-users] Re: blackbox exporter: ICMP probes fails continually after some short DNS outages, until manual restart of the blackbox-exporter container

2020-04-07 Thread Brian Candler
15:28:48.734661 IP 172.17.0.5 > 8.8.8.8: ICMP echo request, id 33313, seq 41979, length 36 0x: 4500 0038 f40e 4000 4001 8a90 ac11 0005 E..8..@.@... 0x0010: 0808 0808 0800 7648 8221 a3fb 5072 6f6d ..vH.!..Prom 0x0020: 6574 6865 7573 2042 6c61 636b 626f

[prometheus-users] Re: blackbox exporter: ICMP probes fails continually after some short DNS outages, until manual restart of the blackbox-exporter container

2020-04-07 Thread Brian Candler
On Tuesday, 7 April 2020 15:53:24 UTC+1, Tomáš Bartek wrote: > > > sudo tcpdump -i ens160 -n -X icmp and host 8.8.8.8 > tcpdump: verbose output suppressed, use -v or -vv for full protocol decode > listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes > 15:51:57.237795 IP

[prometheus-users] Re: node_processes_* metrics is missing in 0.18.1

2020-04-07 Thread Brian Candler
Then the exporter doesn't appear to give separate metrics for pids broken down by user. You could write some code to do this yourself (e.g. with "ps") and write out a file to be picked up by textfile_collector.

[prometheus-users] Re: node_processes_* metrics is missing in 0.18.1

2020-04-07 Thread Brian Candler
Show an example? If there is a label giving the username/userid, you can filter the query with promQL.

[prometheus-users] Re: node_network_transmit_bytes_total

2020-04-07 Thread Brian Candler
That's an average over 60 seconds. How long did the wget transfer take (or what total size was it?) Also: if you're downloading with wget, you probably want node_network_receive_bytes_total. The amount of data transmitted will be mainly the ACKs.

[prometheus-users] Re: node_processes_* metrics is missing in 0.18.1

2020-04-07 Thread Brian Candler
It's a disabled-by-default exporter: https://github.com/prometheus/node_exporter/#disabled-by-default Yes, you have to enable it with a command line flag. https://github.com/prometheus/node_exporter/#collectors Collectors are enabled by providing a --collector.<name> flag. Collectors that are enabled

[prometheus-users] Re: blackbox exporter: ICMP probes fails continually after some short DNS outages, until manual restart of the blackbox-exporter container

2020-04-07 Thread Brian Candler
In your first example, which shows a failing blackbox_exporter ping in hex: can you show this both when it's working and after it fails? Might show something different in the packets being sent out. And in your second example (ping from the command line working, whilst blackbox_exporter ping

[prometheus-users] Re: Start recording rule until N mins of data is collected

2020-04-07 Thread Brian Candler
> What happens when I do rate(metricsX[5m]) but only 1 min of data is present? rate() takes the first and last point within the given time window, and calculates the rate between them. So: as long as there are at least two data points in the window(*), you'll get a result. If not, you'll get
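The first-and-last-point description can be sketched as below. This is a deliberately simplified model: the real rate() also handles counter resets and extrapolates towards the edges of the window, both of which are left out here. The function name is made up for illustration.

```python
def simple_rate(samples):
    """Per-second rate from the first and last sample in a window.

    `samples` is a list of (timestamp_seconds, value) pairs that fall
    inside the window, oldest first. Simplification: no counter-reset
    handling and no extrapolation, unlike the real rate().
    """
    if len(samples) < 2:
        return None  # fewer than two points in the window: no result
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)
```

So with a [5m] window but only one minute of data, two samples 60 seconds apart are still enough: `simple_rate([(0, 100), (60, 160)])` works out as (160 - 100) / 60.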

[prometheus-users] Re: can I dynamically set alert expression and not have predefined or hardcoded rules

2020-04-08 Thread Brian Candler
Yes: write a script to write the new alerting rules to a file. Prometheus can be configured to pick up all the files in a directory, e.g. rule_files: - "rules/*.yml" Note: you will need to signal (e.g. "killall -HUP
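A rough sketch of such a script, assuming the rules are held as a list of dicts. The function name and the dict keys are illustrative only; the output follows the usual `groups:` / `rules:` layout of a rule file. String values are quoted with `json.dumps`, relying on JSON strings being valid YAML double-quoted scalars.

```python
import json


def render_rule_file(group_name, alerts):
    """Render alerting rules as rule-file YAML text.

    `alerts` is a list of dicts with 'alert' and 'expr' keys and an
    optional 'for' key (hypothetical input shape for this sketch).
    """
    out = ["groups:", f"  - name: {json.dumps(group_name)}", "    rules:"]
    for rule in alerts:
        out.append(f"      - alert: {json.dumps(rule['alert'])}")
        out.append(f"        expr: {json.dumps(rule['expr'])}")
        if "for" in rule:
            out.append(f"        for: {json.dumps(rule['for'])}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    text = render_rule_file(
        "generated", [{"alert": "HighValue", "expr": "my_metric > 10", "for": "5m"}]
    )
    print(text, end="")
```

The script would write this text into a file matched by the rule_files glob (for example under rules/) and then signal Prometheus to reload. Running `promtool check rules` on the generated file before reloading is a cheap way to catch a bad expression.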

Re: [prometheus-users] Re: Start recording rule until N mins of data is collected

2020-04-08 Thread Brian Candler
On Wednesday, 8 April 2020 00:14:04 UTC+1, Murali Krishna Kanagala wrote: > > Every time the data is scrapped Prometheus does that calculation and > record the custom metric. 5min here is a sliding window with latest > metric's time stamp as the curren time and the previous one (if it exists >

[prometheus-users] Re: Historical exposed metrics memorization

2020-04-08 Thread Brian Candler
Whenever prometheus scrapes an exporter, it automatically creates any timeseries which don't exist (i.e. unique combination of metric name + labels). Therefore if your exporter is continuing to expose those timeseries - which you can easily check with a curl scrape - they will be recreated.

[prometheus-users] Re: Prometheus exporter - reading CSV file having data from the past day

2020-04-14 Thread Brian Candler
Sorry, but I'm afraid you cannot backfill historical data into prometheus. Prometheus will only scrape the current/latest value. Backfill is a feature being considered for the future. For now, you will need to look at a

[prometheus-users] Re: HowTo use fields or labels in Alertmanager when using the webhook receiver

2020-04-20 Thread Brian Candler
On Monday, 20 April 2020 13:37:42 UTC+1, Danny de Waard wrote: > > If i read correctly you refer to the payload which should contain fields i > can get from the payload. > Is this an example of such a payload?? > It looks like it to me. There's an official example in the documentation at

[prometheus-users] Re: Help Required for Prometheus config || multiple Labels Segregation on the basis of multiple targets in single single job name

2020-04-19 Thread Brian Candler
$1 is a capture group - a parenthesised expression within the regexp - so you need to include the parentheses to capture the value you want: regex: '(172*)' However, in a regexp, "*" means "zero or more instances of the preceding character". So what you've written matches "17" followed by zero
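The difference is easy to try out. The sketch below uses Python's `re.fullmatch` to imitate the fact that relabelling regexes are anchored at both ends; the helper name and the alternative pattern `(172\..*)` are only guesses at what was intended, not taken from the original config.

```python
import re


def relabel_capture(pattern, value):
    """Return what $1 would be, or None if the anchored regex does not match."""
    match = re.fullmatch(pattern, value)  # anchored at both ends
    return match.group(1) if match else None


# "172*" is "17" followed by zero or more "2" characters:
print(relabel_capture(r"(172*)", "17222"))        # matches the whole value
print(relabel_capture(r"(172*)", "172.16.0.1"))   # no match: the ".16.0.1" is not covered
# A hypothetical pattern for "anything starting with 172.":
print(relabel_capture(r"(172\..*)", "172.16.0.1"))
```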

Re: [prometheus-users] Sum of different Time Series for common lable

2020-04-19 Thread Brian Candler
You are adding different *metrics*?? Usually this is the wrong thing to do, because different metrics have different meanings, and adding different types of things generally makes no sense. This might mean you want a single metric with more labels to distinguish them. However in the case of

Re: [prometheus-users] Slicing the metrics scraped by HAPROXY Exporter.

2020-04-21 Thread Brian Candler
Those metrics like haproxy_frontend_http_requests_total are counters. If you want to see how much they've increased in the last 24 hours, then you need to use this function: increase(haproxy_frontend_http_requests_total[24h])

Re: [prometheus-users] StartsAt time is right but endsAt time in alertmanager API is not matching. What does endsAt time actually stands for ?

2020-04-21 Thread Brian Candler
Alertmanager doesn't cover all use cases, and in particular I don't think it does much in the way of time-based escalation. However you can forward alerts to a higher-level management system like OpsGenie, VictorOps, PagerDuty etc.

[prometheus-users] Re: Reg: change the device label for filesystem or netstat metrics.

2020-04-21 Thread Brian Candler
Grafana provides this function: label_values(metric, label) to get the label values for a specific metric only, which sounds like what you need - just pick one metric which has the network devices,

Re: [prometheus-users] Slicing the metrics scraped by HAPROXY Exporter.

2020-04-21 Thread Brian Candler
On Tuesday, 21 April 2020 10:26:47 UTC+1, Yagyansh S. Kumar wrote: > > If I want to see the number of requests that increased on let say 17th > April. How to approach that? > > If querying through the API: choose an instant for your query which is 00:00:00 on 18th April.
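As a sketch of that API call: an instant query to /api/v1/query with the `time` parameter set to midnight at the end of the day of interest, combined with a 24h range, covers the whole of the previous day. The base URL, the helper name and the choice of UTC are assumptions for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def increase_query_url(base_url, metric, day_end, window="24h"):
    """Build an instant-query URL for increase(metric[window]) evaluated at day_end.

    `day_end` must be a timezone-aware datetime; it is sent as a Unix timestamp.
    """
    params = {
        "query": f"increase({metric}[{window}])",
        "time": f"{day_end.timestamp():.0f}",
    }
    return f"{base_url}/api/v1/query?{urlencode(params)}"


if __name__ == "__main__":
    # 00:00:00 on 18 April (UTC assumed here) covers all of 17 April.
    print(
        increase_query_url(
            "http://localhost:9090",  # hypothetical Prometheus address
            "haproxy_frontend_http_requests_total",
            datetime(2020, 4, 18, tzinfo=timezone.utc),
        )
    )
```

If "17th April" should mean a local calendar day rather than a UTC one, pass a datetime in that timezone instead.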

Re: [prometheus-users] Slicing the metrics scraped by HAPROXY Exporter.

2020-04-21 Thread Brian Candler
Not sure what you mean - time difference between what and what?

Re: [prometheus-users] Prometheus pods are going to crashloopbackoff

2020-04-20 Thread Brian Candler
By giving each pod its own storage volume. That's what "stateful" means here - "I need to maintain my own state".

Re: [prometheus-users] Excluding particular network interfaces from monitoring for some servers.

2020-04-20 Thread Brian Candler
Please think carefully about what you've just written. What you've said in effect is: "My configuration isn't working. Please help me. But I'm not going to show you my configuration. I'm also not going to tell you what unexpected behaviour I'm seeing. I want you to guess what the problem
