So I tried it again in standalone mode (spark-shell) and the df.observe()
functionality works. I tried sum, count, conditional aggregations using
'when', etc., and all of this works in spark-shell. But with spark-on-k8s in
cluster mode, only using lit() as the aggregation column works; no other
aggregation does.
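
For illustration, a minimal sketch of the two cases described above (the
dataframe, metric, and column names are my own stand-ins, not from the
original report):

import org.apache.spark.sql.functions._

// Plain lit() metric: reported to work in spark-on-k8s cluster mode.
val litOnly = df.observe("lit_metrics", lit(1).as("one"))

// Aggregate metrics: reported to always come back as 0 in cluster mode,
// although they work in spark-shell (standalone).
val aggMetrics = df.observe(
  "agg_metrics",
  count(lit(1)).as("rows"),
  sum(col("value")).as("valueSum"),
  sum(when(col("key") === "droppedRecords", col("value"))).as("dropped")
)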
Hi, thanks for the reply. I tried it out today but I am unable to get it to
work in cluster mode: the aggregation result is always 0. It works fine in
standalone mode with spark-shell, but with Spark on Kubernetes in cluster
mode it doesn't.
You can try out "Dataset.observe" added in Spark 3, which enables arbitrary
metrics to be logged and exposed to streaming query listeners.
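
As a rough sketch of the wiring (the metric name "my_metrics" and the
"dropped" column are assumptions for the example, not part of the API):

import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Log the metrics collected by observe() after every micro-batch.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    // observedMetrics maps the name given to observe() to a Row of results.
    Option(event.progress.observedMetrics.get("my_metrics")).foreach { row =>
      println(s"droppedRecords = ${row.getAs[Long]("dropped")}")
    }
  }
})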
On Tue, Nov 3, 2020 at 3:25 AM meetwes wrote:
Hi, I am looking for the right approach to emit custom metrics for a Spark
structured streaming job.

*Actual Scenario:*
I have an aggregated dataframe, let's say with (id, key, value) columns. One
of the KPIs could be 'droppedRecords', and the corresponding value column has
the number of dropped records.
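
A minimal sketch of how that scenario could map onto observe() (the
aggregatedDf variable, the "my_metrics" name, and the console sink are my
assumptions for the example):

import org.apache.spark.sql.functions._

// Attach the droppedRecords metric to the aggregated streaming dataframe
// before the sink; results are reported per micro-batch to query listeners.
val observed = aggregatedDf.observe(
  "my_metrics",
  sum(when(col("key") === "droppedRecords", col("value"))).as("dropped")
)

val query = observed.writeStream
  .format("console")
  .outputMode("update")
  .start()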