hlteoh37 commented on code in PR #766:
URL: https://github.com/apache/flink-web/pull/766#discussion_r1858129566
##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+---
+title: "Introducing the new Prometheus connector"
+date: "2024-11-26T00:00:00.000Z"
+authors:
+- nicusX:
+  name: "Lorenzo Nicora"
+---
+
+
+We are excited to announce a new sink connector that enables writing data to Prometheus ([FLIP-312](https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector)). This articles introduces the main features of the connector, the reasoning behind design decisions.

Review Comment:
```suggestion
We are excited to announce a new sink connector that enables writing data to Prometheus ([FLIP-312](https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector)). This article introduces the main features of the connector, and the reasoning behind design decisions.
```

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+This connector allows writing data to Prometheus using the [Remote-Write](https://prometheus.io/docs/specs/remote_write_spec/) push interface, which lets you write time-series data to Prometheus at scale.
+
+## Motivations for a Prometheus connector
+
+Prometheus is an efficient time-series database optimized for building real-time dashboards and alerts, typically in combination with Grafana or other visualization tools.
+
+Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink already has [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus.

Review Comment:
```suggestion
Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink has existing [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus for this purpose.
```

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink already has [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus.
+
+So, why do we need a connector?
+
+Prometheus can serve as a general-purpose observability time-series database, beyond traditional infrastructure monitoring. For example, it can be used to monitor IoT devices, sensors, connected cars, media streaming devices, and any resource that streams events or measurements continuously.
+
+Observability data from these use cases differs from metrics generated by compute resources. Events are pushed by devices instead of being scraped, resulting in irregular frequency. Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times. The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly. Finally, events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.
+
+### Flink as observability events pre-processor
+
+You can address the challenges above by pre-processing raw events with Flink. Aggregation over short time windows can reduce frequency and cardinality, filtering can remove noise, and enrichment with reference data can add additional context. These are all tasks that Flink can perform efficiently, and at scale.
+
+What has been missing until now is an easy, reliable way to write from Flink to Prometheus at scale.
+

Review Comment:
This might be better merged into above section!

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+### Non-trivial implementing an efficient Remote-Write client
+
+You could implement a sink from scratch or use AsyncIO to call the Prometheus Remote-Write HTTP endpoint, but you would need to manage all aspects yourself. Prometheus Remote-Write has no high-level client, so you would need to build on top of a low-level HTTP client. Additionally, Remote-Write can be inefficient unless writes are batched and parallelized. Error handling can be complex, and specifications demand strict behaviors (see [Strict Specifications, Lenient Implementations](#strict-specifications-lenient-implementations)).

Review Comment:
Maybe we can add to the bold/bulleted list! e.g.
* **Non-trivial implementation of efficient Remote-Write Client**. <Details>

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+This connector allows writing data to Prometheus using the [Remote-Write](https://prometheus.io/docs/specs/remote_write_spec/) push interface, which lets you write time-series data to Prometheus at scale.
+
+Observability data from these use cases differs from metrics generated by compute resources. Events are pushed by devices instead of being scraped, resulting in irregular frequency. Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times. The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly. Finally, events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.

Review Comment:
Wonder if we could/should rephrase to make the benefits clearer. e.g.
* **Events re-ordering**: Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times.
* **Reduce cardinality**: The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly.
* **Event Enrichment**: Events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
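As a side note on the windowed pre-aggregation the thread discusses (reducing event frequency and cardinality before writing to Prometheus), here is a minimal sketch in plain Java. It is illustrative only, independent of the Flink and connector APIs; the class and method names are made up for this example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: buckets raw (timestamp -> value) samples into fixed
// windows and averages each bucket, so far fewer points are written downstream.
public class WindowedAggregation {

    // samples: epoch-millis timestamp -> measured value.
    // Returns a sorted map of window start time -> average value in that window.
    public static SortedMap<Long, Double> aggregate(Map<Long, Double> samples, long windowMillis) {
        Map<Long, List<Double>> buckets = new HashMap<>();
        for (Map.Entry<Long, Double> e : samples.entrySet()) {
            // Align each timestamp to the start of its window.
            long windowStart = (e.getKey() / windowMillis) * windowMillis;
            buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(e.getValue());
        }
        SortedMap<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, List<Double>> e : buckets.entrySet()) {
            double avg = e.getValue().stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0.0);
            result.put(e.getKey(), avg);
        }
        return result;
    }
}
```

In a real job this logic would run inside a Flink windowed aggregation, with the output records handed to the Prometheus sink; the sketch only shows why pre-aggregation shrinks the write volume.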