hlteoh37 commented on code in PR #766:
URL: https://github.com/apache/flink-web/pull/766#discussion_r1858129566
##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+---
+title: "Introducing the new Prometheus connector"
+date: "2024-11-26T00:00:00.000Z"
+authors:
+- nicusX:
+  name: "Lorenzo Nicora"
+---
+
+
+We are excited to announce a new sink connector that enables writing data to Prometheus ([FLIP-312](https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector)). This articles introduces the main features of the connector, the reasoning behind design decisions.

Review Comment:
```suggestion
We are excited to announce a new sink connector that enables writing data to Prometheus ([FLIP-312](https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector)). This article introduces the main features of the connector, and the reasoning behind design decisions.
```

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+This connector allows writing data to Prometheus using the [Remote-Write](https://prometheus.io/docs/specs/remote_write_spec/) push interface, which lets you write time-series data to Prometheus at scale.
+
+## Motivations for a Prometheus connector
+
+Prometheus is an efficient time-series database optimized for building real-time dashboards and alerts, typically in combination with Grafana or other visualization tools.
+
+Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink already has [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus.

Review Comment:
```suggestion
Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink has existing [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus for this purpose.
```

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+Prometheus is commonly used to monitor compute resources, IT infrastructure, Kubernetes clusters, applications, and cloud resources. It can also be used to observe your Flink cluster and Flink jobs. Flink already has [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/) to export metrics to Prometheus.
+
+So, why do we need a connector?
+
+Prometheus can serve as a general-purpose observability time-series database, beyond traditional infrastructure monitoring. For example, it can be used to monitor IoT devices, sensors, connected cars, media streaming devices, and any resource that streams events or measurements continuously.
+
+Observability data from these use cases differs from metrics generated by compute resources. Events are pushed by devices instead of being scraped, resulting in irregular frequency. Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times. The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly. Finally, events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.
+
+### Flink as observability events pre-processor
+
+You can address the challenges above by pre-processing raw events with Flink. Aggregation over short time windows can reduce frequency and cardinality, filtering can remove noise, and enrichment with reference data can add additional context. These are all tasks that Flink can perform efficiently, and at scale.
+
+What has been missing until now is an easy, reliable way to write from Flink to Prometheus at scale.
+

Review Comment:
This might be better merged into above section!

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+### Non-trivial implementing an efficient Remote-Write client
+
+You could implement a sink from scratch or use AsyncIO to call the Prometheus Remote-Write HTTP endpoint, but you would need to manage all aspects yourself. Prometheus Remote-Write has no high-level client, so you would need to build on top of a low-level HTTP client. Additionally, Remote-Write can be inefficient unless writes are batched and parallelized. Error handling can be complex, and specifications demand strict behaviors (see [Strict Specifications, Lenient Implementations](#strict-specifications-lenient-implementations)).

Review Comment:
Maybe we can add to the bold/bulleted list! e.g.
* **Non-trivial implementation of efficient Remote-Write Client**. <Details>

##########
docs/content/posts/2024-11-26-introducing-new-prometheus-connector.md:
##########
@@ -0,0 +1,201 @@
+This connector allows writing data to Prometheus using the [Remote-Write](https://prometheus.io/docs/specs/remote_write_spec/) push interface, which lets you write time-series data to Prometheus at scale.
+
+Observability data from these use cases differs from metrics generated by compute resources. Events are pushed by devices instead of being scraped, resulting in irregular frequency. Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times. The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly. Finally, events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.

Review Comment:
Wonder if we could/should rephrase to make the benefits clearer. e.g.
* **Events re-ordering**: Devices may be connected via mobile networks or even Bluetooth, causing events from each device to follow different paths and arrive at different times.
* **Reduce cardinality**: The frequency and cardinality of events emitted by these devices can be very high, making it challenging to derive insights directly.
* **Event Enrichment**: Events often lack contextual information and require enrichment to add additional dimensions before being sent to a time-series database.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
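As a side note on the windowed pre-aggregation the thread discusses (reducing event frequency and cardinality before writing to Prometheus), here is a minimal sketch in plain Java. It is illustrative only, independent of the Flink and connector APIs; the class and method names are made up for this example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: buckets raw (timestamp -> value) samples into fixed
// windows and averages each bucket, so far fewer points are written downstream.
public class WindowedAggregation {

    // samples: epoch-millis timestamp -> measured value.
    // Returns a sorted map of window start time -> average value in that window.
    public static SortedMap<Long, Double> aggregate(Map<Long, Double> samples, long windowMillis) {
        Map<Long, List<Double>> buckets = new HashMap<>();
        for (Map.Entry<Long, Double> e : samples.entrySet()) {
            // Align each timestamp to the start of its window.
            long windowStart = (e.getKey() / windowMillis) * windowMillis;
            buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(e.getValue());
        }
        SortedMap<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, List<Double>> e : buckets.entrySet()) {
            double avg = e.getValue().stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0.0);
            result.put(e.getKey(), avg);
        }
        return result;
    }
}
```

In a real job this logic would run inside a Flink windowed aggregation, with the output records handed to the Prometheus sink; the sketch only shows why pre-aggregation shrinks the write volume.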