Maximilian Michels created FLINK-31400:
------------------------------------------
Summary: Add autoscaler integration for Iceberg source
Key: FLINK-31400
URL: https://issues.apache.org/jira/browse/FLINK-31400
Project: Flink
Issue Type: New Feature
Components: Autoscaler, Kubernetes Operator
Reporter: Maximilian Michels
Fix For: kubernetes-operator-1.5.0
A very critical part in the scaling algorithm is setting the source processing
correctly such that the Flink pipeline can keep up with the ingestion rate. The
autoscaler does that by looking at the {{pendingRecords}} Flink source metric.
Even if that metric is not available, the source can still be sized according
to the busyTimeMsPerSecond metric, but there will be no backlog information
available. For Kafka, the autoscaler also determines the number of partitions
to avoid scaling higher than the maximum number of partitions.
In order to support a wider range of use cases, we should investigate an
integration with the Iceberg source. As far as I know, it does not expose the
pedingRecords metric, nor does the autoscaler know about other constraints,
e.g. the maximum number of open files.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)