Liu created FLINK-39016:
---------------------------
Summary: Add configurable TTL for ExecutionGraph cache independent
of web refresh interval
Key: FLINK-39016
URL: https://issues.apache.org/jira/browse/FLINK-39016
Project: Flink
Issue Type: Improvement
Components: Runtime / REST
Reporter: Liu
h1. Motivation
Currently, the ExecutionGraphCache TTL (time-to-live) is tightly coupled to the
web.refresh-interval configuration option. This coupling causes problems in the
following scenarios:
# State synchronization accuracy: When job states are synchronized to external
consumers (e.g., monitoring tools or external state collectors), users need
up-to-date ExecutionGraph data. With the default web.refresh-interval (3000 ms),
a cached ExecutionGraph can be up to 3 seconds old, which leads to inconsistent
or outdated state readings during synchronization.
# Different refresh requirements: Users may want a slower Web UI refresh rate
(to reduce browser load) but need fresh ExecutionGraph data for REST API
consumers, or vice versa.
# High-frequency monitoring: In production environments with strict SLA
requirements, monitoring systems may need immediate access to the latest
ExecutionGraph state without waiting for the cache to expire.
Use Case Example: With the ExecutionGraph cache TTL set to 0, every REST API
request retrieves the latest ExecutionGraph state, which avoids stale state
information during critical state synchronization operations.
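The TTL semantics described above can be illustrated with a minimal sketch. It is not Flink's actual ExecutionGraphCache implementation: the class name TtlCacheSketch, the injectable clock, and the loader callback are all hypothetical and exist only to show why a TTL of 0 makes every request fetch fresh state.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache for illustration; not Flink's ExecutionGraphCache. */
final class TtlCacheSketch<K, V> {

    /** A cached value together with the time at which it was fetched. */
    private static final class Entry<T> {
        final T value;
        final long fetchedAtMillis;

        Entry(T value, long fetchedAtMillis) {
            this.value = value;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final long ttlMillis;
    // Injectable clock (assumption) so that expiry can be exercised deterministically.
    private final LongSupplier clock;
    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();

    TtlCacheSketch(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    /** Returns the cached value if it is younger than the TTL, otherwise reloads it. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        // An entry is fresh only while its age is strictly below the TTL,
        // so a TTL of 0 never serves a cached value.
        if (cached != null && now - cached.fetchedAtMillis < ttlMillis) {
            return cached.value;
        }
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now));
        return fresh;
    }
}
```

In this sketch, a cache constructed with Duration.ZERO invokes the loader on every call, while a cache constructed with Duration.ofMillis(3000) reuses a value until it is 3000 ms old.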
h1. Proposed Changes
Add a new configuration option, web.execution-graph.cache-ttl, that sets the
ExecutionGraph cache TTL independently of web.refresh-interval.
h1. Compatibility
# Backward Compatibility: This change introduces a new optional configuration
option. Existing deployments that do not set it will use the default value (0),
which provides the most accurate state information. Users who prefer the
previous behavior can explicitly set web.execution-graph.cache-ttl to match
their web.refresh-interval.
# API Compatibility: No breaking changes to existing APIs.
# Configuration Migration: No migration required. The new configuration is
additive.
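The compatibility rules above can be sketched as a small lookup. This is an illustration only, not the proposed Flink code: the class CacheTtlResolutionSketch and the method resolveCacheTtl are hypothetical, and the sketch assumes values are given as plain millisecond numbers, which the issue does not specify.

```java
import java.time.Duration;
import java.util.Map;

/** Hypothetical helper showing how the cache TTL could be resolved from configuration. */
final class CacheTtlResolutionSketch {

    static final String CACHE_TTL_KEY = "web.execution-graph.cache-ttl";

    /**
     * Returns the explicitly configured TTL, or the proposed default of 0 when
     * the key is absent. web.refresh-interval is deliberately not consulted.
     * Assumption: the value is a plain number of milliseconds.
     */
    static Duration resolveCacheTtl(Map<String, String> conf) {
        String raw = conf.get(CACHE_TTL_KEY);
        if (raw == null) {
            return Duration.ZERO;
        }
        return Duration.ofMillis(Long.parseLong(raw.trim()));
    }
}
```

Under these assumptions, a deployment that sets only web.refresh-interval resolves to a TTL of 0, and a deployment that wants the previous behavior sets web.execution-graph.cache-ttl to the same value as its web.refresh-interval.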