[ https://issues.apache.org/jira/browse/FLINK-38291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-38291:
-----------------------------------
    Labels: pull-request-available  (was: )

> Reduce thread lock overhead for Flink UI REST handlers
> ------------------------------------------------------
>
>                 Key: FLINK-38291
>                 URL: https://issues.apache.org/jira/browse/FLINK-38291
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>    Affects Versions: 2.1
>            Reporter: Yunfeng Zhou
>            Priority: Major
>              Labels: pull-request-available
>
> In some of the Flink jobs in our company, we found that if a job has
> sophisticated logic and its parallelism (number of subtasks) is about 512 or
> 1024, it may take more than one minute for the Flink UI to display the DAG of
> the job.
> Debugging the corresponding REST handlers, we found that the latency is
> caused by repeated calls to synchronized methods such as
> MetricStore#getSubtaskMetricStore. When invoking such a method, the thread
> might need to wait for other synchronized methods to release the lock before
> it can enter, and this overhead accumulates as the invocation is repeated.
> Thus we propose to reduce the number of calls to these synchronized methods
> in order to reduce the latency of displaying the DAG.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
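
To illustrate the idea in the description above, here is a minimal Java sketch of replacing many per-subtask synchronized lookups with one bulk read. It is not the actual patch and does not use Flink's classes: SketchStore, getSubtaskMetrics, snapshotAll and the lockAcquisitions counter are hypothetical names made up for this example, standing in for a store whose methods share one monitor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubtaskMetricsSketch {

    /** Hypothetical store guarded by a single monitor; NOT Flink's MetricStore. */
    static final class SketchStore {
        private final Map<Integer, Map<String, String>> metricsBySubtask = new HashMap<>();
        private int lockAcquisitions = 0; // counts entries into the synchronized read methods

        synchronized void put(int subtask, String name, String value) {
            metricsBySubtask.computeIfAbsent(subtask, k -> new HashMap<>()).put(name, value);
        }

        /** Per-subtask lookup: every call has to acquire the monitor. */
        synchronized Map<String, String> getSubtaskMetrics(int subtask) {
            lockAcquisitions++;
            Map<String, String> metrics = metricsBySubtask.get(subtask);
            return metrics == null ? new HashMap<>() : new HashMap<>(metrics);
        }

        /** Bulk lookup: copies all subtasks while holding the monitor once. */
        synchronized Map<Integer, Map<String, String>> snapshotAll() {
            lockAcquisitions++;
            Map<Integer, Map<String, String>> copy = new HashMap<>();
            metricsBySubtask.forEach((k, v) -> copy.put(k, new HashMap<>(v)));
            return copy;
        }

        synchronized int lockAcquisitions() {
            return lockAcquisitions;
        }

        synchronized void resetCounter() {
            lockAcquisitions = 0;
        }
    }

    /** Handler-style loop that enters a synchronized method once per subtask. */
    static List<String> readPerSubtask(SketchStore store, int parallelism, String metric) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            values.add(store.getSubtaskMetrics(i).get(metric));
        }
        return values;
    }

    /** Same result, but the monitor is acquired once and the loop reads a local copy. */
    static List<String> readFromSnapshot(SketchStore store, int parallelism, String metric) {
        Map<Integer, Map<String, String>> snapshot = store.snapshotAll();
        List<String> values = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            Map<String, String> metrics = snapshot.get(i);
            values.add(metrics == null ? null : metrics.get(metric));
        }
        return values;
    }

    public static void main(String[] args) {
        int parallelism = 512; // parallelism mentioned in the ticket
        SketchStore store = new SketchStore();
        for (int i = 0; i < parallelism; i++) {
            store.put(i, "numRecordsIn", String.valueOf(i)); // made-up sample data
        }

        readPerSubtask(store, parallelism, "numRecordsIn");
        System.out.println("per-subtask lock acquisitions: " + store.lockAcquisitions());

        store.resetCounter();
        readFromSnapshot(store, parallelism, "numRecordsIn");
        System.out.println("snapshot lock acquisitions: " + store.lockAcquisitions());
    }
}
```

The sketch only counts how often a reader has to enter the monitor. It does not measure latency, and the trade-off is that the snapshot copies all data up front, so whether this helps in a real handler depends on how much data is copied and how contended the lock is.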