Till Rohrmann created FLINK-14043:
-------------------------------------

             Summary: SavepointMigrationTestBase is super slow
                 Key: FLINK-14043
                 URL: https://issues.apache.org/jira/browse/FLINK-14043
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends, Tests
    Affects Versions: 1.9.0, 1.8.1, 1.10.0
            Reporter: Till Rohrmann
            Assignee: Till Rohrmann
             Fix For: 1.10.0, 1.9.1, 1.8.3

The subclasses of {{SavepointMigrationTestBase}} take a very long time to execute. On my local machine:

* {{TypeSerializerSnapshotMigrationITCase}} takes 2min 30s
* {{StatefulJobWBroadcastStateMigrationITCase}} takes 1min 45s
* {{StatefulJobSavepointMigrationITCase}} takes 2min 5s

The reason for the long runtimes seems to be that we are using the {{AccumulatorCountingSink}}, which uses accumulators to signal when a job is done. Since the accumulators are sent with the TM heartbeats, the heartbeat interval determines how fast the client realizes that the job can be shut down. The default heartbeat interval is {{10 s}}, and hence it always takes at least 10 seconds until the client stops the job.

I suggest decreasing the heartbeat interval in {{SavepointMigrationTestBase}} to 500ms in order to speed up the tests. On my machine, the test runtimes with this setting are:

* {{TypeSerializerSnapshotMigrationITCase}} takes 13s
* {{StatefulJobWBroadcastStateMigrationITCase}} takes 10s
* {{StatefulJobSavepointMigrationITCase}} takes 11s

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
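
To illustrate why the heartbeat interval dominates the runtime, here is a minimal, self-contained sketch of the timing argument. It is not Flink code: the class {{HeartbeatDelay}} and its methods are hypothetical, and the model assumes that accumulator updates reach the client only at heartbeat ticks, with the first tick one full interval after the job starts. In Flink itself the interval is controlled by the {{heartbeat.interval}} option (in milliseconds).

```java
public class HeartbeatDelay {

    /**
     * Time (ms since job start) of the first heartbeat at or after the moment
     * the sink reached its target count. Assumption: heartbeats fire at
     * 1 * interval, 2 * interval, ... after the job starts.
     */
    public static long firstHeartbeatAtOrAfter(long finishedAtMs, long intervalMs) {
        if (intervalMs <= 0 || finishedAtMs < 0) {
            throw new IllegalArgumentException("interval must be > 0 and finish time >= 0");
        }
        // Ceiling division, but never earlier than the first tick.
        long ticks = Math.max(1L, (finishedAtMs + intervalMs - 1) / intervalMs);
        return ticks * intervalMs;
    }

    /** How long the finished job keeps running before the client can notice. */
    public static long visibilityDelayMs(long finishedAtMs, long intervalMs) {
        return firstHeartbeatAtOrAfter(finishedAtMs, intervalMs) - finishedAtMs;
    }

    public static void main(String[] args) {
        long finishedAtMs = 1_200L; // hypothetical: sink has seen all elements after 1.2 s
        for (long intervalMs : new long[] {10_000L, 500L}) {
            System.out.printf(
                    "interval=%d ms -> client notices %d ms after the job is done%n",
                    intervalMs, visibilityDelayMs(finishedAtMs, intervalMs));
        }
    }
}
```

Under these assumptions, a job that is effectively done after 1.2 s idles for another 8.8 s with the default 10 s interval, but only for 0.3 s with a 500 ms interval. This matches the direction of the measured speedup, although the real tests involve additional overhead that the model ignores.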