[jira] [Commented] (FLINK-19928) Introduce test configuration to detect instabilities better

Robert Metzger (Jira) Wed, 20 Jan 2021 01:41:18 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-19928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268477#comment-17268477
 ]


Robert Metzger commented on FLINK-19928:
----------------------------------------

I haven't spent more time on this since I created the ticket.

However, I just went through all documented config options and propose the 
following configuration:

heartbeat.interval: 1000 (default 10000)
metrics.fetcher.update-interval: 1000 (default 10000)
metrics.latency.interval: 1000 (default 0)
metrics.system-resource: true (default false)
metrics.system-resource-probing-interval: 1000 (default 5000)

Randomize the values of these configuration keys:
taskmanager.network.blocking-shuffle.compression.enabled: true
taskmanager.network.blocking-shuffle.type: file / mmap
taskmanager.network.detailed-metrics: true
taskmanager.network.netty.transport: epoll / nio
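
To make the proposal concrete, here is a minimal sketch of how the fixed and randomized values could be assembled. It uses plain Java maps rather than any Flink API. The class and method names (TestConfigSketch, fixedOverrides, randomizedOverrides, testConfiguration) are hypothetical, and treating the two boolean keys as a random choice between true and false is my assumption about what "randomize" should mean for them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of Flink: builds the proposed test configuration.
final class TestConfigSketch {

    /** Fixed overrides: intervals shortened so short-running tests exercise them. */
    static Map<String, String> fixedOverrides() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("heartbeat.interval", "1000");
        conf.put("metrics.fetcher.update-interval", "1000");
        conf.put("metrics.latency.interval", "1000");
        conf.put("metrics.system-resource", "true");
        conf.put("metrics.system-resource-probing-interval", "1000");
        return conf;
    }

    /** Randomized overrides: one value per key, chosen from the seeded generator. */
    static Map<String, String> randomizedOverrides(long seed) {
        Random rnd = new Random(seed);
        Map<String, String> conf = new LinkedHashMap<>();
        // Assumption: boolean keys are toggled between true and false.
        conf.put("taskmanager.network.blocking-shuffle.compression.enabled",
                String.valueOf(rnd.nextBoolean()));
        conf.put("taskmanager.network.blocking-shuffle.type",
                rnd.nextBoolean() ? "file" : "mmap");
        conf.put("taskmanager.network.detailed-metrics",
                String.valueOf(rnd.nextBoolean()));
        conf.put("taskmanager.network.netty.transport",
                rnd.nextBoolean() ? "epoll" : "nio");
        return conf;
    }

    /** Fixed and randomized overrides combined into one map. */
    static Map<String, String> testConfiguration(long seed) {
        Map<String, String> conf = fixedOverrides();
        conf.putAll(randomizedOverrides(seed));
        return conf;
    }
}
```

Deriving the random values from an explicit seed means a failing run can be reproduced as long as the seed is logged. The resulting map would then be applied to whatever configuration the test cluster is started with.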

WDYT?

> Introduce test configuration to detect instabilities better
> -----------------------------------------------------------
>
>                 Key: FLINK-19928
>                 URL: https://issues.apache.org/jira/browse/FLINK-19928
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Tests
>            Reporter: Robert Metzger
>            Priority: Major
>
> As part of debugging FLINK-19805, I noticed that invalid system states 
> sometimes depend on configuration values.
> For example, the "heartbeat.interval" is configured to 10 seconds by default. 
> Many tests do not run that long, making it difficult to find test 
> failures related to the heartbeat.
> As with intervals, retry configurations can also cause failures to be 
> hidden.
> It will be difficult to spread this to all tests, but adding it to the 
> {{MiniClusterResource}} would be a start.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
