[GitHub] [flink] AHeise commented on pull request #13551: [FLINK-19520][configuration] Add randomization of checkpoint config.

GitBox Mon, 07 Dec 2020 13:39:51 -0800


AHeise commented on pull request #13551:
URL: https://github.com/apache/flink/pull/13551#issuecomment-740196009



   > The randomization here only applies to tests that use `MiniClusterResource`, right?
   > I think this is a significant limitation.
   > 
   > Can we overcome it by reading some system property at the time of `ConfigOption` creation?
   > 
   > ```
   >    public static final ConfigOption<Boolean> ENABLE_UNALIGNED =
   >            ConfigOptions.key("execution.checkpointing.unaligned")
   >                    .booleanType()
   >                    // <-- here: the default is read from a system property
   >                    .defaultValue(Boolean.parseBoolean(
   >                            System.getProperty("execution.checkpointing.unaligned", "false")));
   > ```
   > 
   > The property can be set either
   > 
   >     * by maven/CI at the very beginning to a random value (seed="nanoTime")
   > 
   >     * or by the developer to a fixed value
   > 
   > 
   > I think it will be easier to reason about if it's set for the whole build (and logged in the beginning).
   > 
   > WDYT?
   
   That was the very first approach, but we didn't like that it means all tests run with only that one configuration, which has two issues:
   - For a large enough test matrix, a given configuration may occur very rarely. Assuming we have 10 different configurations (aligned, plus unaligned with 0, 10s, 1m timeout x 0, 1kb, 1m size limit), a particular combination has a 10% chance of occurring. But even when running 10 tests, there is still a roughly 35% chance (0.9^10) that a specific combination does not occur at all. If you add that to rarely occurring instabilities that are already hard to detect, the overall goal of good coverage is probably not reached. It becomes worse if we have more interdependent value combinations related to checkpointing, like different DSTL settings.
   - Even for more commonly occurring issues that show up while refactoring or adding a new feature, it's a disadvantage to have just one configuration for all tests. You have to manually cycle through the relevant settings by setting the system property to "force" your luck. Here, having more or less all combinations executed by the same AZP run on your feature branch will probably already find quite a few issues.
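
   To make the coverage argument in the first point concrete, here is a minimal sketch of the arithmetic. It assumes each run picks one of the configurations uniformly and independently; the class and method names are made up for illustration and are not part of Flink.

```java
// Hypothetical helper, not part of Flink: odds that one specific
// configuration is never chosen across a number of independent draws.
final class CoverageOdds {

    /**
     * Probability that a specific configuration is never picked when one of
     * `configurations` equally likely options is drawn `draws` times.
     */
    static double missProbability(int configurations, int draws) {
        double missOnce = 1.0 - 1.0 / configurations; // e.g. 0.9 for 10 options
        return Math.pow(missOnce, draws);
    }
}
```

   With 10 configurations, `CoverageOdds.missProbability(10, 10)` is about 0.35, so a given combination is skipped in roughly one out of three series of 10 draws.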
   
   Btw (but this is an orthogonal discussion), I'd like to bind the randomization seed to the commit id instead of a timestamp, so that it's easy for us to debug any issue (check out the particular commit and go). However, the current implementation might not be ideal in that regard (`EnvironmentInformation` might not be updated without a full build).
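
   A rough sketch of what a commit-bound seed could look like, as an illustration only. All names here are hypothetical, how the commit id is obtained is left out, and mixing in the test name (so that different tests get different but reproducible settings) is my assumption rather than something the PR does.

```java
import java.util.Random;

// Hypothetical sketch, not the PR's implementation.
final class CommitSeededConfig {

    // Example timeout values taken from the matrix discussed above.
    static final String[] TIMEOUTS = {"0", "10s", "1m"};

    /** Derives a seed that is stable for a given commit id and test name. */
    static long seedFor(String commitId, String testName) {
        // String.hashCode is specified by the JDK, so the result does not
        // depend on the machine or JVM that runs the test.
        return ((long) commitId.hashCode() << 32) ^ (testName.hashCode() & 0xFFFFFFFFL);
    }

    /** Picks a checkpointing setup reproducibly from the derived seed. */
    static String pick(String commitId, String testName) {
        Random random = new Random(seedFor(commitId, testName));
        if (!random.nextBoolean()) {
            return "aligned";
        }
        return "unaligned, timeout=" + TIMEOUTS[random.nextInt(TIMEOUTS.length)];
    }
}
```

   Checking out the same commit and running the same test then yields the same choice, which is the debugging property described above.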


