GitHub user NicoK commented on the issue: https://github.com/apache/flink/pull/5550

Since the tests go through various scenarios, they are the natural place to also verify the statistics, which should match the real world, despite the overhead this adds when making changes. The test failure is indeed interesting but unrelated, as suggested; I created [FLINK-8750](https://issues.apache.org/jira/browse/FLINK-8750) for it.
---