## What is the purpose of the change

With the re-design of the record writer's interaction with the
result (sub)partitions, flush requests can currently pile up in two scenarios:
- a previous flush request has not been handled completely yet or is still
enqueued, or
- the network stack is still polling from this subpartition and does not need
a new notification.

In low-latency settings (short output flusher intervals), this leads to an
increased number of notifications that can be avoided.
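
As an illustration, the sketch below models this guarded flush logic. It is a
minimal, self-contained example with hypothetical names
(`FlushGuardedSubpartition`, `notifyDataAvailable`, a plain `byte[]` queue),
not Flink's actual `PipelinedSubpartition`: a flush only notifies the consumer
if no notification is pending yet and the queue actually holds data.

```java
import java.util.ArrayDeque;

/**
 * Minimal sketch (hypothetical, not Flink's actual code) of a subpartition
 * that avoids piling up flush notifications.
 */
public class FlushGuardedSubpartition {

    private final ArrayDeque<byte[]> buffers = new ArrayDeque<>();

    /** True while a flush notification is pending and not yet handled. */
    private boolean flushRequested;

    private int notifications;

    public void add(byte[] buffer) {
        synchronized (buffers) {
            buffers.add(buffer);
        }
    }

    public void flush() {
        synchronized (buffers) {
            if (buffers.isEmpty() || flushRequested) {
                // Nothing to flush, or a previous request was not handled
                // yet: do not enqueue another notification.
                return;
            }
            flushRequested = true;
        }
        // Deliberately notify outside the lock to keep the critical
        // section small.
        notifyDataAvailable();
    }

    /** Consumer side: polls a buffer and resets the flag once drained. */
    public byte[] poll() {
        synchronized (buffers) {
            byte[] buffer = buffers.poll();
            if (buffers.isEmpty()) {
                flushRequested = false;
            }
            return buffer;
        }
    }

    /** Stands in for the notification sent to the network stack. */
    private void notifyDataAvailable() {
        notifications++;
    }

    /** Visible for the test sketch further below. */
    int getNotificationCount() {
        return notifications;
    }
}
```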

## Brief change log

- do not flush (again) in the scenarios mentioned above, relying on
`flushRequested` and the `buffer` queue size (see the sketch above)
- add extensive sanity checks to `SpillingAdaptiveSpanningRecordDeserializer`
(a sketch of such a check follows this list)
- several smaller hotfixes and improvements (please see the individual commits)
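
To give a flavor of the deserializer sanity checks, here is a hypothetical
example of such a defensive check (illustrative names and bound, not the actual
code added to `SpillingAdaptiveSpanningRecordDeserializer`): validate a record
length read from the wire before any buffers are allocated for it.

```java
/** Sketch of a defensive length check (hypothetical, not Flink's code). */
final class DeserializerSanityChecks {

    /** Purely illustrative upper bound for a single serialized record. */
    private static final int MAX_RECORD_LENGTH = 64 * 1024 * 1024;

    /** Fails fast on implausible lengths instead of corrupting state later. */
    static int checkedRecordLength(int recordLength) {
        if (recordLength < 0 || recordLength > MAX_RECORD_LENGTH) {
            throw new IllegalStateException(
                    "Serialized record length out of bounds: " + recordLength);
        }
        return recordLength;
    }
}
```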

## Verifying this change

This change is already covered by existing tests, plus a few new tests in
`PipelinedSubpartitionTest`.
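
For illustration, a test in the spirit of the new `PipelinedSubpartitionTest`
cases, written against the hypothetical `FlushGuardedSubpartition` sketch from
above rather than Flink's real classes (JUnit 5 assumed):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class FlushGuardedSubpartitionTest {

    @Test
    void repeatedFlushesTriggerOnlyOneNotification() {
        FlushGuardedSubpartition partition = new FlushGuardedSubpartition();
        partition.add(new byte[] {1, 2, 3});

        // Several flush requests arrive before the consumer polls...
        partition.flush();
        partition.flush();
        partition.flush();

        // ...but only the first one may notify the network stack.
        assertEquals(1, partition.getNotificationCount());
    }

    @Test
    void flushingAnEmptyQueueDoesNotNotify() {
        FlushGuardedSubpartition partition = new FlushGuardedSubpartition();
        partition.flush();
        assertEquals(0, partition.getNotificationCount());
    }
}
```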

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **yes**
(depending on the output flusher interval; the affected path runs per buffer
rather than per record)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
  - The S3 file system connector: **no**

## Documentation

  - Does this pull request introduce a new feature? **no**
  - If yes, how is the feature documented? **JavaDocs**

