The first commit comes from https://github.com/apache/flink/pull/6697

This solves GC issues in low-latency setups (small `flushTimeout`) with many
output channels, and generally improves low-latency performance significantly.
    
`OutputFlusher` remains for now, to trigger flushes for local subpartitions.
    
Registering periodic flushes in netty is unfortunately not the most beautiful
thing in the world at the moment. It is complicated by two things:

1. `flushTimeout` is known only in flink-streaming-java and `StreamTask`, which
   is long after the point at which the subpartitions are actually created.
2. We do not know beforehand which subpartitions will be local and which will
   be remote.
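
A minimal sketch of the idea, assuming a plain `ScheduledExecutorService` as a stand-in for the netty event loop. `PeriodicFlushSketch`, `Subpartition`, `registerPeriodicFlush` and `countFlushes` are hypothetical names for illustration only, not classes or methods from this PR. It shows a subpartition being created first and the periodic flush being registered later, once the timeout is known:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicFlushSketch {

    /** Hypothetical stand-in for a remote subpartition; it only counts flushes. */
    static final class Subpartition {
        final AtomicInteger flushCount = new AtomicInteger();

        void flush() {
            flushCount.incrementAndGet();
        }
    }

    /**
     * Registers a periodic flush on the shared scheduler. This can only happen
     * once flushTimeout is known, which is after the subpartition was created.
     */
    static ScheduledFuture<?> registerPeriodicFlush(
            ScheduledExecutorService eventLoop,
            Subpartition subpartition,
            long flushTimeoutMillis) {
        return eventLoop.scheduleWithFixedDelay(
                subpartition::flush,
                flushTimeoutMillis,
                flushTimeoutMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Runs the periodic flush for runMillis and returns how many flushes happened. */
    static int countFlushes(long flushTimeoutMillis, long runMillis) {
        // Assumption: a single-threaded scheduler plays the role of the event loop.
        ScheduledExecutorService eventLoop = Executors.newSingleThreadScheduledExecutor();
        Subpartition remote = new Subpartition(); // created before the timeout is known
        ScheduledFuture<?> handle = registerPeriodicFlush(eventLoop, remote, flushTimeoutMillis);
        try {
            Thread.sleep(runMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            handle.cancel(false);
            eventLoop.shutdown();
        }
        return remote.flushCount.get();
    }

    public static void main(String[] args) {
        System.out.println("flushes: " + countFlushes(5, 200));
    }
}
```

The point of the sketch is that all flush timers share one scheduler thread, rather than relying on a separate flusher thread per output, and that registration has to be deferred until both the timeout and the local/remote decision are available.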

![Benchmark results](https://docs.google.com/spreadsheets/d/e/2PACX-1vQ4ImkIhEVyd0JuC0_KBzSiZk1ugqRYYJ29fftj8f7bvQHsyNTrS9PBS2g7YaI6q7kfyHXpWWsnb5lq/pubchart?oid=1194867281&format=image)

Average throughput is significantly higher only in extreme cases. The more
important improvement here is solving (mitigating?) the current GC issues,
which is visible on the "min" graph: without this change, a 1 ms latency
setting with 1000+ output channels suffers from frequent, very long GC pauses.

## Verifying this change

This change is covered by existing network stack tests, stress tests, and
almost all IT cases.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (**yes** / no / 
don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


[ Full content available at: https://github.com/apache/flink/pull/6698 ]