[
https://issues.apache.org/jira/browse/FLINK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642115#comment-14642115
]
ASF GitHub Bot commented on FLINK-2406:
---------------------------------------
GitHub user StephanEwen opened a pull request:
https://github.com/apache/flink/pull/938
[FLINK-2406] [FLINK-2402] Abstract the BarrierBuffers and add a "at least
once" version (BarrierTracker)
Currently, the checkpointing mechanism provides "exactly once" guarantees.
Part of that is the step that temporarily "aligns" the data streams, performed
by the `BarrierBuffer`. This step increases the tuple latency temporarily.
By offering a version that does not provide "exactly-once", but only
"at-least-once", we can avoid the latency increase. This is relevant for
super-low-latency applications (< 10 ms) that tolerate duplicates.
This pull request does:
- Add an interface for the functionality of the barrier buffer, to allow
adding different implementations
- Add the `BarrierTracker` which only tracks checkpoint barriers, without
aligning the streams.
- Add broader tests for the BarrierBuffer, including trailing data and
barrier races.
- Checkpoint barriers are handled by the buffer directly, rather than
being returned and re-injected.
- Simplify logic in the BarrierBuffer and fix certain corner cases.
- Give access to spill directories properly via I/O manager, rather than
via GlobalConfiguration singleton.
- Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest"
- A lot of code style/robustness fixes (proplery define constants,
visibility, exception signatures)
@gyfora and @uce : The reworked variant of the BarrierBuffer eventually
returns `null` when all buffers and events have been consumed. The previous
impementation repeatedly returned the `EndOfPartitionEvent`. That seemed
inconsistent with the implementation of the "vanilla" InputGate, which
eventually returns `null` when all partitions have finished.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/StephanEwen/incubator-flink latency
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/938.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #938
----
commit 9d9e34e85c812eedb7b0f6365952c853fc531627
Author: Stephan Ewen <[email protected]>
Date: 2015-07-12T17:33:38Z
[tests] Add a manual test to evaluate impact of checkpointing on latency
commit 4cafad08a1b64bcaf0c95fe0b211eb740f6774b2
Author: Stephan Ewen <[email protected]>
Date: 2015-07-26T16:58:37Z
[FLINK-2406] [streaming] Abstract and improve stream alignment via the
BarrierBuffer
- Add an interface for the functionaliy of the barrier buffer (for later
addition of other implementatiions)
- Add broader tests for the BarrierBuffer, inluding trailing data and
barrier races.
- Checkpoint barriers are handled by the buffer directly, rather than
being returned and re-injected.
- Simplify logic in the BarrierBuffer and fix certain corner cases.
- Give access to spill directories properly via I/O manager, rather than
via GlobalConfiguration singleton.
- Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest"
- A lot of code style/robustness fixes (proplery define constants,
visibility, exception signatures)
commit ba9e5a3cc1aacb5078fdc50fee12f60ccefd2563
Author: Stephan Ewen <[email protected]>
Date: 2015-07-26T17:05:30Z
[FLINK-2402] [streaming] Add a stream checkpoint barrier tracker.
The BarrierTracker is non-blocking and only counts barriers.
That way, it does not increase latency of records in the stream, but can
only be
used to obtain "at least once" processing guarantees.
----
> Abstract BarrierBuffer to an exchangeable BarrierHandler
> --------------------------------------------------------
>
> Key: FLINK-2406
> URL: https://issues.apache.org/jira/browse/FLINK-2406
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Affects Versions: 0.10
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> We need to make the Checkpoint handling pluggable, to allow us to use
> different implementations:
> - BarrierBuffer for "exactly once" processing. This inevitably introduces a
> bit of latency.
> - BarrierTracker for "at least once" processing, with no added latency.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)