[
https://issues.apache.org/jira/browse/FLINK-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977489#comment-15977489
]
Stephan Ewen commented on FLINK-6141:
-------------------------------------
I had a chat with [~aljoscha] about this:
I think that we need to work with a model without buffering / spilling data. It
is not feasible and efficient to buffer/spill data from the main input streams.
Any design we come up with will have to work without buffering/spilling, or
will make Flink very fragile.
> Add buffering service for stream operators
> ------------------------------------------
>
> Key: FLINK-6141
> URL: https://issues.apache.org/jira/browse/FLINK-6141
> Project: Flink
> Issue Type: Sub-task
> Components: DataStream API
> Reporter: Aljoscha Krettek
>
> As mentioned in
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
> we need a way of buffering incoming elements until a side input that is
> required for processing them is ready.
> There has to be an implementation for non-keyed operators and for keyed
> operators because in keyed operators we need to ensure that we store the
> buffered elements in the correct key group when checkpointing.
> For the interface, I propose this:
> {code}
> @PublicEvolving
> public interface ElementBuffer<T, N> {
> /**
> * Adds the given element to the buffer for the given namespace.
> */
> void add(N namespace, T element);
> /**
> * Returns an {@code Iterable} over all buffered elements for the given
> namespace.
> */
> Iterable<T> values(N namespace);
> /**
> * Clears all buffered elements for the given namespace.
> */
> void clear(N namespace);
> }
> {code}
> {{AbstractStreamOperator}} would provide a method {{getElementBuffer()}} that
> would return the appropriate implementation for a non-keyed or keyed operator.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)