GitHub user uce opened a pull request:
https://github.com/apache/flink/pull/356
[FLINK-1350][FLINK-1359][Distributed runtime] Add blocking result partition
variant
- **Renames** runtime intermediate result classes (after an offline
discussion with @StephanEwen I realized that calling subpartitions *queue* was
rather confusing):
a) Removes "Intermediate" prefix
b) Queue => Subpartition
c) Iterator => View
Documentation is coming up soon in FLINK-1373.
- **[FLINK-1350](https://issues.apache.org/jira/browse/FLINK-1350)**: Adds
a *spillable result subpartition variant* for **BLOCKING results**, which
writes data to memory first and starts to spill (asynchronously) if not enough
memory is available to produce the result in-memory only.
Receiving tasks of BLOCKING results are only deployed after *all*
partitions have been fully produced. PIPELINED and BLOCKING results cannot be
mixed.
@StephanEwen, what is your opinion on these two points? We could also start
deploying receivers of blocking results as soon as a single partition is
finished and then send the update RPC calls later. The current solution is
simpler and results in fewer RPC calls (a single one per parallel producer).
- **[FLINK-1359](https://issues.apache.org/jira/browse/FLINK-1359)**: Adds
**simple state tracking** to result partitions with notifications after
partitions/subpartitions have been consumed. Each partition has to be consumed
at least once before it can be released.
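To make the FLINK-1359 state tracking a bit more concrete, here is a minimal sketch of the idea. It is not the code in this PR: the class name, the method names, and the `Runnable` notification hook are all hypothetical, and it assumes each subpartition is consumed exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of per-partition consumption tracking (not the PR's class).
class ResultPartitionStateSketch {

    // Number of subpartitions that have not been fully consumed yet.
    private final AtomicInteger pendingSubpartitions;

    // Hypothetical hook that is run once the whole partition has been consumed.
    private final Runnable onFullyConsumed;

    ResultPartitionStateSketch(int numSubpartitions, Runnable onFullyConsumed) {
        this.pendingSubpartitions = new AtomicInteger(numSubpartitions);
        this.onFullyConsumed = onFullyConsumed;
    }

    // Called when a consumer has finished reading one subpartition.
    void onSubpartitionConsumed() {
        int remaining = pendingSubpartitions.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("Subpartition consumed more than once");
        }
        if (remaining == 0) {
            // All subpartitions consumed: notify, so the partition can be released.
            onFullyConsumed.run();
        }
    }

    // A partition may only be released after everything has been consumed.
    boolean isReleasable() {
        return pendingSubpartitions.get() == 0;
    }
}
```

With two subpartitions, the hook fires only after the second `onSubpartitionConsumed()` call, and `isReleasable()` is `false` until then.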
---
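For the FLINK-1350 item above, the "memory first, then spill" behaviour can be sketched roughly as follows. This is only an illustration under assumptions: the class and its methods are hypothetical, the byte budget is an invented parameter, a `ByteArrayOutputStream` stands in for the spill file, and the spilling here is synchronous, whereas the PR spills asynchronously. Whether already buffered data is moved to disk on the first spill is also an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a spillable subpartition (not the PR's class).
class SpillableSubpartitionSketch {

    private final int memoryBudgetBytes;  // assumed per-subpartition memory budget
    private final List<byte[]> inMemory = new ArrayList<>();
    // Stands in for a spill file on disk.
    private final ByteArrayOutputStream spillTarget = new ByteArrayOutputStream();
    private int bytesInMemory;
    private boolean spilling;

    SpillableSubpartitionSketch(int memoryBudgetBytes) {
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    void add(byte[] buffer) {
        if (!spilling && bytesInMemory + buffer.length > memoryBudgetBytes) {
            // Not enough memory to keep the result in memory only:
            // move the buffered data to the spill target and keep spilling.
            spilling = true;
            for (byte[] b : inMemory) {
                spillTarget.write(b, 0, b.length);
            }
            inMemory.clear();
            bytesInMemory = 0;
        }
        if (spilling) {
            spillTarget.write(buffer, 0, buffer.length);
        } else {
            inMemory.add(buffer);
            bytesInMemory += buffer.length;
        }
    }

    boolean isSpilled() { return spilling; }

    int bytesInMemory() { return bytesInMemory; }

    int bytesSpilled() { return spillTarget.size(); }
}
```

As long as the added buffers fit into the budget, nothing is spilled; the first buffer that exceeds it switches the subpartition to spilling for the rest of its lifetime.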
@StephanEwen, I need some hints on how to integrate this with the
NepheleJobGraphGenerator and the BinaryUnionNode to set the blocking result
variants for results with multiple consumers only. We also need to sync for
upcoming fault tolerance features (do we have a separate issue for this?).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/uce/incubator-flink flink-1350-blocking
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/356.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #356
----
commit 73460bc88e5b447a7f7f29b499683f403ba0dd93
Author: Ufuk Celebi <[email protected]>
Date: 2015-01-06T16:11:08Z
[FLINK-1350][FLINK-1359][Distributed runtime] Add blocking result partition
variant
- Renames runtime intermediate result classes:
a) Removes "Intermediate" prefix
b) Queue => Subpartition
c) Iterator => View
- [FLINK-1350] Adds a spillable result subpartition variant for BLOCKING
results, which writes data to memory first and starts to spill
(asynchronously) if not enough memory is available to produce the
result in-memory only.
Receiving tasks of BLOCKING results are only deployed after *all*
partitions have been fully produced. PIPELINED and BLOCKING results
can not be mixed.
- [FLINK-1359] Adds simple state tracking to result partitions with
notifications after partitions/subpartitions have been consumed. Each
partition has to be consumed at least once before it can be released.
Currently there is no notion of historic intermediate results, i.e.
results are released as soon as they are consumed.
----