[
https://issues.apache.org/jira/browse/FLINK-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595044#comment-14595044
]
ASF GitHub Bot commented on FLINK-2221:
---------------------------------------
Github user mbalassi commented on the pull request:
https://github.com/apache/flink/pull/747#issuecomment-113898364
Thanks for the design outline, guys; it looks great. Two minor comments on
the implementation:
* Let us put more emphasis on the architecture for discarding or
compacting old checkpoint data. This seems like a minor issue, but we saw
during the recent release testing that it has implications. [1] In the current
version the `JobManager` discards old state, so it needs access to it. If I
understand correctly, that behavior remains unchanged after this PR.
* API: `KeyedDataStream` should be feasible if we really make it a more
general version of data stream. The interplay with windowed and connected
streams is interesting. The reason why e.g. windowing and grouping fit together
nicely is that the windowed groups contain only elements from one group. Do you
propose the same design for keyed streams as well?
[1] https://issues.apache.org/jira/browse/FLINK-2221
> Checkpoints to "file://" are not cleaned up
> -------------------------------------------
>
> Key: FLINK-2221
> URL: https://issues.apache.org/jira/browse/FLINK-2221
> Project: Flink
> Issue Type: Bug
> Components: Streaming
> Reporter: Aljoscha Krettek
>
> If you think about it, this could never work. The state handle cleanup logic
> happens purely on the JobManager. So what happens is that the TaskManagers
> create state in some directory, let's say /tmp/checkpoints, on the
> TaskManager. For cleanup, the JobManager gets the state handle and calls
> discard (on the JobManager). This tries to clean up the state in
> /tmp/checkpoints, but of course there is nothing there, since we are still on
> the JobManager.
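
The failure mode described above can be reproduced in a few lines. The following is only a minimal sketch, not Flink code: `LocalFileStateHandle`, `writeCheckpoint`, and the two temporary directories (standing in for the local disks of a TaskManager and the JobManager) are hypothetical names made up for illustration. It assumes the handle only records a local path, so `discard` acts on the disk of whichever node calls it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class LocalCheckpointCleanupDemo {

    /** Hypothetical stand-in for a state handle: it only remembers a local path. */
    static final class LocalFileStateHandle {
        private final String relativePath;

        LocalFileStateHandle(String relativePath) {
            this.relativePath = relativePath;
        }

        /** Deletes the file on the disk of the calling node; true if something was deleted. */
        boolean discard(Path localDiskRoot) throws IOException {
            return Files.deleteIfExists(localDiskRoot.resolve(relativePath));
        }
    }

    /** Simulates a TaskManager writing checkpoint data to its own local disk. */
    static LocalFileStateHandle writeCheckpoint(Path taskManagerDisk, String relativePath)
            throws IOException {
        Path file = taskManagerDisk.resolve(relativePath);
        Files.createDirectories(file.getParent());
        Files.write(file, new byte[] {1, 2, 3});
        return new LocalFileStateHandle(relativePath);
    }

    public static void main(String[] args) throws IOException {
        // Two separate directories model the local disks of two machines.
        Path taskManagerDisk = Files.createTempDirectory("taskmanager-disk");
        Path jobManagerDisk = Files.createTempDirectory("jobmanager-disk");

        LocalFileStateHandle handle = writeCheckpoint(taskManagerDisk, "tmp/checkpoints/chk-1");

        // The JobManager calls discard, but against its own disk.
        boolean deletedOnJobManager = handle.discard(jobManagerDisk);
        boolean stillOnTaskManager =
                Files.exists(taskManagerDisk.resolve("tmp/checkpoints/chk-1"));

        System.out.println("deleted by JobManager: " + deletedOnJobManager); // false
        System.out.println("still on TaskManager:  " + stillOnTaskManager);  // true
    }
}
```

In this sketch the cleanup only succeeds when `discard` is invoked against the disk that holds the data, or when both nodes see the same shared storage.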
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)