[ https://issues.apache.org/jira/browse/FLINK-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402300#comment-15402300 ]
Ufuk Celebi commented on FLINK-4154:
------------------------------------
Reverted in 0274f9c (release-1.1) and 0f92a6b (master).
> Correction of murmur hash breaks backwards compatibility
> --------------------------------------------------------
>
> Key: FLINK-4154
> URL: https://issues.apache.org/jira/browse/FLINK-4154
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.1.0
> Reporter: Till Rohrmann
> Assignee: Greg Hogan
> Priority: Blocker
> Fix For: 1.1.0
>
>
> The correction of Flink's murmur hash in commit [1] breaks Flink's
> backwards compatibility with respect to savepoints. The changed murmur
> hash, which is used to partition elements in a {{KeyedStream}}, changes
> the mapping from keys to subtasks, and with it the key space assigned to
> each subtask. Consequently, restoring an old savepoint (version 1.0)
> assigns state to subtasks that are no longer responsible for its keys.
> I think that this must be fixed for the upcoming 1.1 release. I see two
> options to solve the problem:
> - revert the changes, but then we don't know how the flawed murmur hash
> performs
> - develop tooling to repartition the state of old savepoints. This is
> probably not trivial, since a keyed stream can also contain non-partitioned
> state, which is not partitionable in all cases. And even if only partitioned
> state is used, we would need some kind of special operator which can
> repartition the state with respect to the key.
> I think that the latter option requires more thought and is thus
> unlikely to be done before the 1.1 release. Therefore, as a workaround, I
> think that we should revert the murmur hash changes.
> [1] https://github.com/apache/flink/commit/641a0d436c9b7a34ff33ceb370cf29962cac4dee
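
To illustrate the problem described in the issue, here is a minimal sketch of how changing the hash function moves keys between subtasks. The two hash functions are hypothetical stand-ins, not Flink's actual old and corrected murmur hash, and the modulo-based subtask assignment is likewise an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class HashChangeDemo {

    // Hypothetical stand-in for the hash in use when the savepoint was written.
    static int oldHash(int code) {
        return code * 0x9E3779B1;
    }

    // Hypothetical stand-in for the corrected hash (a murmur3-style finalizer).
    static int newHash(int code) {
        int h = code;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Assumed assignment rule: subtask index = hash modulo parallelism.
    // floorMod keeps the result in [0, parallelism) for negative hashes.
    static int subtaskFor(int hash, int parallelism) {
        return Math.floorMod(hash, parallelism);
    }

    // Returns the keys whose subtask differs between the two hash functions.
    static List<Integer> movedKeys(int numKeys, int parallelism) {
        List<Integer> moved = new ArrayList<>();
        for (int key = 0; key < numKeys; key++) {
            int before = subtaskFor(oldHash(key), parallelism);
            int after = subtaskFor(newHash(key), parallelism);
            if (before != after) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int numKeys = 1000;
        int parallelism = 4;
        List<Integer> moved = movedKeys(numKeys, parallelism);
        System.out.println(moved.size() + " of " + numKeys
                + " keys map to a different subtask after the hash change");
    }
}
```

Every key reported by movedKeys would have its state stored under one subtask in the old savepoint while its records are routed to another subtask after the change, which is the mismatch the issue describes.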
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)