[jira] [Commented] (FLINK-8181) Kafka producer's FlinkFixedPartitioner returns different partitions when a target topic is rescaled

ASF GitHub Bot (JIRA) Sun, 03 Dec 2017 11:16:51 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-8181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276036#comment-16276036
 ]


ASF GitHub Bot commented on FLINK-8181:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/5108
  
    I don't quite understand this bug and fix. Any failure/recovery in Flink
    will basically result in a loss of the cache. That means this does not give
    you any added guarantees here, only stickiness between recoveries.
    
    I would follow @aljoscha's advice and check what Kafka's own behavior for
    this is. The fact that Flink switches the partition when Kafka adds more
    partitions to a topic may also be considered correct. Making this stateful
    and sticky to its first-ever partitioning scheme also means that when
    scaling Kafka out, one needs to savepoint/restore the Flink job and tell it
    not to pick up previous state. Quite tricky as well.
    
    Is this an actual user-reported problem?
    
    If this code is added, it needs to be performance optimized:
      - map.containsKey() followed by map.get() is an antipattern (can we
        guard for that via spotbugs?): it costs two hash map accesses. Just
        do map.get() and check for null, which is only one hash map access.
      - Because a very common case will be having one topic, the code should
        be optimized for one topic, meaning memoization of the latest
        value/access to save the hash map lookup when the method is always
        invoked with the same string object.
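
A minimal sketch of the two optimizations described above, under stated assumptions: the class name `TopicPartitionCache`, its fields, and the `partitionFor` method are hypothetical and are not taken from the pull request or from Flink's code. The modulo assignment is only a stand-in for whatever the partitioner computes on a cache miss.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; not the actual FlinkFixedPartitioner code. */
final class TopicPartitionCache {

    private final Map<String, Integer> partitionByTopic = new HashMap<>();
    private final int subtaskIndex;

    // Memoized result of the most recent call (assumption: topics are non-null).
    private String lastTopic;
    private int lastPartition;

    TopicPartitionCache(int subtaskIndex) {
        this.subtaskIndex = subtaskIndex;
    }

    int partitionFor(String topic, int[] partitions) {
        // Fast path for the single-topic case: a reference comparison against
        // the last string object skips the hash map entirely.
        if (lastTopic != null && topic == lastTopic) {
            return lastPartition;
        }

        // One hash map access: get() and a null check, instead of
        // containsKey() followed by get().
        Integer cached = partitionByTopic.get(topic);
        if (cached == null) {
            // Stand-in assignment rule, fixed at first sight of the topic.
            cached = partitions[subtaskIndex % partitions.length];
            partitionByTopic.put(topic, cached);
        }

        lastTopic = topic;
        lastPartition = cached;
        return cached;
    }
}
```

The reference comparison is deliberate: it is cheaper than `equals()` and only has to hit when the caller passes the same string object repeatedly. An equal but distinct string falls through to the map lookup and still returns the cached partition. As noted above, such a cache lives only in memory, so it would be lost on any failure/recovery.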


> Kafka producer's FlinkFixedPartitioner returns different partitions when a 
> target topic is rescaled
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-8181
>                 URL: https://issues.apache.org/jira/browse/FLINK-8181
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>             Fix For: 1.4.0, 1.5.0
>
>
> On fixing FLINK-6288 and migrating the original {{FlinkFixedPartitioner}} to 
> the new partitioning API (commit 9ed9b68397b51bfd2b0f6e532212a82f771641bd), 
> the {{FlinkFixedPartitioner}} no longer returns identical target partitions 
> once a target topic is rescaled.
> This results in a behavioral regression when the {{FlinkFixedPartitioner}} is 
> used.
> The {{FlinkFixedPartitionerTest}} should also be strengthened to cover the 
> target topic rescaling case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
