A. Sophie Blee-Goldman created KAFKA-14444:
----------------------------------------------

             Summary: Simplify user experience of customizing partitioning strategy in Streams
                 Key: KAFKA-14444
                 URL: https://issues.apache.org/jira/browse/KAFKA-14444
             Project: Kafka
          Issue Type: New Feature
          Components: streams
            Reporter: A. Sophie Blee-Goldman


The current process of plugging a custom partitioning scheme into a Streams 
application is labor-intensive and extremely error-prone. While defining their 
topology, users must pay close attention to every place where an operator/node 
is connected to, or creates, a topic that will be produced to, or else print 
out the topology description and try to locate all sink nodes that way. If 
they miss passing their custom partitioner to even one such location in 
the topology, everything downstream will be affected by the 
inconsistent/unintended partitioning scheme.

It can also be easy for users to miss this process entirely and try to 
customize the partitioning scheme via the producer config. This does not work, 
and unfortunately results in a runtime exception that's difficult for users to 
interpret. Ideally we would provide a similar config for Streams where users 
could define a default implementation of the StreamPartitioner interface.

...unfortunately, this is not so straightforward. Unlike the case of the 
Producer config, where there is a clearly defined key and value type, there's 
no guarantee each sink node requiring the custom partitioner deals with the 
same key/value type as the others.

We could utilize the default.key.serde/default.value.serde configs for this, 
and only require users to plug in their partitioner where the key/value types 
differ from the default, but this would likely limit the usefulness of a 
default partitioner significantly. Alternatively, we could push this to the 
user and have them write a generic implementation class with type checking and 
handling, but this would be pretty awkward and error-prone as well.
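
To make the awkwardness of that second option concrete, here is a minimal sketch of what such a generic, type-checking partitioner might look like. It is not taken from Kafka or from KIP-878: SketchPartitioner is a hypothetical stand-in that only imitates the general shape of the StreamPartitioner interface so the example compiles without Kafka on the classpath, and the class names, the supported key types, and the "null means no opinion" convention are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Small driver so the sketch can be run directly.
class PartitionerSketch {
    public static void main(String[] args) {
        TypeCheckingPartitioner partitioner = new TypeCheckingPartitioner();
        System.out.println(partitioner.partition("orders", "customer-1", null, 4));
        System.out.println(partitioner.partition("clicks", 42L, null, 4));
    }
}

// Hypothetical stand-in for illustration only; NOT the real Kafka Streams interface.
interface SketchPartitioner<K, V> {
    // In this sketch, returning null means "no opinion, let the caller fall back".
    Integer partition(String topic, K key, V value, int numPartitions);
}

// A single "default" partitioner shared by every sink has to accept Object
// and branch on the runtime type of each key it might encounter.
final class TypeCheckingPartitioner implements SketchPartitioner<Object, Object> {
    @Override
    public Integer partition(String topic, Object key, Object value, int numPartitions) {
        if (key == null) {
            return null; // nothing to hash, so defer to the caller's fallback
        }
        final int hash;
        if (key instanceof String) {
            hash = Arrays.hashCode(((String) key).getBytes(StandardCharsets.UTF_8));
        } else if (key instanceof Long) {
            hash = Long.hashCode((Long) key);
        } else if (key instanceof byte[]) {
            hash = Arrays.hashCode((byte[]) key);
        } else {
            // Every key type introduced anywhere in the topology needs a branch
            // above; forgetting one only shows up at runtime.
            throw new IllegalArgumentException(
                "No partitioning rule for key type " + key.getClass().getName());
        }
        // floorMod keeps the result in [0, numPartitions) for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }
}
```

The compiler cannot help here: adding a new key type to the topology without adding a matching branch compiles fine and fails only once a record of that type reaches a sink, which is the kind of error-proneness described above.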

Either way, this will take some thought, which is why the idea was pulled from 
the proposal in KIP-878 and left for a follow-up KIP.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
