John Roesler created KAFKA-8396:
-----------------------------------
Summary: Clean up Transformer API
Key: KAFKA-8396
URL: https://issues.apache.org/jira/browse/KAFKA-8396
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: John Roesler
Currently, the KStream operators transformValues and flatTransformValues disable
context forwarding and force implementations to return the new values instead.
The reason is that we wanted to prevent the key from changing, since the whole
point of an `xValues` transformation is that we _do not_ change the key, and
hence don't need to repartition.
However, the chosen mechanism has some drawbacks: The Transform concept is
basically a way to plug in a custom Processor within the Streams DSL, but these
restrictions make it more like a MapValues with access to the context. For
example, even though you can still schedule punctuations, there's no way to
forward values as a result of them. So, as a user, it's hard to build a mental
model of how to use a TransformValues (because it's not quite a Transformer and
not quite a Mapper).
Also, logically, a Transformer can call forward as much as it wants, so a
Transformer and a FlatTransformer are effectively the same thing. On top of that,
we have TransformValues and FlatTransformValues, which are two more versions of
the same thing that exist just to implement the key restrictions. Internally, some of
these can send downstream by returning OR forwarding, and others can only
return. It's a lot for users to keep in mind.
We can clean up this API significantly by just allowing all transformers to
call `forward`. In the `Values` case, we can wrap the ProcessorContext in one
that checks the key is `equal` to the one that got passed in (i.e., saves a
reference and enforces equality with that reference in any call to `forward`).
Then, we can actually deprecate the `*ValueTransformer*` interfaces and remove
the restriction about calling forward.
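
To make the wrapping idea concrete, here is a minimal sketch of a key-checking wrapper. It does not use the real Kafka Streams `ProcessorContext`: `Forwarder`, `KeyPreservingForwarder`, `setCurrentKey`, and `KeyGuardSketch` are hypothetical names invented for illustration, and the choice of `IllegalStateException` is an assumption. The wrapper saves the key of the record being processed and rejects any `forward` call whose key is not `equal` to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the forwarding part of ProcessorContext;
// this is NOT the real Kafka Streams interface.
interface Forwarder<K, V> {
    void forward(K key, V value);
}

// Wraps a delegate and rejects any forward whose key differs from the
// key of the record currently being processed.
final class KeyPreservingForwarder<K, V> implements Forwarder<K, V> {
    private final Forwarder<K, V> delegate;
    private K currentKey;

    KeyPreservingForwarder(Forwarder<K, V> delegate) {
        this.delegate = delegate;
    }

    // Assumed hook: the framework would call this before handing each
    // record to the user's transformer.
    void setCurrentKey(K key) {
        this.currentKey = key;
    }

    @Override
    public void forward(K key, V value) {
        if (!Objects.equals(currentKey, key)) {
            throw new IllegalStateException(
                "Key must not change in a Values transformation: expected "
                    + currentKey + " but got " + key);
        }
        delegate.forward(key, value);
    }
}

final class KeyGuardSketch {
    // Forwards two values under the unchanged key, then tries to change the key.
    static List<String> run() {
        List<String> out = new ArrayList<>();
        KeyPreservingForwarder<String, Integer> ctx =
            new KeyPreservingForwarder<>((k, v) -> out.add(k + "=" + v));
        ctx.setCurrentKey("a");
        ctx.forward("a", 1);
        ctx.forward("a", 2);
        try {
            ctx.forward("b", 3); // different key: rejected by the wrapper
        } catch (IllegalStateException e) {
            out.add("rejected");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [a=1, a=2, rejected]
    }
}
```

With a guard like this, a `Values` transformer could forward any number of records while the no-repartition guarantee still holds.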
We can consider a further cleanup (TBD) to deprecate the existing Transformer
interface entirely, and replace it with one with a `void` return type. Then,
the Transform and FlatTransform cases collapse together, and we just need
Transform and TransformValues.
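
As a rough illustration of why the two cases collapse, the sketch below uses a hypothetical `void`-returning interface. `Output`, `VoidTransformer`, and `VoidTransformerSketch` are made-up names, not a proposed API. A one-to-one and a one-to-many transformation implement the same interface and differ only in how many times they call `forward`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the context's forwarding method.
interface Output<K, V> {
    void forward(K key, V value);
}

// Hypothetical replacement interface: nothing is returned,
// every result is sent downstream via forward.
interface VoidTransformer<K, V, KR, VR> {
    void transform(K key, V value, Output<KR, VR> out);
}

final class VoidTransformerSketch {
    // One-to-one: forwards exactly one record per input.
    static final VoidTransformer<String, String, String, Integer> LENGTH =
        (key, value, out) -> out.forward(key, value.length());

    // One-to-many: forwards one record per space-separated word.
    static final VoidTransformer<String, String, String, String> SPLIT =
        (key, value, out) -> {
            for (String word : value.split(" ")) {
                out.forward(key, word);
            }
        };

    static List<String> run() {
        List<String> results = new ArrayList<>();
        LENGTH.transform("k", "hello", (k, v) -> results.add(k + "=" + v));
        SPLIT.transform("k", "a b", (k, v) -> results.add(k + "=" + v));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [k=5, k=a, k=b]
    }
}
```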