Hi Ross, Thanks for the feedback!
For some "context," the change you mention was: https://cwiki.apache.org/confluence/display/KAFKA/KIP-478+-+Strongly+typed+Processor+API In addition to the spec on that page, there are links to the discussion and voting mailing list threads. The primary motivation was to go from a prior state in which there was absolutely no safety or checking possible regarding which types a parent should pass to a child or a child should accept from the parent. This was a practical choice: there have been several subtle bugs in Streams DSL features in the last several releases, in which some edge case causes the feature's processors to forward a wrongly- typed record downstream. I was trying pretty hard actually _not_ to design a whole new "V2" of the PAPI, since I really wanted to fix a design flaw that had cost us hundreds of hours of effort, both as maintainers and as users, and I wanted to fix it as fast as possible. Sadly, even though I was trying to be as incremental as possible, it still took a very long time to get from proposal to implementation, and unfortunately, the process of migrating the DSL's internal code is still not complete. This seems to suggest that it was actually a good choice not to bring in even more of a grand vision here, as there would have been even more delays. Ok, that's enough self-justification ;) Your appraisal is correct. The new PAPI typing does not handle the case of child nodes of different types, and if that's your situation, you will indeed have to set the outbound types to a supertype of all children's inbound types (such as Object). I like the spirit of your proposal, and I agree it would be good to move in a direction of having good type safety for your use case. If we can continue to evolve the API via additions and deprecations, it will be less disruptive for the overall community of users, but indeed for some kinds of changes, a clean break is better. It looks like your specific proposal could be achieved in place, but it also seems like you have a pretty clear idea of where you'd like to see the API go in the future. Actually we have the 3.0 release coming up, so it would be a favorable time for you to promote your vision. To help everyone discuss both large and small proposals unambiguously, we have a formal process for design reviews, called KIPs: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals If you do want to move forward with a proposal, please feel free to start drafting a KIP. If you need any help navigating that process, please reach out to us on the dev mailing list, or of course, we can continue to use this thread. I really appreciate your willingness to share your feedback and ideas. Thank you, John On Mon, 2021-03-22 at 14:46 +1100, Ross Black wrote: > Hi, > > I wanted to provide some feedback about the new API introduced in Kafka > streaming 2.7 for adding a Processor to the Topology (using PAPI). > > Prior to 2.7, the Processor was typed only by the input key values <K,V>, > and the ProcessorContext was untyped. > > In 2.7 ProcessorSupplier & Processor now have the additional params <..., > KForward, VForward> and the ProcessorContext also has <KForward, > VForward>. The *only* use of this appears to be to provide some typing for > the "forward" functions. > One of my common use cases is where I have multiple child nodes each of > which has different output <K,V>. To migrate to the new 2.7 API, seems to > require adding <.., Object, Object> parameters to every type which looks > noisy in the code and does nothing for improving type safety of the API. > > I would like to ask why the new type parameters were added when they do not > actually achieve typing for the multiple child use-case? > > Personally, I would prefer new methods on the ProcessorContext that allow > me to create a "typed" forwarder per child node. > > e.g. > > public interface ProcessorContext { > > /** Default forwarder. */ > <K, V> Forwarder<K, V> forwarder(); > > /** Forwarder for a specific child. */ > <K, V> Forwarder<K, V> forwarder(final String childName); > > } > > > > public interface Forwarder<K, V> { > void forward(Record<K, V> record); > } > > > With my suggested approach you at least get some typing per-topic (child > node). To me this also feels like it better represents the domain model of > single stream input with multiple outputs. Forwarder also feels like a > real entity in the model (similar to Producer), rather than just being a > method hacked into the context. > > > To me, the new 2.7 API feels like a step backwards. I also get the > impression that the streaming API continues to break each release without a > particular strong model / design behind it (particularly the sometimes > jarring disconnect between PAPI and streaming API). > Having learned from initial implementations, is it time to take a step back > to design a "V2" streaming API which blends the existing PAPI + streaming > API? > I am curious what others think? > > Thanks, > Ross