[jira] [Commented] (KAFKA-3576) Unify KStream and KTable API

Matthias J. Sax (JIRA) Mon, 13 Jun 2016 13:59:03 -0700

    [ https://issues.apache.org/jira/browse/KAFKA-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328242#comment-15328242 ]


Matthias J. Sax commented on KAFKA-3576:
----------------------------------------

I see two points:
(1) an explicit {{groupBy}} is similar to SQL, Pig, or the Spark DSL (this was 
also the motivation for [KAFKA-3337])
(2) users often forget the {{through()}} in 
{{stream.selectKey(...).through(...).aggregateByKey(...)}}, which is a 
non-intuitive operation
(2a) even if [KAFKA-3561] tackles the {{through}} problem, an explicit 
{{groupBy}} makes the re-distribution overhead visible
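
To illustrate why the re-distribution step matters, here is a toy model in 
plain Java. It does not use the Kafka Streams API: the class 
{{RepartitionSketch}} and its helpers ({{Rec}}, {{distribute}}, 
{{selectValueAsKey}}, {{repartition}}, {{countByKey}}) are hypothetical names, 
and hash-modulo partitioning is an assumption made only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (NOT the Kafka Streams API): shows why a per-partition
// aggregation after a key change is wrong unless records are re-distributed.
public class RepartitionSketch {

    // Hypothetical stand-in for a keyed record.
    static final class Rec {
        final String key;
        final String value;
        Rec(String key, String value) { this.key = key; this.value = value; }
    }

    // Assumption: records are assigned to partitions by key hash modulo count.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Place every record into the partition chosen by its current key.
    static List<List<Rec>> distribute(List<Rec> records, int numPartitions) {
        List<List<Rec>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) partitions.add(new ArrayList<>());
        for (Rec r : records) partitions.get(partitionFor(r.key, numPartitions)).add(r);
        return partitions;
    }

    // Models a key change: the value becomes the new key, but each record
    // stays in the partition it was already in.
    static List<List<Rec>> selectValueAsKey(List<List<Rec>> partitions) {
        List<List<Rec>> out = new ArrayList<>();
        for (List<Rec> p : partitions) {
            List<Rec> rekeyed = new ArrayList<>();
            for (Rec r : p) rekeyed.add(new Rec(r.value, r.value));
            out.add(rekeyed);
        }
        return out;
    }

    // Models the re-distribution step: records are shuffled so that equal
    // keys end up in the same partition again.
    static List<List<Rec>> repartition(List<List<Rec>> partitions) {
        List<Rec> all = new ArrayList<>();
        for (List<Rec> p : partitions) all.addAll(p);
        return distribute(all, partitions.size());
    }

    // Count records per key, independently within each partition.
    static List<Map<String, Integer>> countByKey(List<List<Rec>> partitions) {
        List<Map<String, Integer>> counts = new ArrayList<>();
        for (List<Rec> p : partitions) {
            Map<String, Integer> c = new TreeMap<>();
            for (Rec r : p) c.merge(r.key, 1, Integer::sum);
            counts.add(c);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Rec> input = new ArrayList<>();
        input.add(new Rec("a", "x"));
        input.add(new Rec("b", "x"));
        input.add(new Rec("c", "y"));
        input.add(new Rec("d", "y"));

        List<List<Rec>> rekeyed = selectValueAsKey(distribute(input, 2));

        // Forgetting the re-distribution: each new key is counted twice, partially.
        System.out.println(countByKey(rekeyed));              // [{x=1, y=1}, {x=1, y=1}]
        // With re-distribution: each key is counted once, in a single partition.
        System.out.println(countByKey(repartition(rekeyed))); // [{x=2}, {y=2}]
    }
}
```

In this model, skipping {{repartition}} silently yields partial counts rather 
than an error, which mirrors why a forgotten {{through()}} is easy to miss.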

> Unify KStream and KTable API
> ----------------------------
>
>                 Key: KAFKA-3576
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3576
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Guozhang Wang
>              Labels: api
>             Fix For: 0.10.1.0
>
>
> KTable aggregations follow the pattern 
> {{table.groupBy(...).aggregate(...)}}, and the data is repartitioned through an 
> internal topic based on the key selected in {{groupBy(...)}}.
> KStream aggregations, though, follow the pattern 
> {{stream.selectKey(...).through(...).aggregateByKey(...)}}. In other words, 
> users need to manually use a topic to repartition data, and the syntax 
> differs a bit from KTable's as well.
> h2. Goal
> To have similar APIs for aggregations of KStream and KTable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
