[ 
https://issues.apache.org/jira/browse/KAFKA-7658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-7658:
---------------------------------
    Description: 
We'd like to add a new API to the KStream object of the Streams DSL:

{code}
KTable KStream.toTable()

KTable KStream.toTable(Materialized)
{code}

The function re-interpret the event stream {{KStream}} as a changelog stream 
{{KTable}}. Note that this should NOT be treated as a syntax-sugar as a dummy 
{{KStream.reduce()}} function which always take the new value, as it has the 
following difference: 

1) an aggregation operator of {{KStream}} is for aggregating a event stream 
into an evolving table, which will drop null-values from the input event 
stream; whereas a {{toTable}} function will completely change the semantics of 
the input stream from event stream to changelog stream, and null-values will 
still be serialized, and if the resulted bytes are also null they will be 
interpreted as "deletes" to the materialized KTable (i.e. tombstones in the 
changelog stream).

2) the aggregation result {{KTable}} will always be materialized, whereas 
{{toTable}} resulted KTable may only be materialized if the overloaded function 
with Materialized is used (and if optimization is turned on it may still be 
only logically materialized if the queryable name is not set).

Therefore, for users who want to take a event stream into a changelog stream 
(no matter why they cannot read from the source topic as a changelog stream 
{{KTable}} at the beginning), they should be using this new API instead of the 
dummy reduction function.

> Add KStream#toTable to the Streams DSL
> --------------------------------------
>
>                 Key: KAFKA-7658
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7658
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Priority: Major
>              Labels: needs-kip, newbie
>
> We'd like to add a new API to the KStream object of the Streams DSL:
> {code}
> KTable KStream.toTable()
> KTable KStream.toTable(Materialized)
> {code}
> The function re-interpret the event stream {{KStream}} as a changelog stream 
> {{KTable}}. Note that this should NOT be treated as a syntax-sugar as a dummy 
> {{KStream.reduce()}} function which always take the new value, as it has the 
> following difference: 
> 1) an aggregation operator of {{KStream}} is for aggregating a event stream 
> into an evolving table, which will drop null-values from the input event 
> stream; whereas a {{toTable}} function will completely change the semantics 
> of the input stream from event stream to changelog stream, and null-values 
> will still be serialized, and if the resulted bytes are also null they will 
> be interpreted as "deletes" to the materialized KTable (i.e. tombstones in 
> the changelog stream).
> 2) the aggregation result {{KTable}} will always be materialized, whereas 
> {{toTable}} resulted KTable may only be materialized if the overloaded 
> function with Materialized is used (and if optimization is turned on it may 
> still be only logically materialized if the queryable name is not set).
> Therefore, for users who want to take a event stream into a changelog stream 
> (no matter why they cannot read from the source topic as a changelog stream 
> {{KTable}} at the beginning), they should be using this new API instead of 
> the dummy reduction function.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to