[ 
https://issues.apache.org/jira/browse/KAFKA-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226082#comment-16226082
 ] 

Guozhang Wang commented on KAFKA-6150:
--------------------------------------

A couple of issues to consider here: 

1. Restoration: repartition topics are still used as input topics for a task 
(i.e. a sub-topology) when the task is initialized to restore its state. With 
transient repartition topics we should no longer restore from repartition 
topics, but from source topics / changelog topics only. This means we should 
not reuse repartition topics as changelog topics.

2. Reset Tooling: today when we reset the application state, we delete the 
intermediate topic. Purging data and deleting the topic should not conflict, 
but it is worth double-checking the admin client to make sure no asynchronous 
processing could result in race conditions.

3. Exactly-Once: when committing, we write the offset as the currently 
consumed record's position + 1, indicating the next record to resume from. So 
deleting up to the currently consumed record's position should be fine.
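
To make the offset arithmetic in point 3 concrete, here is a minimal sketch 
that models a partition as an in-memory map keyed by offset. The class and 
method names are hypothetical and not part of Kafka Streams, and it assumes 
the purge takes an exclusive "delete before offset" bound, which is how I 
read KIP-107.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Kafka Streams.
public class RepartitionPurgeSketch {

    // On commit, the stored offset is the last consumed record's position + 1,
    // i.e. the next record to resume from.
    public static long committedOffset(long lastConsumedPosition) {
        return lastConsumedPosition + 1;
    }

    // Assumption: the purge takes an exclusive bound and removes every record
    // whose offset is strictly below it.
    public static TreeMap<Long, String> purgeBefore(Map<Long, String> log, long beforeOffset) {
        TreeMap<Long, String> remaining = new TreeMap<>();
        for (Map.Entry<Long, String> entry : log.entrySet()) {
            if (entry.getKey() >= beforeOffset) {
                remaining.put(entry.getKey(), entry.getValue());
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        // A toy partition holding offsets 0..4.
        TreeMap<Long, String> log = new TreeMap<>();
        for (long offset = 0; offset < 5; offset++) {
            log.put(offset, "record-" + offset);
        }

        long lastConsumed = 2L;
        long committed = committedOffset(lastConsumed);

        // Purging before the committed offset drops offsets 0..2 and keeps
        // the record the task would resume from.
        TreeMap<Long, String> remaining = purgeBefore(log, committed);
        System.out.println("committed offset = " + committed);
        System.out.println("remaining offsets = " + remaining.keySet());
    }
}
```

With the last consumed position at 2, the committed offset is 3 and offsets 3 
and 4 survive the purge, so the resume point is never deleted.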

> Make Repartition Topics Transient
> ---------------------------------
>
>                 Key: KAFKA-6150
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6150
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>              Labels: operability
>
> Unlike changelog topics, the repartition topics could just be short-lived. 
> Today users have different ways to configure them with short retention, such 
> as enforcing a short retention period or using AppendTime for repartition 
> topics. All of these are cumbersome, and Streams should just do this for the 
> users.
> One way to do it is to use the “purgeData” admin API (KIP-107) such that 
> after the offsets of the input topics are committed, if the input topics are 
> actually repartition topics, we would purge the data immediately.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
