[ 
https://issues.apache.org/jira/browse/KAFKA-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566419#comment-17566419
 ] 

Guozhang Wang commented on KAFKA-14069:
---------------------------------------

Hello [~ebrard], thanks for bringing this ticket up.

These internal topics are treated as repartition topics, and Kafka Streams 
uses the admin client's `DeleteRecords` API to periodically truncate them after 
they have been read, so they should not grow indefinitely. Did you observe that 
the delete-records requests are never issued (which would indicate a bug)? Or 
did you observe that the delete-records rate cannot keep up with the append 
rate (in which case you can consider tuning "repartition.purge.interval.ms")?
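
For the second case, a minimal sketch of lowering the purge interval could look 
like the following. It only builds the `Properties` object that would be passed 
to the `KafkaStreams` constructor, using plain string keys so that it compiles 
without the Kafka jars. The class name, the helper `buildStreamsProps`, the 
application id, the broker address and the 5-second value are all hypothetical 
placeholders, and the config key assumes a Kafka Streams version that supports 
"repartition.purge.interval.ms" (3.2 or later, to my knowledge).

```java
import java.util.Properties;

public class PurgeIntervalConfig {

    // Hypothetical helper: builds the properties one would hand to
    // `new KafkaStreams(topology, props)`. String keys are used instead of
    // the StreamsConfig constants so the sketch has no Kafka dependency.
    static Properties buildStreamsProps(String applicationId,
                                        String bootstrapServers,
                                        long purgeIntervalMs) {
        if (purgeIntervalMs < 0) {
            throw new IllegalArgumentException("purge interval must be >= 0");
        }
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Minimum time between two rounds of delete-records requests against
        // the repartition topics. A smaller value purges more often.
        props.setProperty("repartition.purge.interval.ms",
                          Long.toString(purgeIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; adjust to the actual application and cluster.
        Properties props = buildStreamsProps("fk-join-app", "localhost:9092", 5_000L);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Whether a lower interval actually helps depends on how often the application 
commits and on broker load, so treat the value above as a starting point for 
experimentation rather than a recommendation.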

> Allow custom configuration of foreign key join internal topics
> --------------------------------------------------------------
>
>                 Key: KAFKA-14069
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14069
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Emmanuel Brard
>            Priority: Minor
>
> Internal topics supporting foreign key joins (-subscription-registration-topic 
> and -subscription-response-topic) are automatically created with _infinite 
> retention_ (retention.ms=-1, retention.bytes=-1).
> As far as I understand, those topics are used for communication between tasks 
> that are involved in the FK join; the intermediate result, though, is 
> persisted in a compacted topic (-subscription-store-changelog).
> This means, if I understood correctly, that during normal operation of the 
> streams application, once a message is read from the registration/subscription 
> topic, it will not be read again, even in case of recovery (the position in 
> those topics is committed).
> Because we have very large tables being joined this way with a very high 
> change frequency, we end up with FK internal topics on the order of 1 or 2 
> TB. This is complicated to maintain, especially in terms of disk space.
> I was wondering if:
> - this infinite retention is really a required configuration, and if not,
> - this infinite retention could be replaced with a configurable one (for 
> example 1 week, meaning that I accept that in case of failure I must recover 
> my app within one week)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
