[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458456#comment-16458456 ]
ASF GitHub Bot commented on KAFKA-6535:
mjsax closed pull request #4730: KAFKA-6535: Set default retention ms for Streams repartition topics to Long.MAX_VALUE
URL: https://github.com/apache/kafka/pull/4730

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html
index 565bd0b263c..462824fdcb4 100644
--- a/docs/streams/upgrade-guide.html
+++ b/docs/streams/upgrade-guide.html
@@ -100,6 +100,7 @@ Upgrade Guide and API Changes + Streams API changes in 1.2.0 We have removed the skippedDueToDeserializationError-rate and skippedDueToDeserializationError-total metrics.
@@ -156,7 +157,10 @@ Streams API The new class To allows you to send records to all or specific downstream processors by name and to set the timestamp for the output record. Forwarding based on child index is not supported in the new API any longer. - + +https://cwiki.apache.org/confluence/x/DVyHB;>KIP-284 changed the retention time for repartition topics by setting its default value to Long.MAX_VALUE. +Instead of relying on data retention Kafka Streams uses the new purge data API to delete consumed data from those topics and to keep used storage small now. + Kafka Streams DSL for Scala is a new Kafka Streams client library available for developers authoring Kafka Streams applications in Scala. It wraps core Kafka Streams DSL types to make it easier to call when interoperating with Scala code. For example, it includes higher order functions as parameters for transformations avoiding the need anonymous classes in Java 7 or experimental SAM type conversions in Scala 2.11, automatic conversion between Java and Scala collection types, a way
diff --git a/docs/upgrade.html b/docs/upgrade.html
index 08cc892d24c..4fe7e20794e 100644
--- a/docs/upgrade.html
+++ b/docs/upgrade.html
@@ -75,6 +75,7 @@ Notable changes in 1 updated to aggregate across different versions. New Kafka Streams configuration parameter upgrade.from added that allows rolling bounce upgrade from older version. +https://cwiki.apache.org/confluence/x/DVyHB;>KIP-284 changed the retention time for repartition topics by setting its default value to Long.MAX_VALUE. New Protocol Versions
@@ -87,7 +88,6 @@ Upgrading a 1.2.0 Ka See Streams API changes in 1.2.0 for more details. - Upgrading from 0.8.x, 0.9.x, 0.10.0.x, 0.10.1.x, 0.10.2.x, 0.11.0.x or 1.0.x to 1.1.x Kafka 1.1.0 introduces wire protocol changes. By following the recommended rolling upgrade plan below, you guarantee no downtime during the upgrade. However, please review the notable changes in 1.1.0 before upgrading.
@@ -132,6 +132,7 @@ Upgrading from 0.8.x, 0.9.x, 0.1 Hot-swaping the jar-file only might not work. +

> Key: KAFKA-6535
> URL: https://issues.apache.org/jira/browse/KAFKA-6535
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Guozhang Wang
> Assignee: Khaireddine Rezgui
> Priority: Major
> Labels: needs-kip, newbie
>
> After KIP-220 / KIP-204, repartition topics in Streams are transient, so it is better to set their default retention to infinity to allow any records to be pushed to them with old timestamps (think: bootstrapping, re-processing) and just rely on the purging API to keep their storage small.
> More specifically, in {{RepartitionTopicConfig}} we have a few default overrides for repartition topic configs; we should just add an override for {{TopicConfig.RETENTION_MS_CONFIG}} to set it to Long.MAX_VALUE. This still allows users to override it themselves if they want via {{StreamsConfig.TOPIC_PREFIX}}. We need to add a unit test to verify this update takes effect.
> In addition to the code change, we also need doc changes in streams/upgrade_guide.html specifying this default value change.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
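Editorial illustration of the override path mentioned in the issue description: a minimal sketch, not code from the PR. It assumes that StreamsConfig.TOPIC_PREFIX resolves to the string "topic." and TopicConfig.RETENTION_MS_CONFIG to "retention.ms", and it hard-codes those literals so the snippet compiles without the Kafka jars. The class and method names are hypothetical. Note also that, as far as the prefix is documented, it supplies defaults for internal topics in general, so a value set this way is likely not limited to repartition topics.

```java
import java.util.Properties;

public class RepartitionRetentionOverride {
    // Assumed literal values of StreamsConfig.TOPIC_PREFIX and
    // TopicConfig.RETENTION_MS_CONFIG; hard-coded to avoid a Kafka dependency.
    static final String TOPIC_PREFIX = "topic.";
    static final String RETENTION_MS_CONFIG = "retention.ms";

    // Hypothetical helper: returns a copy of the given Streams properties with
    // a user-supplied retention for internal topics, replacing the default.
    static Properties withInternalTopicRetention(Properties base, long retentionMs) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty(TOPIC_PREFIX + RETENTION_MS_CONFIG, Long.toString(retentionMs));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("application.id", "retention-demo");

        // Seven days in milliseconds instead of the Long.MAX_VALUE default.
        long sevenDaysMs = 7L * 24 * 60 * 60 * 1000;
        Properties props = withInternalTopicRetention(base, sevenDaysMs);

        System.out.println(props.getProperty("topic.retention.ms"));
    }
}
```

In a real application the resulting Properties would be handed to the Streams client as usual; whether a finite retention is advisable depends on how old the timestamps of re-processed records can be.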
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16435307#comment-16435307 ]
Khaireddine Rezgui commented on KAFKA-6535:
Hi guys, I updated [KIP-284|https://cwiki.apache.org/confluence/display/KAFKA/KIP-284%3A+Set+default+retention+ms+for+Streams+repartition+topics+to+Long.MAX_VALUE] to accepted. Can we proceed with the merge?
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429723#comment-16429723 ]
Khaireddine Rezgui commented on KAFKA-6535:
Thank you, I got the permission. Can you take a look at the KIP and give me some feedback? :) [https://cwiki.apache.org/confluence/display/KAFKA/KIP-284%3A+Set+default+retention+ms+for+Streams+repartition+topics+to+Long.MAX_VALUE]
Thanks,
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429509#comment-16429509 ]
Matthias J. Sax commented on KAFKA-6535:
[~Khairy] Yes it does. What is your wiki ID so we can grant you permission?
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429352#comment-16429352 ]
Khaireddine Rezgui commented on KAFKA-6535:
Hi [~vvcephei], I have access to the Confluence wiki, but I don't have the create menu. Does it require permission?
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424263#comment-16424263 ]
Khaireddine Rezgui commented on KAFKA-6535:
Thank you [~vvcephei], I will take a look at the links.
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422538#comment-16422538 ]
John Roesler commented on KAFKA-6535:
Oh, if you're not subscribed to the dev mailing list, you should also do that. That's the forum for the [DISCUSS] and the [VOTE] threads.
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422537#comment-16422537 ]
John Roesler commented on KAFKA-6535:
Hey [~Khairy],
There's not much to it; the instructions here should get you started: [https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]
For reference, here's one I'm working on right now: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-274%3A+Kafka+Streams+Skipped+Records+Metrics]
Beyond the KIP itself, you also have to run a mailing list discussion, followed by a mailing list vote. The details are in this section: [https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals#KafkaImprovementProposals-Process].
Feel free to ask for clarification at any time!
-John
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421981#comment-16421981 ]
Khaireddine Rezgui commented on KAFKA-6535:
Hi guys, can someone help me create the KIP? It's my first :)
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410534#comment-16410534 ]
Guozhang Wang commented on KAFKA-6535:
I'd vote to still start a KIP though it is a very small one, as it is still a public change and for people who're relying on the default retention it would be less of a surprise.
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410509#comment-16410509 ]
John Roesler commented on KAFKA-6535:
It seems unlikely that there might be folks who just use the default and are adversely affected by the new default, so I'd "vote" not to bother with a KIP. But I'm also new in town...
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410082#comment-16410082 ]
Matthias J. Sax commented on KAFKA-6535:
This ticket is labeled "needs-kip" -- not sure about this. Thoughts?
[jira] [Commented] (KAFKA-6535) Set default retention ms for Streams repartition topics to Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395884#comment-16395884 ]
Guozhang Wang commented on KAFKA-6535:
Makes sense, I've changed the title accordingly. I've also updated the description so that anyone can pick it up more easily.

> Set default retention ms for Streams repartition topics to Long.MAX_VALUE
>
> Key: KAFKA-6535
> URL: https://issues.apache.org/jira/browse/KAFKA-6535
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Guozhang Wang
> Priority: Major
> Labels: needs-kip, newbie
>
> After KIP-220 / KIP-204, repartition topics in Streams are transient, so it is better to set their default retention to infinity to allow any records to be pushed to them with old timestamps (think: bootstrapping, re-processing) and just rely on the purging API to keep their storage small.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)