[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016395#comment-17016395 ]

Matthias J. Sax commented on KAFKA-7591:

SGTM.

> Changelog retention period doesn't synchronise with window-store size
> ---------------------------------------------------------------------
>
> Key: KAFKA-7591
> URL: https://issues.apache.org/jira/browse/KAFKA-7591
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Jon Bates
> Priority: Major
>
> When a new windowed state store is created, the associated changelog topic's
> `retention.ms` value is set to `window-size +
> CHANGELOG_ADDITIONAL_RETENTION_MS`
> h3. Expected Behaviour
> If the window-size is updated, the changelog topic's `retention.ms` config
> should be updated to reflect the new size
> h3. Actual Behaviour
> The changelog-topic's `retention.ms` setting is not amended, resulting in
> possible loss of data upon application restart
>
> n.b. Although it is easy to update changelog topic config, I logged this as
> `major` due to the potential for data-loss for any user of Kafka-Streams who
> may not be intimately aware of the relationship between a windowed store and
> the changelog config

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
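The retention rule quoted in the issue description can be made concrete with a small calculation. This is only a sketch: the class and method names are hypothetical and are not Kafka Streams APIs, and the one-day additional retention is an assumed value (the setting is configurable). It shows how far the topic's `retention.ms` falls short if the window size grows from one hour to seven days while the topic config stays as created.

```java
import java.time.Duration;

// Sketch only: the class and method names are hypothetical, not Kafka Streams APIs.
final class ChangelogRetention {

    // Assumed value for the additional-retention setting (one day); it is configurable.
    static final long ADDITIONAL_RETENTION_MS = Duration.ofDays(1).toMillis();

    // retention.ms as described in the issue: window size plus the additional retention.
    static long expectedRetentionMs(Duration windowSize, long additionalRetentionMs) {
        return Math.addExact(windowSize.toMillis(), additionalRetentionMs);
    }

    public static void main(String[] args) {
        // Value the topic was created with (one-hour windows).
        long atCreation = expectedRetentionMs(Duration.ofHours(1), ADDITIONAL_RETENTION_MS);
        // Value the topic would need after the window size is raised to seven days.
        long afterChange = expectedRetentionMs(Duration.ofDays(7), ADDITIONAL_RETENTION_MS);

        System.out.println("retention.ms at creation:        " + atCreation);
        System.out.println("retention.ms the new size needs: " + afterChange);
        System.out.println("shortfall if topic is unchanged: " + (afterChange - atCreation));
    }
}
```

Any changelog record older than the original `retention.ms` may already have been deleted by the broker, which is where the data loss on restart described in the issue would come from.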
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016372#comment-17016372 ]

Sophie Blee-Goldman commented on KAFKA-7591:

That's fair, I suppose if it _was_ the window size that changed then all bets are off anyway, while users who just changed the retention period will be able to benefit. We should just change it and log that we did, maybe in the meantime including a warning about what is/isn't a compatible change.

Tangentially, we should compile a list of compatible changes that can be made dynamically vs incompatible changes that require a reset, and document that somewhere. This seems like a common question/source of confusion
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016352#comment-17016352 ]

Matthias J. Sax commented on KAFKA-7591:

Personally, I don't see why updating the config and enabling this feature would make the situation worse, or why we would need to block this feature. We can't detect the discussed case today, and hence enabling this feature does not change the current "level of protection" (not better but also not worse); it would just make Kafka Streams more feature rich (i.e., better).

I don't see what we gain by logging a WARN if the expected config does not match the actual config: if the reason is indeed a change of the window size, the program is already starting up, and if we just log a WARN and proceed, the state will get "corrupted", so we don't really gain anything from my point of view.
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016294#comment-17016294 ]

Sophie Blee-Goldman commented on KAFKA-7591:

Right, that is why I say if we can't distinguish between a window size change (incompatible) and a retention period change (fine) we should _not_ alter the configs, and just log a warning if there is a mismatch. Once KAFKA-8307 is fixed then we can consider allowing Streams to change existing topic configs. For now, I think it would still be valuable to check whether there are contradictory configs and raise it to the user if so.
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015631#comment-17015631 ]

Matthias J. Sax commented on KAFKA-7591:

Yes, it would be a semantic issue. The issue would resolve itself naturally over time, though, as old windows are eventually discarded. New windows would be created correctly.

However, the main argument is that I don't think we can detect this case, and it's the user's responsibility to reset the application for this case – we can't really provide support for this atm – it's a more general issue tracked via https://issues.apache.org/jira/browse/KAFKA-8307
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015502#comment-17015502 ]

Matthias J. Sax commented on KAFKA-7591:

{quote}Shouldn't you need to reset the app if the window size changes?{quote}
Maybe, but this seems to be orthogonal.
{quote}But it wouldn't be able to distinguish a change in retention from a change in window size,{quote}
As above, seems to be orthogonal.

I would still prefer if KS would automatically update the topic configuration. Note that we do have an explicit `topic.` prefix to specify topic configs, and it seems reasonable to allow users to change those configs and have KS update the corresponding topic configuration (this also holds for `replication.factor` btw). But I think we can cover all this with a single PR – if a topic exists, we fetch its config, compare it to whatever config we computed, and issue an AlterTopicConfig request to update the config. (We would also log an INFO statement when this happens.)
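The fetch, compare, and alter flow proposed above can be sketched as a plain comparison of two config maps. This is a minimal illustration under stated assumptions, not the Kafka Streams implementation: `TopicConfigDiff` and `updatesNeeded` are hypothetical names, plain maps stand in for the configs fetched from the broker and computed from the topology, the config values are made up, and no Kafka client calls are made. The point where an alter-config request would be sent is only marked by a comment.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: plain maps stand in for topic configs; no Kafka client calls are made.
final class TopicConfigDiff {

    // Returns the entries of `expected` whose value differs from, or is absent in, `actual`.
    static Map<String, String> updatesNeeded(Map<String, String> actual,
                                             Map<String, String> expected) {
        Map<String, String> updates = new TreeMap<>();
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            if (!entry.getValue().equals(actual.get(entry.getKey()))) {
                updates.put(entry.getKey(), entry.getValue());
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        // Illustrative values: config currently on the topic vs. config computed at startup.
        Map<String, String> actual =
            Map.of("cleanup.policy", "compact,delete", "retention.ms", "90000000");
        Map<String, String> expected =
            Map.of("cleanup.policy", "compact,delete", "retention.ms", "691200000");

        Map<String, String> updates = updatesNeeded(actual, expected);
        if (!updates.isEmpty()) {
            // Stand-in for the INFO log line; the alter-config request would be issued here.
            System.out.println("INFO updating topic config: " + updates);
        }
    }
}
```

Only entries that actually differ end up in the update set, so a topic whose config already matches would trigger neither a request nor a log line.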
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015433#comment-17015433 ]

Jon Bates commented on KAFKA-7591:

Agreed! A WARN message could at least be picked up, even if synchronizing the retention period isn't feasible
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015408#comment-17015408 ]

Sophie Blee-Goldman commented on KAFKA-7591:

Shouldn't you need to reset the app if the window size changes? On the other hand, it might be a good idea to at least verify the topic configs (for all internal topics), and log a warning or even throw an exception if they don't match.

On a related note, users may want to increase the retention period to allow querying the state for longer – in that case it does seem reasonable for Streams to alter the changelog's retention. But it wouldn't be able to distinguish a change in retention from a change in window size, thus I think it's still better to just detect the discrepancy and alert the user so they can consider the best course of action (reset app or alter topic config)
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708021#comment-16708021 ]

John Roesler commented on KAFKA-7591:

I looked into the code, and for reference, the reason the config doesn't get propagated is that in `org.apache.kafka.streams.processor.internals.InternalTopicManager#makeReady`, we first fetch the number of partitions for each internal topic in the topology. If the topic exists and has the right number of partitions, then we do nothing.

There's no idempotent create operation, so we'd either have to idempotently "alter configs" on every topic or fetch the configs and only alter the ones that don't match the configs that come from the topology.

I don't think it would be practical, without changing the code structure, to do this only for window stores, and it might be nice to have changes from other internal topic configs propagated.
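The control flow described above can be modelled with a toy in-memory "broker". This is a rough sketch, not the `InternalTopicManager` code: the class `MakeReadySketch`, the nested `Topic` type, and the exception thrown on a partition mismatch are all assumptions made for illustration. It shows why a changed `retention.ms` never reaches an existing topic: the configs are only consulted on the create path.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a toy model of the described flow, not the InternalTopicManager code.
final class MakeReadySketch {

    // Hypothetical stand-in for a topic as it exists on the broker.
    static final class Topic {
        final int partitions;
        final Map<String, String> configs;

        Topic(int partitions, Map<String, String> configs) {
            this.partitions = partitions;
            this.configs = configs;
        }
    }

    // In-memory stand-in for the cluster's topic metadata.
    final Map<String, Topic> broker = new HashMap<>();

    // Creates the topic if it is missing. An existing topic with the expected
    // partition count is left untouched, so `configs` is ignored in that case.
    void makeReady(String name, int partitions, Map<String, String> configs) {
        Topic existing = broker.get(name);
        if (existing == null) {
            broker.put(name, new Topic(partitions, new HashMap<>(configs)));
        } else if (existing.partitions != partitions) {
            // Assumed handling for illustration; the real error path may differ.
            throw new IllegalStateException("partition count mismatch for " + name);
        }
        // Otherwise: topic exists with the right partition count, nothing to do.
    }

    public static void main(String[] args) {
        MakeReadySketch streams = new MakeReadySketch();
        // First start: the topic is created with the computed retention.
        streams.makeReady("app-store-changelog", 4, Map.of("retention.ms", "90000000"));
        // Restart with a larger window: the new retention is computed but never applied.
        streams.makeReady("app-store-changelog", 4, Map.of("retention.ms", "691200000"));
        System.out.println(streams.broker.get("app-store-changelog").configs);
    }
}
```

Either option mentioned in the comment (altering configs on every topic, or fetching and altering only mismatches) would amount to adding work to the final "nothing to do" branch.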
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675529#comment-16675529 ]

Matthias J. Sax commented on KAFKA-7591:

I updated this to an "improvement" as the described behavior is by design. It's not expected that the window-size is changed.