[
https://issues.apache.org/jira/browse/KAFKA-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christo Lolov updated KAFKA-15267:
----------------------------------
Labels: tiered-storage (was: )
> Cluster-wide disablement of Tiered Storage
> ------------------------------------------
>
> Key: KAFKA-15267
> URL: https://issues.apache.org/jira/browse/KAFKA-15267
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Christo Lolov
> Assignee: Christo Lolov
> Priority: Major
> Labels: tiered-storage
>
> h2. Summary
> KIP-405 defines the configuration {{remote.log.storage.system.enable}} which
> controls whether all resources needed for Tiered Storage to function are
> instantiated properly in Kafka. However, the interaction between remote data
> and Kafka if that configuration is set to false while there are still topics
> with {{remote.storage.enable}} set is undefined. {color:#ff8b00}*We would
> like to give customers the ability to switch off Tiered Storage on a cluster
> level and as such would need to define the behaviour.*{color}
> {{remote.log.storage.system.enable}} is a read-only configuration. This means
> that it can only be changed by *modifying the server.properties* and
> restarting brokers. As such, the {*}validity of values contained in it is
> only checked at broker startup{*}.
> This JIRA proposes a few behaviours and a recommendation on a way forward.
> h2. Option 1: Change nothing
> Pros:
> * No operation.
> Cons:
> * We do not solve the problem of moving back to older (or newer) Kafka
> versions not supporting TS.
> h2. Option 2: Remove the configuration, enable Tiered Storage on a cluster
> level and do not allow it to be disabled
> Always instantiate all resources for tiered storage. If no special ones are
> selected use the default ones which come with Kafka.
> Pros:
> * We solve the problem for moving between versions not allowing TS to be
> disabled.
> Cons:
> * We do not solve the problem of moving back to older (or newer) Kafka
> versions not supporting TS.
> * We haven’t quantified how many compute resources (CPU, memory) idle TS
> components occupy.
> * TS is a feature not required for running Kafka. As such, while it is still
> under development we shouldn’t put it on the critical path of starting a
> broker. In this way, a stray memory leak won’t impact anything on the
> critical path of a broker.
> * We are potentially swapping one problem for another. How does TS behave if
> one decides to swap the TS plugin classes when data has already been written?
> h2. Option 3: Hide topics with tiering enabled
> Customers cannot interact with topics which have tiering enabled. They cannot
> create new topics with the same names. Retention (and compaction?) do not
> take effect on files already in local storage.
> Pros:
> * We do not force data-deletion.
> Cons:
> * This will be quite involved - the controller will need to know when a
> broker’s server.properties have been altered; the broker will need to not
> proceed to delete logs it is not the leader or follower for.
> h2. {color:#00875a}Option 4: Do not start the broker if there are topics with
> tiering enabled{color} - Recommended
> This option has 2 different sub-options. The first one is that TS cannot be
> disabled on a cluster level if there are *any* tiering topics - in other words
> all tiered topics need to be deleted. The second one is that TS cannot be
> disabled on a cluster level if there are *any* topics with *tiering enabled*
> - they can have tiering disabled, but with a retention policy set to delete
> or retain (as per
> [KIP-950|https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement]).
> A topic can have tiering disabled and remain on the cluster as long as there
> is no *remote* data when TS is disabled cluster-wide.
> Pros:
> * We force the customer to be very explicit in disabling tiering of topics
> prior to disabling TS on the whole cluster.
> Cons:
> * You have to make certain that all data in remote is deleted (just a
> disablement of a tiered topic is not enough). How do you determine whether all
> remote data has expired if the policy is retain? If the retain policy in KIP-950
> knows that there is data in remote then this should also be able to figure it out.
> The common denominator is that there needs to be no *remote* data at the
> point of disabling TS. As such, the most straightforward option is to refuse
> to start brokers if there are topics with the {{remote.storage.enable}}
> configuration set. This in essence requires customers to clean any tiered topics before
> switching off TS, which is a fair ask. Should we wish to revise this later it
> should be possible.
> h2. Option 5: Make Kafka forget about all remote information
> Pros:
> * Clean cut
> Cons:
> * Data is lost the moment TS is disabled regardless of whether it is
> reenabled later on, which might not be the behaviour expected by customers.
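
The startup check recommended under Option 4 could look roughly like the Java sketch below. It is an illustration only, not Kafka code: the class name, the method names and the in-memory map of topic configurations are all hypothetical, and it only inspects the {{remote.storage.enable}} topic configuration. It does not attempt to detect leftover remote data, which the description identifies as the harder part of the problem.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the Option 4 startup check; this is not actual Kafka code.
public class TieredStorageStartupCheck {

    // Topic-level configuration named in the issue description.
    static final String TOPIC_REMOTE_ENABLE = "remote.storage.enable";

    /** Returns the names of topics that still have tiering enabled, sorted for stable output. */
    static List<String> tieredTopics(Map<String, Map<String, String>> topicConfigs) {
        return topicConfigs.entrySet().stream()
                .filter(e -> Boolean.parseBoolean(
                        e.getValue().getOrDefault(TOPIC_REMOTE_ENABLE, "false")))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /**
     * Throws if Tiered Storage is disabled cluster-wide while tiered topics remain.
     * The boolean stands in for the broker's remote.log.storage.system.enable value.
     */
    static void validate(boolean remoteLogStorageSystemEnable,
                         Map<String, Map<String, String>> topicConfigs) {
        if (remoteLogStorageSystemEnable) {
            return; // Tiered Storage is on, so there is nothing to check.
        }
        List<String> offending = tieredTopics(topicConfigs);
        if (!offending.isEmpty()) {
            throw new IllegalStateException(
                    "remote.log.storage.system.enable is false but these topics still have "
                            + TOPIC_REMOTE_ENABLE + "=true: " + offending);
        }
    }

    public static void main(String[] args) {
        // Made-up topics for illustration.
        Map<String, Map<String, String>> topics = Map.of(
                "orders", Map.of(TOPIC_REMOTE_ENABLE, "true"),
                "audit", Map.of("retention.ms", "604800000"));
        try {
            validate(false, topics);
        } catch (IllegalStateException e) {
            System.out.println("Refusing to start: " + e.getMessage());
        }
    }
}
```

Because {{remote.log.storage.system.enable}} is read-only and only validated at broker startup, a check of this shape would run once during startup rather than on a dynamic configuration change.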
--
This message was sent by Atlassian Jira
(v8.20.10#820010)