kamalcph commented on PR #16653: URL: https://github.com/apache/kafka/pull/16653#issuecomment-2248331269
> I found an issue while testing this. When we disable "remote.storage.enable" with the "retain" policy, it's good that we can now read data from the remote storage. But after the server is restarted, the log start offset will be reset to the log segment baseOffset [here](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogLoader.scala#L177-L180) because remote storage is disabled. I can make it account for the "disable policy". But since the default value of "disable policy" is "retain", even if remote storage has always been disabled, we will still treat all logs as having the "retain" policy. We can't use the local log start / log start check because we are checking the new log start offset here. Thoughts?
>
> I'll think about it tomorrow.

While reviewing this PR, I noticed that the current way of applying the topic config is confusing and leads to a lot of `if-else` cases. If we apply the config in the manner below, most of the issues will go away:

0. The default value of `remote.log.disable.policy` should be null.
1. To gracefully disable remote storage for a topic, the user should set `remote.log.disable.policy` to `retain`; we then stop the copy tasks. The expiration and follower tasks keep running in the background. Consumers can still read the archived data from remote storage, and the remote data gets deleted asynchronously. In this state, `remote.storage.enable` should still be set to `true`, since we are reading data from remote storage.
2. Once all the remote data has been deleted (i.e. after the retention time/size), the user can set `remote.storage.enable` to `false` to stop the respective RLM and RLMM resources for those topics.
3. If the user directly sets `remote.storage.enable` to `false`, then it should ungracefully delete all the remote segments, similar to setting `remote.log.disable.policy` to `delete`.
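To make the proposed semantics concrete, here is a minimal sketch of how the two topic configs could map to running tasks. It is a model of the proposal only, not Kafka code: `RemoteTasks`, `resolve_tasks`, and the field names are hypothetical, and treating `delete` with `remote.storage.enable=true` like the ungraceful path is an assumption the proposal does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RemoteTasks:
    """Hypothetical summary of what runs for a topic partition."""
    copy: bool         # upload newly rolled segments to remote storage
    expiration: bool   # delete remote segments that exceed retention
    follower: bool     # follower-side remote log tasks
    remote_read: bool  # serve consumer fetches from remote storage
    delete_all: bool   # purge every remote segment right away


def resolve_tasks(remote_storage_enable: bool,
                  disable_policy: Optional[str]) -> RemoteTasks:
    """Map (remote.storage.enable, remote.log.disable.policy) to tasks."""
    if disable_policy not in (None, "retain", "delete"):
        raise ValueError(f"unknown disable policy: {disable_policy!r}")

    # Steps 2 and 3: remote storage is off, so all RLM/RLMM work stops and
    # whatever is still in remote storage is deleted. After a graceful
    # drain (step 2) there is nothing left, so the purge is a no-op.
    # Assumption: the "delete" policy behaves the same way.
    if not remote_storage_enable or disable_policy == "delete":
        return RemoteTasks(copy=False, expiration=False, follower=False,
                           remote_read=False, delete_all=True)

    # Step 1: graceful disable. Copying stops, but reads, expiration and
    # follower tasks continue until retention drains the remote data.
    if disable_policy == "retain":
        return RemoteTasks(copy=False, expiration=True, follower=True,
                           remote_read=True, delete_all=False)

    # Step 0: policy is null, so tiered storage operates normally.
    return RemoteTasks(copy=True, expiration=True, follower=True,
                       remote_read=True, delete_all=False)
```

With a null default, the policy is only consulted when the user has set it explicitly, which is what removes most of the `if-else` branching: `resolve_tasks(True, "retain")` is the only state in which copying is stopped while remote reads stay available.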