kamalcph commented on PR #16653:
URL: https://github.com/apache/kafka/pull/16653#issuecomment-2248331269

   > I found an issue while testing this. When we disable 
"remote.storage.enable" with the "retain" policy, we can still read data from 
the remote storage, which is good. But after the server restarts, the log start 
offset is reset to the log segment baseOffset 
[here](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogLoader.scala#L177-L180)
 because remote storage is disabled. I can make it account for the "disable 
policy". But since the default value of the "disable policy" is "retain", even 
if remote storage has always been disabled, we would still treat all logs as 
having the "retain" policy. We can't use the local log start / log start check 
because we are checking the new log start offset here. Thoughts?
   > 
   > I'll think about it tomorrow.
   
   While reviewing this PR, I noticed that the current way of applying the 
topic config is confusing and leads to a lot of `if-else` cases. If we apply 
the config in the manner below, most of the issues will go away: 
   
   0. The default value of `remote.log.disable.policy` should be null.
   1. To gracefully disable the remote storage for a topic, the user should set 
`remote.log.disable.policy` to `retain`, at which point we stop the copy tasks. 
The expiration and follower tasks keep running in the background. The consumer 
can still read the archived data from remote storage, and the remote data gets 
deleted asynchronously. In this state, `remote.storage.enable` should still be 
set to true since we are reading the data from remote storage.
   2. Once all the remote data is deleted (i.e., after the retention 
time/size), the user can set `remote.storage.enable` to `false` to stop the 
respective RLM and RLMM resources for those topics. 
   3. If the user directly sets `remote.storage.enable` to `false`, then it 
should ungracefully delete all the remote segments, similar to setting 
`remote.log.disable.policy` to `delete`.
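
   To make the proposed states concrete, here is a minimal sketch of how the 
two configs could map to task behaviour. `RemoteLogTaskPlanner`, `Task`, and 
`Plan` are hypothetical names used only for illustration and do not exist in 
Kafka. The handling of `delete` while remote storage is still enabled is my 
assumption, since the steps above only describe `retain`.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: none of these types exist in Apache Kafka.
final class RemoteLogTaskPlanner {

    // The three per-partition tasks mentioned in the proposal.
    enum Task { COPY, EXPIRATION, FOLLOWER }

    // What the broker would do for a topic under a given config.
    static final class Plan {
        final Set<Task> runningTasks;
        final boolean remoteReadable;
        final boolean deleteAllRemoteSegments;

        Plan(Set<Task> runningTasks, boolean remoteReadable, boolean deleteAllRemoteSegments) {
            this.runningTasks = runningTasks;
            this.remoteReadable = remoteReadable;
            this.deleteAllRemoteSegments = deleteAllRemoteSegments;
        }
    }

    // remoteStorageEnable models `remote.storage.enable`;
    // disablePolicy models `remote.log.disable.policy` (null is the proposed default).
    static Plan plan(boolean remoteStorageEnable, String disablePolicy) {
        if (!remoteStorageEnable) {
            // Steps 2 and 3: stop all tasks and drop whatever remote data is left.
            // After a graceful disable there is nothing left, so this is a no-op.
            return new Plan(EnumSet.noneOf(Task.class), false, true);
        }
        if (disablePolicy == null) {
            // Step 0: normal tiered-storage operation.
            return new Plan(EnumSet.allOf(Task.class), true, false);
        }
        switch (disablePolicy) {
            case "retain":
                // Step 1: stop copying, keep expiring and serving remote data.
                return new Plan(EnumSet.of(Task.EXPIRATION, Task.FOLLOWER), true, false);
            case "delete":
                // Assumption: treated the same as the ungraceful disable in step 3.
                return new Plan(EnumSet.noneOf(Task.class), false, true);
            default:
                throw new IllegalArgumentException("Unknown disable policy: " + disablePolicy);
        }
    }
}
```

   With this shape, the decision is a single lookup on the two config values 
rather than scattered `if-else` checks.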

