LiebingYu commented on PR #1940: URL: https://github.com/apache/fluss/pull/1940#issuecomment-3495520551
There is another problem with remote log TTL expiration when the data lake is enabled. Consider this sequence:

1. The user creates a Fluss table with the data lake enabled.
2. The user disables the data lake.

The problem is in step 2: we don't notify the Replica when the data lake is disabled, so the remote log for this table will never expire unless the user enables the data lake again.

I suggest:

1. Temporarily prohibit users from disabling the data lake until item 2 is resolved.
2. Discuss how data TTL should be handled when the user disables the data lake. One option is to do nothing. In that case, if the user never re-enables the data lake, all historical data will remain in Fluss, which would obviously put significant pressure on Fluss's storage. Another option is to notify the replica when the data lake is disabled, so that from then on the data's TTL no longer considers the progress of tiering to the lake; the user would need to accept the risk of losing lake data if they have set an aggressive TTL.

I prefer the latter approach. What do you think?
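
To make the second option concrete, here is a minimal sketch of the expiration check I have in mind. It is not the actual Fluss code: `RemoteLogTtlSketch`, `canExpire`, and every parameter name are hypothetical, and it assumes that with the data lake enabled a remote log segment may only expire once the TTL has elapsed and the segment has been fully tiered to the lake.

```java
import java.util.OptionalLong;

/** Hypothetical illustration only; none of these names exist in Fluss. */
final class RemoteLogTtlSketch {

    private RemoteLogTtlSketch() {}

    /**
     * Decides whether a remote log segment may be expired.
     *
     * @param segmentEndOffset    last offset contained in the segment
     * @param segmentMaxTimestamp largest record timestamp in the segment (ms)
     * @param nowMs               current time (ms)
     * @param ttlMs               configured TTL (ms)
     * @param lakeTieredOffset    highest offset already tiered to the lake as
     *                            known by the replica; empty once the replica
     *                            has been told the data lake is disabled
     */
    static boolean canExpire(
            long segmentEndOffset,
            long segmentMaxTimestamp,
            long nowMs,
            long ttlMs,
            OptionalLong lakeTieredOffset) {
        // The TTL must have elapsed in every case.
        if (nowMs - segmentMaxTimestamp <= ttlMs) {
            return false;
        }
        // Data lake disabled and replica notified: TTL alone decides.
        if (!lakeTieredOffset.isPresent()) {
            return true;
        }
        // Data lake enabled (or replica never notified of the disable):
        // wait until the whole segment has been tiered.
        return segmentEndOffset <= lakeTieredOffset.getAsLong();
    }
}
```

Under these assumptions, a replica that is never notified keeps passing a stale `lakeTieredOffset`, so segments beyond that offset never qualify for expiration. Once the replica is notified and passes an empty value, the same segments expire as soon as the TTL elapses, which is where the risk with an aggressive TTL comes from.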
