stream2000 commented on PR #8062: URL: https://github.com/apache/hudi/pull/8062#issuecomment-1569696943
@nsivabalan @SteNicholas Hi, thanks for your review. I have updated the RFC according to your comments:

1. Simplified the TTL policy. We now support only `KEEP_BY_TIME`, so we no longer need to compare partition values between partitions.
2. Removed the persistent JSON stats mechanism. We can gather partition stats each time we run TTL management.
3. Added more detail about executing TTL management. We will support async table services for TTL management, and both the Spark and Flink engines can execute the async service.
4. Took the record-level TTL policy into account when designing the partition TTL policy.

Here is the answer to your remaining question:

> Why store the policy in hoodie.properties instead of using write config?

I think we need to store the policy somewhere in Hudi; otherwise users have to store the policy themselves. Consider a user who has 1000 partitions and wants to set TTL policies for 500 of them. It is better to store the policy metadata in Hudi.

Looking forward to another round of review!
