stream2000 commented on PR #8062:
URL: https://github.com/apache/hudi/pull/8062#issuecomment-1569696943

   @nsivabalan @SteNicholas Hi, thanks for your review. I have updated the RFC 
according to your comments. 
   1. Simplified the TTL policy: we now support only `KEEP_BY_TIME`, so we 
no longer need to compare partition values across partitions. 
   2. Removed the persistent JSON stats mechanism. We can gather partition 
stats each time TTL management runs. 
   3. Added more detail about executing TTL management. We will support async 
table services for TTL management, and both the Spark and Flink engines can 
execute the async service. 
   4. Considered a record-level TTL policy when designing the partition TTL policy.
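
To make points 1 and 2 concrete, here is a minimal sketch of how a `KEEP_BY_TIME` check could work when partition stats are gathered on every run. It is only an illustration under my own assumptions, not the RFC's implementation: `PartitionStat`, `last_modified`, and `expired_partitions` are hypothetical names, and the sketch assumes expiry is decided by comparing each partition's last modified time against a TTL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class PartitionStat:
    """Hypothetical per-partition stat, gathered fresh on each TTL run."""
    path: str
    last_modified: datetime


def expired_partitions(
    stats: Iterable[PartitionStat], ttl: timedelta, now: datetime
) -> List[str]:
    """Return paths of partitions whose last modification is older than the TTL.

    Each partition is judged only by its own timestamp, so no comparison of
    partition values between partitions is needed.
    """
    cutoff = now - ttl
    return [s.path for s in stats if s.last_modified < cutoff]
```

For example, with a 30-day TTL evaluated on 2023-06-01, a partition last modified on 2023-01-02 would be selected for deletion, while one modified on 2023-05-31 would be kept.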
   
   And here is the answer to your remaining question. 
   > Why store the policy in hoodie.properties instead of using write config? 
   I think we need to store the policy somewhere in Hudi; otherwise users 
have to store the policy themselves. Consider a user who has 1000 partitions 
and wants to set TTL policies for 500 of them. It's better to store the policy 
metadata in Hudi. 
   
   Looking forward to another round of review! 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
