prashantwason opened a new issue, #18180:
URL: https://github.com/apache/hudi/issues/18180

   ## Description
   
   Currently, when a writer's config does not enable an existing MDT 
(Metadata Table) partition, that partition is automatically deleted. This is 
risky for production datasets with multiple writers, where an accidental config 
change in a single writer can cause large MDT indexes to be deleted.
   
   ## Motivation
   
   In multi-writer production environments, a misconfigured writer could 
accidentally delete large metadata indexes that took significant time to build. 
This leads to:
   - Unexpected loss of metadata indexes
   - Potential performance degradation until the indexes are rebuilt
   - Difficult debugging when metadata is unexpectedly missing
   
   ## Proposed Solution
   
   Add a new configuration `hoodie.metadata.auto.delete.partitions` (default: 
`true` for backward compatibility) that controls whether metadata table 
partitions can be automatically deleted when the corresponding config flag is 
disabled.
   
   When set to `false`:
   - `maybeDeleteMetadataTable()` will not delete the MDT when 
`hoodie.metadata.enable=false`
   - `deleteMetadataIndexIfNecessary()` will not delete MDT partitions when 
their index configs are disabled
   - Users must explicitly drop metadata partitions using hudi-cli or the 
`DROP INDEX` command
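   
   The intended behavior can be sketched roughly as follows. This is an illustration only, not Hudi's actual implementation: the class `MdtAutoDeleteGuard` and its methods are hypothetical names, and the only identifier taken from this proposal is the config key. The sketch assumes an unset key falls back to `true`, matching the proposed default.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Hudi codebase.
final class MdtAutoDeleteGuard {
    // Config key proposed in this issue.
    static final String AUTO_DELETE_KEY = "hoodie.metadata.auto.delete.partitions";

    // Reads the proposed flag; an unset key falls back to "true" (proposed default).
    static boolean autoDeleteEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(AUTO_DELETE_KEY, "true"));
    }

    // A partition is deleted only when its index flag is disabled in the
    // writer config AND automatic deletion is allowed.
    static boolean shouldDeletePartition(Properties props, boolean indexEnabledInConfig) {
        return !indexEnabledInConfig && autoDeleteEnabled(props);
    }

    public static void main(String[] args) {
        Properties safeWriter = new Properties();
        safeWriter.setProperty(AUTO_DELETE_KEY, "false");

        // Index flag disabled, auto-delete off: the partition is kept.
        System.out.println(shouldDeletePartition(safeWriter, false));       // false
        // Index flag disabled, default config: the partition is deleted (current behavior).
        System.out.println(shouldDeletePartition(new Properties(), false)); // true
    }
}
```

   With this shape, a misconfigured writer that drops an index flag leaves the existing partition in place whenever the guard is set to `false`.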
   
   ## Impact
   
   - New config: `hoodie.metadata.auto.delete.partitions` (default: `true`)
   - No breaking changes for existing users (default behavior unchanged)
   - Production users can opt in to the safer behavior by setting it to `false`
   
   ## Related
   
   JIRA: HUDI-4966

