kbuci opened a new issue, #17908: URL: https://github.com/apache/hudi/issues/17908
### Task Description

**What needs to be done:**

- Implement https://issues.apache.org/jira/browse/HUDI-9407 so that users can specify `OPTIMISTIC_CONCURRENCY_CONTROL` when creating the metadata table writer (`org.apache.hudi.metadata.HoodieMetadataWriteUtils#createMetadataWriteConfig`).
- Add a new config that lists the table services that support inline scheduling but async execution in the metadata table. Initially we can choose to support just compaction and logcompaction. If enabled:
  - The metadata table writer should still schedule all table service plans inline, but it should not execute the specified types of table service plans inline. Note that this means it should also not retry these plans in `org.apache.hudi.metadata.HoodieBackedTableMetadataWriter#runPendingTableServicesOperationsAndRefreshTimeline`.
  - Another concurrent writer can initialize a metadata table writer (passing in OCC as the concurrency type in the metadata write config) and execute these plans.

**Why this task is needed:**

For datasets with large metadata table partitions (like RECORD_INDEX), we cannot have metadata table compaction execute during the write, as that would impact runtimes. Instead, we only schedule the plan inline and have a separate platform execute these plans.

We need the above features to guarantee that specific table services on the metadata table do not execute inline during the write, and that outside concurrent writers can safely execute these plans instead. This "async" execution will only take the table lock when transitioning the instants, in order to not block the writer job.

We can upstream our implementations for the above once we reach consensus.

### Task Type

Code improvement/refactoring

### Related Issues

**Parent feature issue:** (if applicable)

**Related issues:**

NOTE: Use `Relationships` button to add parent/blocking issues after issue is created.

-- 
This is an automated message from the Apache Git Service.
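To make the proposed behavior concrete, here is a minimal sketch of how a writer might decide between inline and async execution from a config listing table service types. Everything in it is an assumption for illustration only: the `ServiceType` enum, the config key, and the helper names are hypothetical and are not Hudi APIs or the implementation proposed in the issue.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Hudi.
final class MetadataAsyncServiceSketch {

  // Stand-in for the table service types a metadata table writer handles.
  enum ServiceType { COMPACT, LOG_COMPACT, CLEAN }

  // Made-up config key listing services that are scheduled inline but executed async.
  static final String ASYNC_EXECUTION_TYPES_KEY = "example.metadata.async.execution.table.services";

  // Parses a comma-separated config value such as "compact,log_compact".
  static Set<ServiceType> parseAsyncTypes(String value) {
    Set<ServiceType> result = EnumSet.noneOf(ServiceType.class);
    if (value == null || value.trim().isEmpty()) {
      return result; // nothing listed: every service keeps executing inline
    }
    for (String token : value.split(",")) {
      String name = token.trim();
      if (!name.isEmpty()) {
        result.add(ServiceType.valueOf(name.toUpperCase(Locale.ROOT)));
      }
    }
    return result;
  }

  // Plans are always scheduled inline, whatever the config lists.
  static boolean shouldScheduleInline(ServiceType type) {
    return true;
  }

  // Listed services are neither executed nor retried inline by the writer;
  // a separate concurrent writer is expected to run those plans.
  static boolean shouldExecuteInline(ServiceType type, Set<ServiceType> asyncTypes) {
    return !asyncTypes.contains(type);
  }

  public static void main(String[] args) {
    Set<ServiceType> asyncTypes = parseAsyncTypes("compact, log_compact");
    for (ServiceType type : ServiceType.values()) {
      System.out.println(type
          + ": scheduleInline=" + shouldScheduleInline(type)
          + ", executeInline=" + shouldExecuteInline(type, asyncTypes));
    }
  }
}
```

In this sketch the same check would guard both the initial inline execution and any retry of pending plans, which mirrors the requirement that listed services are not retried inline either.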
