nsivabalan commented on issue #3975: URL: https://github.com/apache/hudi/issues/3975#issuecomment-997294422
@dmenin: If you are open to issuing two separate operations (a delete followed by an update), I have a suggestion. How are your updates and deletes spread in general? Are they spread randomly across all partitions and file groups, or do they have an affinity towards a few partitions? You could try using the BLOOM index for the deletes and see how that goes. If the deletes are not randomly spread out, this will help reduce index lookup. Also, you can disable small file handling for your delete operation by setting hoodie.parquet.small.file.limit = 0 (https://hudi.apache.org/docs/configurations/#hoodieparquetsmallfilelimit). Let us know how your MOR exploration is going as well.
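To make the suggestion concrete, here is a minimal sketch of how the write options for the two operations could differ. The helper names (`base_options`, `delete_options`, `upsert_options`) and the table and field names in the usage comment are hypothetical; only the `hoodie.*` keys are Hudi configs, and the values are the ones discussed above, not tuned recommendations.

```python
def base_options(table_name, record_key, partition_field, precombine_field):
    """Options shared by both writes (hypothetical helper, not part of Hudi)."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def delete_options(common):
    """Delete pass: BLOOM index and small file handling disabled."""
    opts = dict(common)  # copy so the shared options are not mutated
    opts.update({
        "hoodie.datasource.write.operation": "delete",
        "hoodie.index.type": "BLOOM",
        # A limit of 0 turns off small file handling for this write.
        "hoodie.parquet.small.file.limit": "0",
    })
    return opts


def upsert_options(common):
    """Update pass: index and file sizing left at the table's usual settings."""
    opts = dict(common)
    opts["hoodie.datasource.write.operation"] = "upsert"
    return opts


# Usage with Spark (assumes DataFrames `deletes_df` and `updates_df` and a
# `base_path` already exist; all names here are placeholders):
#   common = base_options("my_table", "id", "dt", "ts")
#   deletes_df.write.format("hudi").options(**delete_options(common)).mode("append").save(base_path)
#   updates_df.write.format("hudi").options(**upsert_options(common)).mode("append").save(base_path)
```

Keeping the overrides confined to the delete pass means the update pass still benefits from small file handling.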
