ganczarek commented on issue #4656: URL: https://github.com/apache/hudi/issues/4656#issuecomment-1152676222
@parisni No, I wasn't able to mitigate this. We worked around it by doing the following:

1) Disabling table metadata for all Hudi tables (`hoodie.metadata.enable=false`). This noticeably speeds up reads and writes, and performance no longer seems to degrade over time as new partitions are added.
2) Redesigning our ETL so that we don't need a Hudi table with 300k partitions. The resulting table is smaller, partitioned differently, has a reasonable number of partitions, and uses plain Parquet files (no Hudi).

I keep tracking new Hudi releases. Eventually, I plan to re-enable Hudi metadata.
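For anyone trying workaround 1, here is a minimal sketch of how the setting could be passed from PySpark. Only `hoodie.metadata.enable=false` comes from the comment above. The helper name `hudi_write_options`, the table and field names, and the commented-out storage path are hypothetical placeholders, and the other keys are the usual Hudi datasource write options, so adjust them to your own table.

```python
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build Hudi write options with the metadata table disabled.

    Hypothetical helper for illustration; all argument values are placeholders.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
        # The workaround from the comment: turn the metadata table off.
        "hoodie.metadata.enable": "false",
    }


# Placeholder table and column names, not taken from the issue.
opts = hudi_write_options("events", "event_id", "event_date", "updated_at")

# With an existing Spark DataFrame `df`, the options would be applied roughly like:
# df.write.format("hudi").options(**opts).mode("append").save("s3://my-bucket/hudi/events")
```

The same key can also be passed as a read option (`spark.read.format("hudi").option("hoodie.metadata.enable", "false")`) so that readers list files directly instead of going through the metadata table.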
