VitoMakarevich commented on issue #10878: URL: https://github.com/apache/hudi/issues/10878#issuecomment-2010098668
I also checked on some test tables that manually changing the property in `hoodie.properties` works, but you must then ensure that clustering touches all files, which may be problematic if you have a lot of partitions. Clustering creates Npartitions * (clustering groups per partition) groups, so in our case it would submit ~5k Spark jobs, which will almost certainly (99%) blow up the driver. There is not yet a way to limit this number if you want to run clustering on all partitions (I see that newer Hudi versions have an executor service that handles this).
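For illustration, the job-count arithmetic from the comment can be sketched in Python. This is only a rough estimate, assuming each clustering group is submitted as one Spark job and that every partition has the same number of groups. The function name and the input numbers are hypothetical, chosen only to land near the ~5k figure. They are not taken from the actual table.

```python
def estimate_clustering_jobs(num_partitions: int, groups_per_partition: int) -> int:
    """Rough count of Spark jobs, assuming one job per clustering group."""
    if num_partitions < 0 or groups_per_partition < 0:
        raise ValueError("counts must be non-negative")
    return num_partitions * groups_per_partition


# Hypothetical inputs: 1,000 partitions with 5 clustering groups each.
print(estimate_clustering_jobs(1000, 5))
```

Because the estimate grows linearly with the partition count, running clustering over all partitions at once scales the number of submitted jobs directly with the size of the table.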
