zyclove commented on issue #10131: URL: https://github.com/apache/hudi/issues/10131#issuecomment-1818731721
@ad1happy2go Can bulk mode avoid generating small files? Could it write 128 MB output files directly, or have the small files merged afterwards? If `hoodie.clustering.inline` is turned on, will the small files be merged automatically after the bulk write completes, or must I start a follow-up job to do the merge?

```
hoodie.clustering.inline=true

spark-submit \
  --master yarn \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  hdfs://nameservice1/utility_jars/hudi-utilities-bundle_2.12-0.10.0.jar
```

-----------------------

If I do not use bulk mode, can this stage (Building workload profile:smart_datapoint_report_rw_clear_rt) be optimized in Hudi 1.0? This stage is simply too time-consuming.
