danny0405 commented on issue #8071: URL: https://github.com/apache/hudi/issues/8071#issuecomment-1592228207
> partitioning strategies

By partitioning strategies, do you mean the data buckets (FileGroups, in Hudi's terms) or just regular directories as in Hive? If the former, here is the doc about the Hudi file layout: https://hudi.apache.org/docs/file_layouts.

For a COW table, each update to an existing file group triggers a rewrite of the whole Parquet file, so the write amplification is somewhat large. For a MOR table, Hudi just appends the new records first, and data compaction is deferred to an asynchronous task.
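To make the write-amplification difference concrete, here is a rough toy model in Python. It is not Hudi code: the `FileGroup` class, the function names, and the sizes (a 100 MB base file receiving ten 1 MB update batches) are all hypothetical and chosen only for illustration. It assumes updates overwrite existing rows, so the base file size stays constant.

```python
from dataclasses import dataclass


@dataclass
class FileGroup:
    """Toy stand-in for a file group (hypothetical, not a Hudi class)."""
    base_mb: int      # size of the current base (Parquet) file
    log_mb: int = 0   # bytes accumulated in log files (MOR only)


def cow_update(fg: FileGroup, update_mb: int) -> int:
    """Copy-on-write: every update rewrites the whole base file.

    Returns the number of MB physically written.
    """
    return fg.base_mb


def mor_update(fg: FileGroup, update_mb: int) -> int:
    """Merge-on-read: the update is only appended to the log."""
    fg.log_mb += update_mb
    return update_mb


def mor_compact(fg: FileGroup) -> int:
    """Deferred compaction: merge the log into a new base file once."""
    fg.log_mb = 0
    return fg.base_mb


def amplification(physical_mb: int, logical_mb: int) -> float:
    """Physical MB written per logical MB of updates."""
    return physical_mb / logical_mb


if __name__ == "__main__":
    updates = [1] * 10  # ten update batches of 1 MB each (assumed sizes)

    cow = FileGroup(base_mb=100)
    cow_written = sum(cow_update(cow, u) for u in updates)

    mor = FileGroup(base_mb=100)
    mor_written = sum(mor_update(mor, u) for u in updates) + mor_compact(mor)

    print("COW amplification:", amplification(cow_written, sum(updates)))
    print("MOR amplification:", amplification(mor_written, sum(updates)))
```

In this model the COW table rewrites the base file on every batch, while the MOR table pays for one rewrite at compaction time. The trade-off, not modeled here, is that MOR readers have to merge log records with the base file until compaction runs.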
