Hi all,
The existing SnapshotCopier under Hudi Utilities performs a Hudi-to-Hudi copy and is intended primarily for backup purposes.
I would like to start an RFC for a more generic Hudi snapshotter, which would:
- support the existing SnapshotCopier features
- add an option to export a Hudi dataset to plain Parquet files
  - output the latest records via the Spark DataFrame writer
  - remove the Hudi metadata fields
  - support custom repartitioning requirements
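As a rough illustration only, and not a proposed implementation, the export option could look something like the PySpark sketch below. The names `non_metadata_columns` and `export_to_parquet` and the `num_partitions` parameter are hypothetical, and the sketch assumes a Hudi version whose Spark datasource accepts the short format name `hudi` and a plain base path.

```python
# Hudi's per-record metadata columns, which a plain Parquet export would drop.
HOODIE_META_COLUMNS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)


def non_metadata_columns(columns):
    """Return the columns that are not Hudi metadata fields, preserving order."""
    return [c for c in columns if c not in HOODIE_META_COLUMNS]


def export_to_parquet(spark, source_path, target_path, num_partitions=None):
    """Hypothetical export of a Hudi dataset's latest snapshot to plain Parquet.

    `spark` is an existing SparkSession with the Hudi bundle on its classpath.
    """
    # Assumption: the default (snapshot) query returns the latest records.
    df = spark.read.format("hudi").load(source_path)

    # Remove the Hudi metadata fields.
    df = df.select(*non_metadata_columns(df.columns))

    # Optional custom repartitioning; here just a target partition count.
    if num_partitions is not None:
        df = df.repartition(num_partitions)

    # Write with the regular Spark DataFrame writer (fails if the target exists).
    df.write.parquet(target_path)
```

A call such as `export_to_parquet(spark, "/data/hudi/trips", "/data/export/trips", num_partitions=16)` (both paths made up) would then produce Parquet files readable without any Hudi dependency.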
Would this be a good idea for an RFC?
Thank you.
Regards,
Raymond Xu
