I definitely can't see a benefit to using 30MB row groups over just creating
30MB Parquet files.
I would add that stats indexes are at the file level, which favors setting the
row group size equal to the file size.
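To make the file-level point concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not Hudi or Parquet API code: the function name skip_units, the 3000MB dataset size, and the 300MB alternative file size are all made up for illustration. It only counts how many files (the unit a file-level stats index can skip) and how many row groups per file a given layout produces.

```python
import math

MB = 1024 * 1024


def skip_units(total_bytes, file_bytes, row_group_bytes):
    """Return (number_of_files, row_groups_per_file) for a given layout.

    Hypothetical helper for illustration only. With stats kept at the
    file level, number_of_files is the number of units a query can skip.
    """
    files = math.ceil(total_bytes / file_bytes)
    row_groups_per_file = math.ceil(file_bytes / row_group_bytes)
    return files, row_groups_per_file


# Assumed dataset size, chosen only to keep the numbers round.
dataset = 3000 * MB

# 30MB files, each holding a single 30MB row group.
small = skip_units(dataset, 30 * MB, 30 * MB)

# Assumed alternative: 300MB files split into 30MB row groups.
large = skip_units(dataset, 300 * MB, 30 * MB)

print("30MB files: ", small)
print("300MB files:", large)
```

Under these assumed sizes the first layout gives more files with one row group each, so file-level stats already see every row group individually. The second layout has the same total number of row groups, but file-level stats can only skip them ten at a time.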
The only context where it would help is when clustering is set up and targets 1GB
files, w/ 128MB
Hi,
I'm writing to express my interest in contributing to the Apache Hudi
project. I have been following the project's progress and find it both
exciting and challenging. I believe my skills and experience align well
with the project's objectives, and I am eager to contribute to its
development