Re: Discuss fast copy on write rfc-68

2023-07-21 Thread Nicolas Paris
I definitely can't see a benefit in using 30MB row groups over just creating 30MB Parquet files. I would add that stats indexes are at the file level, which favors setting the row group size equal to the file size. The only context where it would help is when clustering is set up and targets 1GB files, w/ 128MB

I would like to be added as a Contributor

2023-07-21 Thread Vijayasarathi B
Hi, I'm writing to express my interest in contributing to the Apache Hudi project. I have been following the project's progress and find it both exciting and challenging. I believe my skills and experience align well with the project's objectives, and I am eager to contribute to its development