Hey Jalpesh,

Thanks for taking the initiative to propose this feature!
This is an exciting direction and aligns well with modern trends in leveraging DataFrame APIs for better performance. It’s great to see how this can improve execution efficiency while also expanding Hudi's ecosystem compatibility.

Looking forward to reviewing the RFC and contributing to the discussion.

Regards,
Sagar

On Fri, Jan 24, 2025 at 6:52 PM Jalpesh Borad <jalpeshbo...@gmail.com> wrote:

> Hey Team,
> I have created a PR to claim an RFC for a DataFrame implementation of the
> HUDI write path.
>
> Background:
>
> Today, the HUDI write path is mostly implemented using RDD APIs for Apache
> Spark, and RDD APIs are not supported by accelerators like Spark-Rapids,
> Velox-Gluten, etc.
> Having a DataFrame implementation will also speed up overall execution
> compared to RDD computations.
>
> This RFC will help in designing new writer paths that rely on
> DataFrame APIs rather than RDDs.
>
> --
> Thanks and Regards,
> Jalpesh Borad
> +91 8140500542