High-level requirements:

1. Write larger files while keeping ingestion and query latencies low.
2. Improve data layout. For example, when rewriting smaller files into larger
ones, piggyback on the I/O to move records around and group them by some
pattern, for better query performance, compression, etc.

I created an issue for this:
https://issues.apache.org/jira/browse/HUDI-112

Let's discuss there and then we can follow it up with a HIP.

Thanks,
Nishith
