I have been playing with the StarRocks MOR Hudi reader recently and it does an amazing job. It has two read paths:
1. For partitions with log files, use the merging logic.
2. For partitions with only parquet files, use the COW read logic.

As you know, the first path is slow because it carries the merging overhead and cannot use any of the parquet benefits (pushdown, bloom filters, ...). In contrast, the second path is blazing fast.

MOR comes with tons of compaction rules, and this behavior makes hot/cold partition management possible. One particular case is GDPR, where old records are usually deleted or masked following a random distribution, while new partitions are free of changes.

So far Spark does not distinguish between partitions with log files and log-free partitions, and I suspect adding such an improvement would make MOR tables more performant. I would be glad to work on this feature, so please give early feedback if there is any blocker.
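For illustration, the per-partition dispatch I have in mind could look roughly like the sketch below. It is plain Python rather than Hudi or Spark code, and `FileSlice`, `ReadPath`, `choose_read_path` and `plan_reads` are hypothetical names of mine, not existing APIs. It only shows the decision rule: a partition takes the merge path as soon as one of its file slices has log files, otherwise it takes the base-file-only path.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ReadPath(Enum):
    MERGE = "merge"          # base file + log files, merged at read time
    BASE_ONLY = "base_only"  # plain parquet read, same as a COW table


@dataclass
class FileSlice:
    """Hypothetical stand-in for a file slice: one base file plus its log files."""
    base_file: str
    log_files: List[str] = field(default_factory=list)


def choose_read_path(slices: List[FileSlice]) -> ReadPath:
    """Use the merge path only if at least one slice in the partition has log files."""
    if any(s.log_files for s in slices):
        return ReadPath.MERGE
    return ReadPath.BASE_ONLY


def plan_reads(partitions: Dict[str, List[FileSlice]]) -> Dict[str, ReadPath]:
    """Map each partition path to the read path it should use."""
    return {path: choose_read_path(slices) for path, slices in partitions.items()}


# Made-up example: an old partition touched by deletes, and a fresh compacted one.
partitions = {
    "dt=2020-01-01": [
        FileSlice("f1.parquet", ["f1.log.1"]),
        FileSlice("f2.parquet"),
    ],
    "dt=2024-06-01": [
        FileSlice("f3.parquet"),
        FileSlice("f4.parquet"),
    ],
}

plan = plan_reads(partitions)
for path, read_path in plan.items():
    print(path, read_path.value)
```

Whether the decision is better made per partition (as above) or per file slice is an open design question. Per slice would be finer grained, but I kept it per partition to match the behavior I observed.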