zhuqi-lucas commented on issue #22682: URL: https://github.com/apache/datafusion/issues/22682#issuecomment-4593653671
Cross-link: opened #22701 proposing a generic `FallbackGroupColumn` approach as a complementary direction.

The two issues address the same problem from different angles:

- **This issue (#22682)**: add type-specific specializations (`List` / `LargeList` / `FixedSizeList` / `Struct` / `Map`). Each gives a fast column-wise + short-circuit path for that exact type.
- **#22701**: add a generic fallback so any Arrow type goes through `GroupValuesColumn`, with the specializations from this issue (and the existing primitive / byte / boolean / decimal128 ones) layered on top as opt-in optimizations.

Together they solve both:

1. The structural lock-in where a single unsupported column drags otherwise-fast-path-eligible columns onto `GroupValuesRows` (addressed by #22701).
2. The per-type performance ceiling once you are already on `GroupValuesColumn` (addressed by the specializations here).

Happy to start with whichever direction maintainers prefer to land first. If #22701 lands, the specializations here become straight perf wins without affecting correctness fallback; if this issue lands first, #22701 becomes a follow-up that catches everything else.
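
To make the layering concrete, here is a rough sketch of how the two directions could compose. It is a toy model, not DataFusion code: `ColType`, `Builder`, `Strategy`, and `plan` are hypothetical names invented for illustration, and the real dispatch in `GroupValuesColumn` will look different.

```rust
// Hypothetical, simplified stand-ins; these are NOT DataFusion's real types or API.
#[derive(Debug, Clone, PartialEq)]
enum ColType {
    Int64,
    Utf8,
    List,
    Union, // stands in for any type that has no specialization
}

// Per-column choice once we are on the column-wise path.
#[derive(Debug, Clone, PartialEq)]
enum Builder {
    Specialized(ColType), // type-specific fast path (the #22682 direction)
    Fallback,             // generic catch-all (the #22701 direction)
}

// Overall grouping strategy for a set of group-by columns.
#[derive(Debug, Clone, PartialEq)]
enum Strategy {
    Rows,                 // row-based path for the whole schema
    Column(Vec<Builder>), // column-wise path, one builder per column
}

/// Pick a strategy for `schema`, given which types have a specialization
/// and whether a generic fallback builder is available.
fn plan(schema: &[ColType], specialized: &[ColType], has_fallback: bool) -> Strategy {
    let mut builders = Vec::with_capacity(schema.len());
    for t in schema {
        if specialized.contains(t) {
            builders.push(Builder::Specialized(t.clone()));
        } else if has_fallback {
            builders.push(Builder::Fallback);
        } else {
            // Without a fallback, one unsupported column moves every
            // column onto the row-based path.
            return Strategy::Rows;
        }
    }
    Strategy::Column(builders)
}
```

In this model, a schema of `[Int64, List]` with no fallback and no `List` specialization plans as `Strategy::Rows`; turning on the fallback keeps `Int64` on its specialized builder and routes `List` to `Builder::Fallback`; adding `List` to the specialized set afterwards only swaps that one builder, which is the "straight perf win" ordering described above.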
