Geethapranay1 commented on issue #4131: URL: https://github.com/apache/datafusion-comet/issues/4131#issuecomment-4417990035
I don't think this is only a Spark vs. DataFusion hashing issue. For unordered FIRST/LAST, the intermediate state is just [value, is_set]. Once the partial hash aggregate emits that state, the original row order is lost. After that, PartialMerge depends on hash-table emission order, so changing the hasher alone would not make it correct.

I added a native-side guard that rejects FIRST/LAST in PartialMerge before MergeAsPartial, plus a unit test for that planner path. This keeps the existing fallback and makes the protection explicit on the native side as well. Can I open a PR for this?

A full native fix would need order-preserving merge semantics or richer intermediate state.
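To illustrate the order dependence described above, here is a minimal sketch in Python. It is a toy model, not Comet or DataFusion code: the names `merge_first` and `merge_all` are hypothetical, and the state is assumed to be a plain `(value, is_set)` pair with no ordering information attached.

```python
from typing import List, Optional, Tuple

# Hypothetical model of the FIRST intermediate state: (value, is_set).
State = Tuple[Optional[int], bool]


def merge_first(acc: State, incoming: State) -> State:
    """Merge two FIRST states: once the accumulator is set, it wins."""
    return acc if acc[1] else incoming


def merge_all(states: List[State]) -> State:
    """Fold partial states in the order they arrive."""
    acc: State = (None, False)
    for state in states:
        acc = merge_first(acc, state)
    return acc


# Two partial states for the same group key, e.g. from two partitions.
partial_a: State = (10, True)
partial_b: State = (20, True)

# The same inputs in a different arrival order give a different answer,
# because the state carries nothing that says which row came first.
result_ab = merge_all([partial_a, partial_b])
result_ba = merge_all([partial_b, partial_a])
```

In this model the result is decided entirely by arrival order, which is why a different hasher changes which value wins but not whether the merge is well defined. A richer state (for example, one that also carries a row position) would be one way to make the merge order-independent, though that is an assumption about a possible design rather than anything implemented here.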
