viirya commented on issue #23731: [SPARK-26572][SQL] fix aggregate codegen result evaluation
URL: https://github.com/apache/spark/pull/23731#issuecomment-462683165

> @cloud-fan @viirya I am not sure that fixing this in the join is a good idea. First of all, we have many kinds of joins, so we would likely need to change all of them, and there may be other operators besides joins that use loops. I don't think it is correct to delegate to the consumer the responsibility of computing variables if needed. Honestly, it seems more reasonable to me to fix it in the aggregate.

In whole-stage codegen, we have an optimization that defers variable evaluation as late as possible. An operator can skip evaluating its output variables and let its parent operator evaluate them only if they are actually used. Unless we want to remove this optimization, I think we shouldn't force the evaluation in the aggregate.

@rednaxelafx's fix looks fine to me. Actually, I'm wondering why such a non-deterministic expression is pushed down to the aggregate in the first place...
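
To make the deferred-evaluation trade-off concrete, here is a minimal toy model in Python. It is only a sketch, not Spark's codegen API: `DeferredVar`, `child_output`, and the three `parent_*` functions are hypothetical names invented for illustration. It assumes a child operator that hands its parent an unevaluated, non-deterministic output variable, and a parent that loops over matched rows the way a join might.

```python
import itertools


class DeferredVar:
    """Toy stand-in for an output variable whose evaluation is deferred."""

    def __init__(self, compute):
        self._compute = compute
        self.eval_count = 0  # how many times the expression actually ran

    def evaluate(self):
        self.eval_count += 1
        return self._compute()


def child_output():
    # Hypothetical child operator: returns its output without evaluating it.
    # The counter mimics a non-deterministic expression, since every
    # evaluation yields a different value.
    counter = itertools.count()
    return DeferredVar(lambda: next(counter))


def parent_unused(var, matches):
    # The parent never reads the variable, so deferral saves the work entirely.
    return list(matches)


def parent_eval_in_loop(var, matches):
    # The parent evaluates inside its loop: the expression runs once per match.
    return [(m, var.evaluate()) for m in matches]


def parent_eval_before_loop(var, matches):
    # The parent evaluates once before the loop and reuses the value.
    value = var.evaluate()
    return [(m, value) for m in matches]


matches = ["a", "b", "c"]

unused = child_output()
parent_unused(unused, matches)

in_loop = child_output()
rows_in_loop = parent_eval_in_loop(in_loop, matches)

before_loop = child_output()
rows_before_loop = parent_eval_before_loop(before_loop, matches)

print(unused.eval_count, in_loop.eval_count, before_loop.eval_count)
```

In this toy model the unused variable is never computed, which is the benefit of deferring. Evaluating inside the loop runs the non-deterministic expression once per matched row and yields a different value each time, while evaluating once before the loop keeps the value stable without forcing the child to evaluate eagerly.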
