viirya commented on issue #23731: [SPARK-26572][SQL] fix aggregate codegen 
result evaluation
URL: https://github.com/apache/spark/pull/23731#issuecomment-462683165
 
 
   > @cloud-fan @viirya I am not sure that fixing this in the join is a good 
idea. First of all, we have many kinds of joins, so we would likely need to 
change all of them, and there may be other operators besides joins that use 
loops. I don't think it is correct to delegate to the consumer the 
responsibility of computing variables if needed. Honestly, it seems more 
reasonable to me to fix it in the aggregate.
   
   In whole-stage codegen, we have an optimization that defers variable 
evaluation as late as possible. An operator can avoid evaluating its output 
variables and let its parent operator evaluate them only if they are 
actually used. Unless we want to remove this optimization, I don't think we 
should force the evaluation in the aggregate.
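   
   To illustrate the idea of deferred evaluation, here is a minimal toy sketch 
in Python. It is not Spark code: `Var`, `evaluate`, `child_produce`, and 
`parent_consume` are hypothetical names made up for this example, and the 
strings stand in for generated Java. The child only describes how to compute 
its outputs, and the parent emits code just for the variables it reads.

```python
# Toy model of deferred variable evaluation in generated code.
# None of these names are Spark APIs; they are hypothetical stand-ins.
from dataclasses import dataclass


@dataclass
class Var:
    """An output variable: `code` computes it, `name` refers to the result."""
    name: str
    code: str  # an empty string means "already evaluated"


def evaluate(variables):
    """Return the pending code for `variables` and mark them as evaluated,
    so the same computation is never emitted twice."""
    pending = [v.code for v in variables if v.code]
    for v in variables:
        v.code = ""
    return "\n".join(pending)


def child_produce():
    """The child describes its outputs but emits no code for them yet."""
    return [
        Var("a", "int a = row.getInt(0);"),
        Var("b", "int b = expensive(row.getInt(1));"),
    ]


def parent_consume(variables, used):
    """The parent evaluates only the variables it actually reads."""
    needed = [v for v in variables if v.name in used]
    return evaluate(needed)


outputs = child_produce()
generated = parent_consume(outputs, used={"a"})
print(generated)
```

   In this sketch the code for `b` is never emitted because the parent does 
not use it. Where the parent places the returned code, for example inside or 
outside a loop, determines how often that computation runs.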
   
   @rednaxelafx's fix looks fine to me. Actually, I'm wondering why such a 
non-deterministic expression is pushed down to the aggregate in the first 
place...
