ahshahid commented on PR #37824:
URL: https://github.com/apache/spark/pull/37824#issuecomment-1241281980

   Although this PR fixes the bug, the performance benchmark now takes 14 sec 
instead of 200 ms, which is not good.
   I also now appreciate much better why the precanonicalize phase is needed. 
An expression like
   a + b + c + d + e + f
is a nested tree of Add nodes, and proper canonicalization requires 
flattening it. Flattening makes the cost recursive, which precanonicalize 
avoids by making only one such deep call.
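
   To illustrate the cost, here is a toy model in Python. It is not Spark's 
Catalyst code: Leaf, Add, flatten_add and flatten_at_every_node are 
hypothetical names, and the assumption that the slow path re-flattens at 
every Add node is mine, made only to show how the visit count can grow.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, Add]


def flatten_add(expr: Expr, visits: List[int]) -> List[Expr]:
    """Collect the operands of a nested Add tree; visits[0] counts nodes touched."""
    visits[0] += 1
    if isinstance(expr, Add):
        return flatten_add(expr.left, visits) + flatten_add(expr.right, visits)
    return [expr]


def flatten_at_every_node(expr: Expr, visits: List[int]) -> None:
    """Assumed slow path: each Add node flattens its whole subtree again."""
    if isinstance(expr, Add):
        flatten_add(expr, visits)
        flatten_at_every_node(expr.left, visits)
        flatten_at_every_node(expr.right, visits)


# a + b + c + d + e + f as a left-deep tree: Add(Add(Add(a, b), c), ...)
tree = reduce(Add, [Leaf(n) for n in "abcdef"])

once = [0]
operands = flatten_add(tree, once)          # one deep call from the root

repeated = [0]
flatten_at_every_node(tree, repeated)       # one deep call per Add node
```

   In this model a single deep call touches each node once, while 
flattening again at every level touches the lower nodes repeatedly.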
   I will try the alternate approach of making hashCode symmetric for 
commutative expressions, which should fix the bug with minimal changes.
   The ugly part will be the special handling of Seq[Expression] for 
Least & Greatest.
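
   A minimal sketch of what a symmetric hash could look like, again as a toy 
model in Python and not the actual change to this PR. Attr and Commutative 
are hypothetical names. The variadic children stand in for the 
Seq[Expression] case of Least & Greatest, and equality is assumed to compare 
children as a multiset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str


class Commutative:
    """Toy commutative expression: an operator name plus any number of children."""

    def __init__(self, op, *children):
        self.op = op
        self.children = tuple(children)

    def __hash__(self):
        # Summing child hashes is order-independent, so swapping operands
        # cannot change the result.
        return hash((self.op, sum(hash(c) for c in self.children)))

    def __eq__(self, other):
        # Compare children as a multiset so equality agrees with the hash.
        return (
            isinstance(other, Commutative)
            and self.op == other.op
            and Counter(self.children) == Counter(other.children)
        )


a, b, c = Attr("a"), Attr("b"), Attr("c")
add_ab = Commutative("add", a, b)
add_ba = Commutative("add", b, a)
least_abc = Commutative("least", a, b, c)
least_cab = Commutative("least", c, a, b)
```

   Note that this sketch does no flattening: add(a, add(b, c)) and 
add(add(a, b), c) still compare as different, so it only addresses operand 
order at a single level.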
   Once I modify this PR, I will ask for your input.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

