peter-toth commented on PR #39722: URL: https://github.com/apache/spark/pull/39722#issuecomment-1423802581
> > I would keep the canonicalization of a single commutative expression like: `Add(x, y)` as it is: `Add(x, y)`, but when we have at least 2 nested: `Add(Add(x, y), z)` then as `MultiCommutativeOp(x, y, z, Add)` > > If I understand it correctly, setting the threshold to `3` should achieve this requirement, right? You changed the config semantics from the number of commutative expressions to the number of operands in commutative expression tree (https://github.com/apache/spark/pull/39722/commits/e196e7d60d664af2502faab212477d24c2398d49), so now 3 is the value I too prefer. This is because the memory footprint of `Add(x, y)` seems smaller than `MultiCommutativeOp(x, y, Add)`, but this changes when have at least 2 `Add`s. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
