Kimahriman commented on pull request #32586:
URL: https://github.com/apache/spark/pull/32586#issuecomment-856770297


   Tested this out yesterday and ran into issues occasionally with
   ```
   java.lang.IllegalArgumentException: Comparison method violates its general 
contract!
   ```
   
   I guess even though we don't care about the order of non-overlapping 
subexpressions, the comparator still needs to satisfy certain properties for 
the sort to work: 
https://docs.oracle.com/javase/8/docs/api/java/util/Comparator.html#compare-T-T-
   
   Would it just make sense to sort based on number of nodes in the tree? I 
would think a subexpression could only contain another subexpression if it has 
more expressions in the tree, not sure if there are any weird cases that's not 
true. Alternatively would only returning 1 or -1 for 
`x.find(_.semanticEquals(y)).isDefined` and 
`y.find(_.semanticEquals(x)).isDefined` and 0 otherwise fix it? Not sure if 
that's still properly transitive and other necessary comparator properties


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to