Kimahriman commented on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-856770297
Tested this out yesterday and ran into issues occasionally with ``` java.lang.IllegalArgumentException: Comparison method violates its general contract! ``` I guess even though we don't care about the order of non-overlapping subexpressions, the comparator still needs to satisfy certain properties for the sort to work: https://docs.oracle.com/javase/8/docs/api/java/util/Comparator.html#compare-T-T- Would it just make sense to sort based on number of nodes in the tree? I would think a subexpression could only contain another subexpression if it has more expressions in the tree, not sure if there are any weird cases that's not true. Alternatively would only returning 1 or -1 for `x.find(_.semanticEquals(y)).isDefined` and `y.find(_.semanticEquals(x)).isDefined` and 0 otherwise fix it? Not sure if that's still properly transitive and other necessary comparator properties -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
