GitHub user mswit-databricks commented on the issue:

    https://github.com/apache/spark/pull/21070

    @rdblue Do you see any risk of additional overhead from the extra stats? For example, if the data contains very long strings, comparing them to generate stats will be expensive and might not be worth it in some scenarios.