GitHub user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/22758

@cloud-fan @HyukjinKwon @srowen Based on my observations above, I have two doubts:

a) If we expect the stats to be estimated from the size of the data files, why does the insert flow contain a statement that updates the Hive stats?
b) If we already have a mechanism to read the stats from Hive, why should we estimate the data size from the files?

Please let me know your suggestions. I feel there is an inconsistency in this flow.