Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16098/10/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16098/10/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1179
PS10, Line 1179:       // If all partitions have good stats, return the total row count, contributed
               :       // by all of them, as the row count for the table.
> So to summarize, the goal here (or at least original intention of this JIRA)
> is that when you have a partitioned table, and the total row count of the
> table is calculated to be 0, but some of the partitions are "corrupt", then
> treat the stats as "missing", in which case we fall back to the estimation
> based on the total table size + average row size?

Yeah that's right. I hadn't really thought about unpartitioned tables, but it would probably make sense to do something similar there. I think internally we represent an unpartitioned table as a table with a single partition, so maybe they don't need to be handled separately in the code.

> My vote is to follow the "treat corrupt partition stats the same as missing
> partition stats" as that is more generic and should simpler and safer to
> implement.
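
To make sure we mean the same thing by that rule, here is a rough sketch of it. This is not the code in HdfsScanNode.java: the class, the method names, and the exact definition of "corrupt" (a row count below -1, or 0 rows for a partition whose files are non-empty) are assumptions made for illustration, as is the use of -1 for "stats not computed". It also simplifies the fallback to a single table-level estimate of total size divided by average row size.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names come from HdfsScanNode. */
class StatsFallbackSketch {
  /** Per-partition stats. numRows == -1 is assumed to mean "not computed". */
  static final class PartitionStats {
    final long numRows;
    final long sizeBytes;

    PartitionStats(long numRows, long sizeBytes) {
      this.numRows = numRows;
      this.sizeBytes = sizeBytes;
    }
  }

  static boolean isMissing(PartitionStats p) {
    return p.numRows == -1;
  }

  // Assumed definition of "corrupt": an impossible negative count, or zero
  // rows reported for a partition that has data on disk.
  static boolean isCorrupt(PartitionStats p) {
    return p.numRows < -1 || (p.numRows == 0 && p.sizeBytes > 0);
  }

  /**
   * Returns the summed row count if every partition has good stats.
   * Otherwise corrupt stats are handled exactly like missing stats: the
   * result is estimated as total size / average row size. Returns -1 when
   * no estimate is possible. An unpartitioned table is passed in as a
   * single-element list, so it needs no separate code path.
   */
  static long estimateRowCount(List<PartitionStats> partitions, double avgRowSizeBytes) {
    long totalRows = 0;
    long totalBytes = 0;
    boolean allGood = true;
    for (PartitionStats p : partitions) {
      totalBytes += p.sizeBytes;
      if (isMissing(p) || isCorrupt(p)) {
        allGood = false;
      } else {
        totalRows += p.numRows;
      }
    }
    if (allGood) return totalRows;
    if (avgRowSizeBytes <= 0) return -1;
    return Math.round(totalBytes / avgRowSizeBytes);
  }

  public static void main(String[] args) {
    // One partition claims 0 rows but holds 1000 bytes, so the sum (0) is not trusted.
    List<PartitionStats> parts =
        Arrays.asList(new PartitionStats(0, 1000), new PartitionStats(0, 0));
    System.out.println(estimateRowCount(parts, 10.0));
  }
}
```

With a single predicate covering both cases, the "total row count is 0 but some partitions are corrupt" situation no longer needs its own branch.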

WFM


-- 
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Jun 2020 05:18:44 +0000
Gerrit-HasComments: Yes