Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2373: Extrapolate row counts for HDFS tables.
......................................................................

Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6840/1/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 494: 9: optional i64 total_hdfs_bytes
why is this a parameter/an input of compute stats?

http://gerrit.cloudera.org:8080/#/c/6840/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

Line 492: Preconditions.checkState(this instanceof HdfsTable);
why have this function live here and not in hdfstable?

http://gerrit.cloudera.org:8080/#/c/6840/1/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

Line 114: stats-rows=7300 extrapolated-rows=7300
to reduce verbosity, print the extrapolated count only when it differs from stats-rows?

http://gerrit.cloudera.org:8080/#/c/6840/1/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 641: YEAR, MONTH, #ROWS, EXTRAP #ROWS, #FILES, SIZE, BYTES CACHED, CACHE REPLICATION, FORMAT, INCREMENTAL STATS, LOCATION
extrap is a bit weird, and we don't use abbreviations elsewhere here. spell out?

http://gerrit.cloudera.org:8080/#/c/6840/1/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

Line 20: '2009','1',310,305,1,'24.56KB','NOT CACHED','NOT CACHED','TEXT','false',regex:.*
> There are a ton of tests that use SHOW TABLE STATS or SHOW PARTITIONS. I ha

what's the reason for the small deviations here, rounding? people might think that something has gone wrong if the extrapolation numbers are different right after you ran compute stats, would be nice to avoid that.
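
For the constant-folding.test comment above, a minimal sketch of what "print the extrapolated count only when it differs" could look like. The class name PlanStatsLabel and its format method are hypothetical and are not taken from the patch; the sample values are the ones quoted in the test lines (7300/7300 from constant-folding.test, 310/305 from compute-stats.test).

```java
// Hypothetical helper for illustration only; not code from the change under review.
final class PlanStatsLabel {
    private PlanStatsLabel() {}

    /**
     * Builds the row-count portion of a plan line. The extrapolated count is
     * appended only when it differs from the stats-based count.
     */
    static String format(long statsRows, long extrapolatedRows) {
        StringBuilder sb = new StringBuilder("stats-rows=").append(statsRows);
        if (extrapolatedRows != statsRows) {
            sb.append(" extrapolated-rows=").append(extrapolatedRows);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(7300, 7300)); // prints "stats-rows=7300"
        System.out.println(format(310, 305));   // prints "stats-rows=310 extrapolated-rows=305"
    }
}
```

With this shape, plans where the two counts agree would stay as short as they were before the change, and the extra field would show up only where it carries information.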

-- 
To view, visit http://gerrit.cloudera.org:8080/6840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I972c8a03ed70211734631a7dc9085cb33622ebc4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-HasComments: Yes