Todd Lipcon has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/13382 )

Change subject: IMPALA-8215, IMPALA-8458. Fix setting stats without setting 
NDVs in local-catalog mode
......................................................................

IMPALA-8215, IMPALA-8458. Fix setting stats without setting NDVs in 
local-catalog mode

This removes an old workaround in which columns with numDVs=-1 were
skipped when sending stats from catalogd to impalad. The workaround was
intended to fix some odd behavior with boolean stats, but it broke other
things, as noted in the JIRA. It also served as a way to avoid sending
stats on HDFS clustering columns (which don't have computed stats).

This change makes the workaround more careful: we explicitly ignore
HDFS clustering columns, and for booleans we use numNulls instead of
numDVs to determine whether we have enough information to infer numDVs.

NOTE: HBase table keys are also considered "clustering columns", but
those _do_ need computed stats, because HBase is range-partitioned, not
value-partitioned.
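
The rules above can be sketched roughly as follows. This is an
illustrative sketch only, not the code from this change: the class
StatsSketch, the TableKind enum, and both method names are hypothetical.
The mapping from numNulls to an inferred NDV (2 when there are no nulls,
3 when there are) assumes that NULL is counted as a distinct value.

```java
// Hypothetical sketch; none of these names are Impala's actual API.
final class StatsSketch {
    enum TableKind { HDFS, HBASE }

    /**
     * Decide whether stats for a column should be sent at all.
     * HDFS clustering (partition) columns have no computed stats, so they
     * are skipped explicitly. HBase keys also count as clustering columns,
     * but they do need computed stats, so they are not skipped.
     */
    static boolean shouldSendStats(TableKind kind, boolean isClusteringCol) {
        return !(kind == TableKind.HDFS && isClusteringCol);
    }

    /**
     * Infer the number of distinct values for a boolean column from
     * numNulls rather than from numDVs. Returns -1 when numNulls is
     * unknown (negative), meaning there is not enough information.
     * Assumption: NULL counts as a third distinct value.
     */
    static long inferBooleanNumDistinct(long numNulls) {
        if (numNulls < 0) {
            return -1; // stats were never computed for this column
        }
        return numNulls > 0 ? 3 : 2;
    }

    public static void main(String[] args) {
        System.out.println(shouldSendStats(TableKind.HDFS, true));
        System.out.println(shouldSendStats(TableKind.HBASE, true));
        System.out.println(inferBooleanNumDistinct(-1));
        System.out.println(inferBooleanNumDistinct(0));
    }
}
```

Keying the boolean case on numNulls means a column whose numDVs was never
set can still get a usable NDV, as long as the null count is known.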

While in this section of the code, I also came across IMPALA-8215, a
bug in which the 'numTrues' of boolean columns was getting set to '1'
for no apparent reason. This change fixes that as well.

Change-Id: Ic0b95de22954c7ad6715143fc42a1506289c095f
Reviewed-on: http://gerrit.cloudera.org:8080/13382
Tested-by: Todd Lipcon <t...@apache.org>
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M tests/common/skip.py
M tests/metadata/test_ddl.py
5 files changed, 23 insertions(+), 14 deletions(-)

Approvals:
  Todd Lipcon: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic0b95de22954c7ad6715143fc42a1506289c095f
Gerrit-Change-Number: 13382
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
