[ https://issues.apache.org/jira/browse/IMPALA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715789#comment-16715789 ]
ASF subversion and git services commented on IMPALA-1003:
---------------------------------------------------------

Commit 04d027df13e1c3c5c654b5a0bc965b670483b535 in impala's branch refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=04d027d ]

IMPALA-7659: Populate NULL count while computing column stats

It was disabled for performance reasons (IMPALA-1003) and this patch
re-enables it since a lot of codegen improvements have happened since
then. This patch switches the aggregation to use the CASE conditional
instead of IF since the former has proper codegen support (IMPALA-7655).

Tests:
=====
- Updated the affected tests to include the null counts.
- Added unit tests that verify IS [NOT] NULL predicates' cardinality
  estimation.

Perf note:
=========
I reran the compute stats child query with null counts included on the
store_sales table from 1000 SF (1TB) tpcds dataset. The table had 22
non-partitioned columns (on which null counts were computed) and ~2.8B
rows. This experiment showed around 7-8% perf drop compared to the same
child query without null counts for these columns.

Change-Id: Ic68f8b4c3756eb1980ce299a602a7d56db1e507a
Reviewed-on: http://gerrit.cloudera.org:8080/11565
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Improve compute stats performance
> ---------------------------------
>
>                 Key: IMPALA-1003
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1003
>             Project: IMPALA
>          Issue Type: Improvement
>    Affects Versions: Impala 1.3
>            Reporter: Matthew Jacobs
>            Assignee: Ippokratis Pandis
>            Priority: Major
>             Fix For: Impala 1.4
>
>
> We should remove unnecessary computations from the compute stats query and
> use more codegen.
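
For readers unfamiliar with the technique named in the commit message, the following is a minimal sketch of counting NULLs per column with a CASE-based aggregate. It is not Impala's actual compute stats child query, whose exact text is not shown in this message. It uses Python's built-in sqlite3 module as a stand-in engine, and the helper names (null_count_expr, null_counts), the table name store_sales_sample, and the sample rows are all assumptions made for illustration.

```python
import sqlite3


def null_count_expr(column: str) -> str:
    """Build a CASE-based aggregate that counts NULLs in `column`.

    Hypothetical helper: the expression Impala really generates may differ.
    """
    return f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END)'


def null_counts(conn: sqlite3.Connection, table: str, columns: list[str]) -> dict:
    """Run one aggregation query that returns the NULL count of every column."""
    exprs = ", ".join(null_count_expr(c) for c in columns)
    row = conn.execute(f'SELECT {exprs} FROM "{table}"').fetchone()
    return dict(zip(columns, row))


# Illustrative data only: a tiny in-memory table, not the 1000 SF dataset.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store_sales_sample (ss_item_sk INTEGER, ss_quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO store_sales_sample VALUES (?, ?)",
    [(1, 10), (2, None), (None, None), (4, 7)],
)

print(null_counts(conn, "store_sales_sample", ["ss_item_sk", "ss_quantity"]))
```

Computing all the counts in a single pass, one aggregate per column, is why the extra cost scales with the number of columns, as in the 22-column measurement in the perf note.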