[ https://issues.apache.org/jira/browse/IMPALA-7659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
bharath v updated IMPALA-7659:
------------------------------
    Affects Version/s: Impala 3.1.0
                       Impala 3.0
                       Impala 2.12.0

> Collect count of nulls when collecting stats
> --------------------------------------------
>
>                 Key: IMPALA-7659
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7659
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend, Frontend
>    Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0
>            Reporter: Piotr Findeisen
>            Assignee: bharath v
>            Priority: Major
>             Fix For: Impala 3.2.0
>
>
> When Impala calculates table stats, the NULL count gets overridden with -1.
> The number of NULLs in a table is useful information. Even if Impala does not
> benefit from this information, some other tools do. Thus, not collecting this
> information may pose a problem for Impala users (potentially forcing them to
> run COMPUTE STATS elsewhere).
> Now, counting NULLs should be a cheaper operation than counting NDVs.
> However, a code comment in {{ComputeStatsStmt.java}} suggests otherwise
> ([~tarmstrong] suggested this is because of IMPALA-7655).
> My suggestion would be to
> - improve the expression used to collect the NULL count
> - collect the NULL count during COMPUTE STATS