[ 
https://issues.apache.org/jira/browse/IMPALA-12737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825833#comment-17825833
 ] 

Michael Smith edited comment on IMPALA-12737 at 3/12/24 10:57 PM:
------------------------------------------------------------------

I don't think stats on keys in "order by" are important for this, but I want 
to double-check.

For the example above, we'd want to list:
* sys.tbls.db_id
* sys.dbs.db_id
* sys.dbs.name
* sys.table_stats_view.tbl_id
* sys.table_stats_view.num_rows



> Include List of Referenced Columns in Query Log Table
> -----------------------------------------------------
>
>                 Key: IMPALA-12737
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12737
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Manish Maheshwari
>            Assignee: Michael Smith
>            Priority: Major
>              Labels: workload-management
>
> In the Impala query log table where completed queries are stored, add a list 
> of all columns that were referenced in the query. The purpose behind this 
> functionality is to know which columns are part of 
>  * Select clause
>  * Where clause
>  * Join clause
>  * Aggregate clause
> With this information, we can feed the compute stats command so that it 
> collects stats only on the columns used in joins, filters, and 
> aggregates, rather than on all the table columns.
> The information can be collected as an array of 
> [db1.table1.column1,db1.table1.column2]
>  
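
As a rough sketch of how such an array could be fed into compute stats, the Python below groups fully qualified column names by table and builds one statement per table, using the column-list form of Impala's COMPUTE STATS. The function name compute_stats_statements is hypothetical and not part of Impala; the sketch also assumes every entry has exactly three dot-separated parts and that no identifier needs quoting.

```python
from collections import defaultdict


def compute_stats_statements(referenced_columns):
    """Build one COMPUTE STATS statement per table.

    referenced_columns: iterable of "db.table.column" strings, as in the
    array format proposed in the issue description (assumed well-formed).
    """
    by_table = defaultdict(list)
    for name in referenced_columns:
        db, table, column = name.split(".")
        columns = by_table[(db, table)]
        # Skip repeats so each column is listed once per table.
        if column not in columns:
            columns.append(column)
    return [
        f"COMPUTE STATS {db}.{table} ({', '.join(columns)})"
        for (db, table), columns in by_table.items()
    ]


statements = compute_stats_statements([
    "db1.table1.column1",
    "db1.table1.column2",
    "db1.table2.id",
    "db1.table1.column1",
])
for statement in statements:
    print(statement)
```

Grouping by table first means each table is scanned once for all of its referenced columns, instead of once per column.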



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
