[ https://issues.apache.org/jira/browse/IMPALA-12737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825833#comment-17825833 ]
Michael Smith edited comment on IMPALA-12737 at 3/12/24 10:57 PM:
------------------------------------------------------------------
I don't think stats on keys in "order by" are important for this, but I want to
double check.
For the example above, we'd want to list:
* sys.tbls.db_id
* sys.dbs.db_id
* sys.dbs.name
* sys.table_stats_view.tbl_id
* sys.table_stats_view.num_rows
> Include List of Referenced Columns in Query Log Table
> -----------------------------------------------------
>
> Key: IMPALA-12737
> URL: https://issues.apache.org/jira/browse/IMPALA-12737
> Project: IMPALA
> Issue Type: Bug
> Reporter: Manish Maheshwari
> Assignee: Michael Smith
> Priority: Major
> Labels: workload-management
>
> In the Impala query log table where completed queries are stored, add a list
> of all columns that were referenced in the query. The purpose behind this
> functionality is to know which columns are part of
> * Select clause
> * Where clause
> * Join clause
> * Aggregate clause
> We will feed this information into the compute stats command to collect
> stats only on the required columns that are used in joins, filters, and
> aggregates, rather than on all the table columns.
> The information can be collected as an array of
> [db1.table1.column1,db1.table1.column2]
>
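As a rough illustration of how such an array could be fed into compute stats,
here is a minimal Python sketch. It assumes each entry has exactly the form
db.table.column with no dots or quoting inside identifiers, and that the target
is a regular table. The helper name compute_stats_statements is hypothetical
and not part of Impala; the only Impala feature relied on is the optional
column list of COMPUTE STATS.

```python
from collections import defaultdict


def compute_stats_statements(referenced_columns):
    """Group 'db.table.column' strings by table and build one
    COMPUTE STATS statement per table, limited to the referenced columns.

    Hypothetical helper for illustration only; assumes simple identifiers
    that contain no dots and need no quoting.
    """
    by_table = defaultdict(list)
    for ref in referenced_columns:
        db, table, column = ref.split(".")
        cols = by_table[(db, table)]
        if column not in cols:  # skip duplicates, keep first-seen order
            cols.append(column)
    return [
        f"COMPUTE STATS {db}.{table}({', '.join(cols)})"
        for (db, table), cols in by_table.items()
    ]


# Example input taken from the array format proposed in the description.
refs = ["db1.table1.column1", "db1.table1.column2"]
for stmt in compute_stats_statements(refs):
    print(stmt)  # prints: COMPUTE STATS db1.table1(column1, column2)
```

Grouping by table first keeps this to one statement per table, so each table
is scanned once regardless of how many of its columns were referenced.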