[jira] [Created] (IMPALA-10723) Allow querying and computing stats on a materialized view

Aman Sinha (Jira) Sun, 30 May 2021 20:25:05 -0700

Aman Sinha created IMPALA-10723:
-----------------------------------

             Summary: Allow querying and computing stats on a materialized view
                 Key: IMPALA-10723
                 URL: https://issues.apache.org/jira/browse/IMPALA-10723
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Aman Sinha
            Assignee: Aman Sinha



Currently, a Materialized View (MV) created via Hive is visible in Impala 
through the metadata catalog and can be queried, but Impala expands the query 
to the MV's view definition.  This is incorrect because a materialized view 
is a regular (physical) table, not a view.  Although Impala supports neither 
creating MVs nor automatically rewriting queries to use them, querying an MV 
directly should be allowed. 

Here's the current behavior:
{noformat}
[localhost:21050] functional> explain select * from materialized_view;
Query: explain select * from materialized_view
+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.00MB Threads=2                         |
| Per-Host Resource Estimates: Memory=10MB                                           |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| functional.insert_only_transactional_table                                         |
|                                                                                    |
| PLAN-ROOT SINK                                                                     |
| |                                                                                  |
| 01:EXCHANGE [UNPARTITIONED]                                                        |
| |                                                                                  |
| 00:SCAN HDFS [functional.insert_only_transactional_table]                          |
|    HDFS partitions=1/1 files=0 size=0B                                             |
|    row-size=4B cardinality=0                                                       |
+------------------------------------------------------------------------------------+
{noformat}

Note that the plan scans the underlying source table 
(functional.insert_only_transactional_table) instead of the materialized_view 
table.  We should scan only the MV itself (with partition pruning, predicate 
pushdown, etc. applied to it) and not treat it as a view. 

This JIRA is to enhance the frontend to recognize a materialized view as a 
table rather than a view.  This will further allow commands such as COMPUTE 
STATS, DROP STATS, SHOW [TABLE | COLUMN] STATS to be run on the MV.  One 
motivation for doing this is to allow an external frontend that supports 
automatic query rewrites using MVs to leverage the statistics on MVs. 
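
As a rough illustration of the intended classification (not Impala's actual 
frontend code), the sketch below treats a materialized view as a physical 
table.  The type strings follow the Hive Metastore TableType names; the 
function names are hypothetical and exist only for this example.

```python
# Hypothetical sketch of the table-vs-view decision; not Impala's real code.
# Type strings are assumed to match Hive Metastore TableType names.
VIEW_TYPES = {"VIRTUAL_VIEW"}
TABLE_TYPES = {"MANAGED_TABLE", "EXTERNAL_TABLE", "MATERIALIZED_VIEW"}


def is_expanded_as_view(table_type: str) -> bool:
    """Return True only for logical views, whose definition gets inlined."""
    return table_type in VIEW_TYPES


def supports_stats_statements(table_type: str) -> bool:
    """COMPUTE STATS, DROP STATS and SHOW ... STATS need a physical table."""
    return table_type in TABLE_TYPES
```

Under this sketch a MATERIALIZED_VIEW is scanned directly and accepts the 
stats statements, while a VIRTUAL_VIEW is still expanded to its definition.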

Since Impala does not create the MV, we will need to block DML operations on 
it.  Further, Ranger authorization needs special handling so that any column 
masking or row filtering policies defined on the source tables of the MV are 
taken into consideration. 
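
The DML restriction could look roughly like the check below.  This is a 
minimal sketch under stated assumptions: AnalysisError, check_dml_allowed and 
the set of blocked statement kinds are all hypothetical names chosen for the 
example, and the Ranger policy handling is not modelled here.

```python
# Hypothetical sketch of rejecting DML on a materialized view during analysis.
# The exception, function name and blocked set are assumptions for this
# example, not Impala's actual implementation.
class AnalysisError(Exception):
    """Raised when a statement is not permitted on the target table."""


BLOCKED_ON_MV = {"INSERT", "UPDATE", "DELETE", "UPSERT"}


def check_dml_allowed(table_type: str, statement_kind: str) -> None:
    """Raise AnalysisError if a DML statement targets a materialized view."""
    kind = statement_kind.upper()
    if table_type == "MATERIALIZED_VIEW" and kind in BLOCKED_ON_MV:
        raise AnalysisError(f"{kind} is not supported on a materialized view")
```

Read-only statements such as SELECT pass through unchanged, as does DML 
against ordinary tables.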









