Hi Team,
When I’m evaluating the execution of queries submitted to a view, I noticed the following issue. *Problem:* Impala fails to scan only the correct base table when a query is submitted to the view. Consider following Example Let’s say we create two tables and a view as follows. *CREATE* *TABLE* TEST_TABLE_1( COL1 *BIGINT*, COL_RANGE *BIGINT* ) *CREATE* *TABLE* TEST_TABLE_2( COL1 *BIGINT*, COL_RANGE *BIGINT* ) *CREATE* *VIEW* TEST_VIEW *AS* *SELECT* COL1, COL_RANGE *FROM* TEST_TABLE_1 *WHERE* COL_RANGE <= 5 *UNION* *ALL* *SELECT* COL1, COL_RANGE *FROM* TEST_TABLE_2 *WHERE* COL_RANGE > 5 If we submit a select query with only one equal condition for ‘COL_RANGE’, impala scans only the correct base table. *SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE = 2 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------- F02:ROOT 1 0.000ns 0.000ns 0 0 03:EXCHANGE 1 0.000ns 0.000ns 1 1 16.00 KB 16.00 KB UNPARTITIONED F01:EXCHANGE SENDER 1 0.000ns 0.000ns 7.52 KB 0 00:UNION 1 0.000ns 0.000ns 1 1 0 0 01:SCAN LOCAL 1 6.999ms 6.999ms 1 1 49.00 KB 32.00 MB default.test_table_1 *SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE = 7 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- F02:ROOT 1 0.000ns 0.000ns 0 0 03:EXCHANGE 1 0.000ns 0.000ns 1 1 16.00 KB 16.00 KB UNPARTITIONED F01:EXCHANGE SENDER 1 999.987us 999.987us 7.52 KB 0 00:UNION 1 0.000ns 0.000ns 1 1 0 0 02:SCAN LOCAL 1 4.999ms 4.999ms 1 1 49.00 KB 32.00 MB default.test_table_2 But if we submit a query with a range condition, it always scan both the tables. *SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE < 4 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------- F03:ROOT 1 0.000ns 0.000ns 0 0 03:EXCHANGE 1 0.000ns 0.000ns 2 2 16.00 KB 16.00 KB UNPARTITIONED F02:EXCHANGE SENDER 1 0.000ns 0.000ns 7.52 KB 0 00:UNION 1 0.000ns 0.000ns 2 2 0 0 |--02:SCAN LOCAL 1 2.999ms 2.999ms 0 1 41.00 KB 32.00 MB default.test_table_2 01:SCAN LOCAL 1 4.999ms 4.999ms 2 1 49.00 KB 32.00 MB default.test_table_1 *SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE > 5 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------- F03:ROOT 1 0.000ns 0.000ns 0 0 03:EXCHANGE 1 0.000ns 0.000ns 3 2 16.00 KB 16.00 KB UNPARTITIONED F02:EXCHANGE SENDER 1 1.000ms 1.000ms 7.52 KB 0 00:UNION 1 0.000ns 0.000ns 3 2 0 0 |--02:SCAN LOCAL 1 5.000ms 5.000ms 3 1 49.00 KB 32.00 MB default.test_table_2 01:SCAN LOCAL 1 7.000ms 7.000ms 0 1 41.00 KB 32.00 MB default.test_table_1 I would be really grateful if you could check on this and advise how to overcome this issue. Thanks & Regards *Vibhath Ileperuma*