Hi Team,


When I’m evaluating the execution of queries submitted to a view, I noticed
the following issue.



*Problem:*

Impala fails to scan only the correct base table when a query is submitted
to the view.

Consider following Example

Let’s say we create two tables and a view as follows.

*CREATE* *TABLE* TEST_TABLE_1(

COL1 *BIGINT*,

COL_RANGE *BIGINT*

)



*CREATE* *TABLE* TEST_TABLE_2(

COL1 *BIGINT*,

COL_RANGE *BIGINT*

)



*CREATE* *VIEW* TEST_VIEW *AS*

*SELECT* COL1, COL_RANGE *FROM* TEST_TABLE_1 *WHERE* COL_RANGE <= 5

*UNION* *ALL*

*SELECT* COL1, COL_RANGE *FROM* TEST_TABLE_2 *WHERE* COL_RANGE > 5



If we submit a select query with only one equal condition for ‘COL_RANGE’,
impala scans only the correct base table.

*SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE = 2

Operator              #Hosts  Avg Time  Max Time  #Rows  Est. #Rows  Peak
Mem  Est. Peak Mem  Detail

-------------------------------------------------------------------------------------------------------------------

F02:ROOT                   1   0.000ns   0.000ns
0              0

03:EXCHANGE                1   0.000ns   0.000ns      1           1  16.00
KB       16.00 KB  UNPARTITIONED

F01:EXCHANGE SENDER        1   0.000ns   0.000ns                      7.52
KB              0

00:UNION                   1   0.000ns   0.000ns      1           1
0              0

01:SCAN LOCAL              1   6.999ms   6.999ms      1           1  49.00
KB       32.00 MB  default.test_table_1



*SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE = 7

Operator              #Hosts   Avg Time   Max Time  #Rows  Est. #Rows
Peak Mem  Est. Peak Mem  Detail

---------------------------------------------------------------------------------------------------------------------

F02:ROOT                   1    0.000ns    0.000ns
       0              0

03:EXCHANGE                1    0.000ns    0.000ns      1           1
16.00 KB       16.00 KB  UNPARTITIONED

F01:EXCHANGE SENDER        1  999.987us  999.987us
 7.52 KB              0

00:UNION                   1    0.000ns    0.000ns      1           1
       0              0

02:SCAN LOCAL              1    4.999ms    4.999ms      1           1
49.00 KB       32.00 MB  default.test_table_2





But if we submit a query with a range condition, it always scan both the
tables.

*SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE < 4

Operator              #Hosts  Avg Time  Max Time  #Rows  Est. #Rows
Peak Mem  Est. Peak Mem  Detail

-------------------------------------------------------------------------------------------------------------------

F03:ROOT                   1   0.000ns   0.000ns
     0              0

03:EXCHANGE                1   0.000ns   0.000ns      2           2
16.00 KB       16.00 KB  UNPARTITIONED

F02:EXCHANGE SENDER        1   0.000ns   0.000ns
7.52 KB              0

00:UNION                   1   0.000ns   0.000ns      2           2
     0              0

|--02:SCAN LOCAL           1   2.999ms   2.999ms      0           1
41.00 KB       32.00 MB  default.test_table_2

01:SCAN LOCAL              1   4.999ms   4.999ms      2           1
49.00 KB       32.00 MB  default.test_table_1



*SELECT* * *FROM* TEST_VIEW *WHERE* COL_RANGE > 5

Operator              #Hosts  Avg Time  Max Time  #Rows  Est. #Rows
Peak Mem  Est. Peak Mem  Detail

-------------------------------------------------------------------------------------------------------------------

F03:ROOT                   1   0.000ns   0.000ns
     0              0

03:EXCHANGE                1   0.000ns   0.000ns      3           2
16.00 KB       16.00 KB  UNPARTITIONED

F02:EXCHANGE SENDER        1   1.000ms   1.000ms
7.52 KB              0

00:UNION                   1   0.000ns   0.000ns      3           2
     0              0

|--02:SCAN LOCAL           1   5.000ms   5.000ms      3           1
49.00 KB       32.00 MB  default.test_table_2

01:SCAN LOCAL              1   7.000ms   7.000ms      0           1
41.00 KB       32.00 MB  default.test_table_1







I would be really grateful if you could check on this and advise how to
overcome this issue.

Thanks & Regards

*Vibhath Ileperuma*

Reply via email to