[ 
https://issues.apache.org/jira/browse/HIVE-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24897:
--------------------------------
    Description: 
Just as with an ordinary value = 'xy' filter, vectorization might also work 
for an IS NULL filter on the partitioning column. Currently it does not:
{code}
EXPLAIN VECTORIZATION DETAIL select p_mfgr, p_name, p_timestamp, rowindex, 
p_date, p_retailprice,
count(*) over(partition by p_timestamp) as cs,
sum(p_retailprice) over(partition by p_timestamp) as s
from vector_ptf_part_simple_orc
where p_timestamp IS NULL;
{code}

{code}
 Reduce Vectorization:                  |
|                 enabled: true                      |
|                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true |
|                 notVectorizedReason: PTF operator: Unexpected hive type name void |
|                 vectorized: false                  |
|             Reduce Operator Tree:                  |
|               Select Operator                      |
|                 expressions: VALUE._col0 (type: string), VALUE._col1 (type: string), VALUE._col2 (type: date), VALUE._col4 (type: double), VALUE._col6 (type: int) |
|                 outputColumnNames: _col0, _col1, _col2, _col4, _col6 |
|                 Statistics: Num rows: 1 Data size: 556 Basic stats: COMPLETE Column stats: COMPLETE |
|                 PTF Operator                       |
|                   Function definitions:            |
|                       Input definition             |
|                         input alias: ptf_0         |
|                         output shape: _col0: string, _col1: string, _col2: date, _col4: double, _col6: int |
|                         type: WINDOWING            |
|                       Windowing table definition   |
|                         input alias: ptf_1         |
|                         name: windowingtablefunction |
|                         order by: null ASC NULLS FIRST |
|                         partition by: CAST( null AS TIMESTAMP) |
|                         raw input shape:           |
|                         window functions:          |
|                             window function definition |
|                               alias: count_window_0 |
|                               name: count          |
|                               window function: GenericUDAFCountEvaluator |
|                               window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX) |
|                               isStar: true         |
|                             window function definition |
|                               alias: sum_window_1  |
|                               arguments: _col4     |
|                               name: sum            |
|                               window function: GenericUDAFSumDouble |
|                               window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX) |
{code}
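
In the plan above the partition key has been folded to a constant (partition by: CAST( null AS TIMESTAMP), order by: null), which is presumably where the void type name comes from. A possible way to probe that guess, assuming the folding is done by Hive's constant propagation optimizer, is to rerun the same EXPLAIN with hive.optimize.constant.propagation disabled. This is only a diagnostic sketch: it has not been run here, and other rewrites (for example in the CBO) may fold the column as well.
{code}
-- assumption: constant propagation is what replaces p_timestamp with a null constant
set hive.optimize.constant.propagation=false;

EXPLAIN VECTORIZATION DETAIL select p_mfgr, p_name, p_timestamp, rowindex,
p_date, p_retailprice,
count(*) over(partition by p_timestamp) as cs,
sum(p_retailprice) over(partition by p_timestamp) as s
from vector_ptf_part_simple_orc
where p_timestamp IS NULL;
{code}
If the notVectorizedReason line disappears with this setting, the constant-typed partition expression is the likely trigger.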

The result consists of all rows where p_timestamp IS NULL:
{code}
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
|     p_mfgr      |                   p_name                   | p_timestamp  | rowindex  | p_date  | p_retailprice  | cs  |         s          |
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
| Manufacturer#2  | almond aquamarine rose maroon antique      | NULL         | 1         | NULL    | 900.66         | 8   | 7891.070000000001  |
| Manufacturer#5  | almond antique blue firebrick mint         | NULL         | 7         | NULL    | 1789.69        | 8   | 7891.070000000001  |
| Manufacturer#2  | almond antique violet turquoise frosted    | NULL         | 13        | NULL    | 1800.7         | 8   | 7891.070000000001  |
| Manufacturer#3  | almond antique forest lavender goldenrod   | NULL         | 19        | NULL    | 590.27         | 8   | 7891.070000000001  |
| Manufacturer#4  | almond antique gainsboro frosted violet    | NULL         | 25        | NULL    | NULL           | 8   | 7891.070000000001  |
| Manufacturer#1  | almond antique chartreuse lavender yellow  | NULL         | 26        | NULL    | 1753.76        | 8   | 7891.070000000001  |
| Manufacturer#2  | almond aquamarine sandy cyan gainsboro     | NULL         | 32        | NULL    | 1000.6         | 8   | 7891.070000000001  |
| Manufacturer#3  | almond antique metallic orange dim         | NULL         | 38        | NULL    | 55.39          | 8   | 7891.070000000001  |
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
{code}
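
For comparison, the equality case mentioned at the top can be checked with the same query by swapping only the predicate. This is a sketch: the timestamp literal below is a hypothetical placeholder, not a value taken from vector_ptf_part_simple_orc, and no output is shown because the query was not run here.
{code}
-- same query as above, but with an equality filter on the partitioning column;
-- '2021-03-17 00:00:00' is a made-up value used only for illustration
EXPLAIN VECTORIZATION DETAIL select p_mfgr, p_name, p_timestamp, rowindex,
p_date, p_retailprice,
count(*) over(partition by p_timestamp) as cs,
sum(p_retailprice) over(partition by p_timestamp) as s
from vector_ptf_part_simple_orc
where p_timestamp = '2021-03-17 00:00:00';
{code}
The lines to compare between the two plans are vectorized and notVectorizedReason under Reduce Vectorization.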

> Is null filter on partitioning column leads to non-vectorized execution
> -----------------------------------------------------------------------
>
>                 Key: HIVE-24897
>                 URL: https://issues.apache.org/jira/browse/HIVE-24897
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
