[ https://issues.apache.org/jira/browse/HIVE-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-24897:
--------------------------------
Description:
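As with an ordinary equality filter such as value = 'xy', an IS NULL filter on
the partitioning column should also be able to run vectorized. At the moment the
reduce side falls back to non-vectorized execution (notVectorizedReason: PTF
operator: Unexpected hive type name void):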
{code}
EXPLAIN VECTORIZATION DETAIL select p_mfgr, p_name, p_timestamp, rowindex,
p_date, p_retailprice,
count(*) over(partition by p_timestamp) as cs,
sum(p_retailprice) over(partition by p_timestamp) as s
from vector_ptf_part_simple_orc
where p_timestamp IS NULL;
{code}
{code}
| Reduce Vectorization: |
| enabled: true |
| enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true |
| notVectorizedReason: PTF operator: Unexpected hive type name void |
| vectorized: false |
| Reduce Operator Tree: |
| Select Operator |
| expressions: VALUE._col0 (type: string), VALUE._col1 (type: string), VALUE._col2 (type: date), VALUE._col4 (type: double), VALUE._col6 (type: int) |
| outputColumnNames: _col0, _col1, _col2, _col4, _col6 |
| Statistics: Num rows: 1 Data size: 556 Basic stats: COMPLETE Column stats: COMPLETE |
| PTF Operator |
| Function definitions: |
| Input definition |
| input alias: ptf_0 |
| output shape: _col0: string, _col1: string, _col2: date, _col4: double, _col6: int |
| type: WINDOWING |
| Windowing table definition |
| input alias: ptf_1 |
| name: windowingtablefunction |
| order by: null ASC NULLS FIRST |
| partition by: CAST( null AS TIMESTAMP) |
| raw input shape: |
| window functions: |
| window function definition |
| alias: count_window_0 |
| name: count |
| window function: GenericUDAFCountEvaluator |
| window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX) |
| isStar: true |
| window function definition |
| alias: sum_window_1 |
| arguments: _col4 |
| name: sum |
| window function: GenericUDAFSumDouble |
| window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX) |
{code}
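For comparison, the equality-filter variant that the description alludes to could be checked with a query along these lines. This is only a sketch: the timestamp literal is a hypothetical placeholder (not taken from the test data), and no plan output is claimed for it here.
{code}
-- Same query shape as above, but with an equality predicate on the
-- partitioning column instead of IS NULL.
-- The literal below is a hypothetical value chosen for illustration only.
EXPLAIN VECTORIZATION DETAIL select p_mfgr, p_name, p_timestamp, rowindex,
p_date, p_retailprice,
count(*) over(partition by p_timestamp) as cs,
sum(p_retailprice) over(partition by p_timestamp) as s
from vector_ptf_part_simple_orc
where p_timestamp = CAST('1970-01-01 00:00:00' AS TIMESTAMP);
{code}
Comparing the "Reduce Vectorization" section of the two plans (in particular "vectorized" and "notVectorizedReason") would show whether the fallback is specific to the IS NULL case.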
The query result consists of all rows where p_timestamp IS NULL:
{code}
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
| p_mfgr          | p_name                                     | p_timestamp  | rowindex  | p_date  | p_retailprice  | cs  | s                  |
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
| Manufacturer#2  | almond aquamarine rose maroon antique      | NULL         | 1         | NULL    | 900.66         | 8   | 7891.070000000001  |
| Manufacturer#5  | almond antique blue firebrick mint         | NULL         | 7         | NULL    | 1789.69        | 8   | 7891.070000000001  |
| Manufacturer#2  | almond antique violet turquoise frosted    | NULL         | 13        | NULL    | 1800.7         | 8   | 7891.070000000001  |
| Manufacturer#3  | almond antique forest lavender goldenrod   | NULL         | 19        | NULL    | 590.27         | 8   | 7891.070000000001  |
| Manufacturer#4  | almond antique gainsboro frosted violet    | NULL         | 25        | NULL    | NULL           | 8   | 7891.070000000001  |
| Manufacturer#1  | almond antique chartreuse lavender yellow  | NULL         | 26        | NULL    | 1753.76        | 8   | 7891.070000000001  |
| Manufacturer#2  | almond aquamarine sandy cyan gainsboro     | NULL         | 32        | NULL    | 1000.6         | 8   | 7891.070000000001  |
| Manufacturer#3  | almond antique metallic orange dim         | NULL         | 38        | NULL    | 55.39          | 8   | 7891.070000000001  |
+-----------------+--------------------------------------------+--------------+-----------+---------+----------------+-----+--------------------+
{code}
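The window values in this result can be cross-checked against a plain aggregate over the same filter. The query below is a sketch for that purpose and is not part of the original report; the column aliases are chosen only to mirror the output above.
{code}
-- Non-windowed cross-check of the cs and s columns.
-- count(*) counts every row, including the one with a NULL p_retailprice;
-- sum() skips NULL inputs.
select count(*) as cs, sum(p_retailprice) as s
from vector_ptf_part_simple_orc
where p_timestamp IS NULL;
{code}
Based on the rows listed above, this should return cs = 8 (all eight rows, including rowindex 25 with a NULL price) and s equal to the sum of the seven non-NULL prices: 900.66 + 1789.69 + 1800.7 + 590.27 + 1753.76 + 1000.6 + 55.39 = 7891.07. The last digits of the double result may differ slightly from 7891.070000000001 depending on summation order.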
> Is null filter on partitioning column leads to non-vectorized execution
> -----------------------------------------------------------------------
>
> Key: HIVE-24897
> URL: https://issues.apache.org/jira/browse/HIVE-24897
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.3.4#803005)