[
https://issues.apache.org/jira/browse/HIVE-17102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
anubhav tarar updated HIVE-17102:
---------------------------------
Description:
I tried vectorized execution in Hive by following the example on the Hive cwiki,
but the example does not seem to work.
Step 1: created an ORC table
hive> create table Addresses (
> name string,
> street string,
> city string,
> state string,
> zip int
> ) stored as orc tblproperties ("orc.compress"="NONE");
Step 2: inserted a row into the table
hive> insert into Addresses values('anubhav','ggn','ggn','haryana','122001');
Query ID = hduser_20170716093152_14774003-d2c4-4620-b773-ca17cafd902b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Listening for transport dt_socket at address: 5005
Job running in-process (local Hadoop)
2017-07-16 09:31:59,689 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local1858411694_0004
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:54310/user/hive/warehouse/addresses/.hive-staging_hive_2017-07-16_09-31-52_428_7861150459629073282-1/-ext-10000
Loading data to table default.addresses
Table default.addresses stats: [numFiles=1, numRows=1, totalSize=713, rawDataSize=360]
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 778 HDFS Write: 818 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
Step 3: ran the query with the explain command
hive> set hive.vectorized.execution.enabled = true;
hive> explain select name from Addresses where zip>1;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: addresses
          Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
            predicate: (zip > 1) (type: boolean)
            Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: name (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
              ListSink
Time taken: 0.081 seconds, Fetched: 20 row(s)
Note: the explain output shows that no vectorized reader is applied.
The reason for the failure is that when the plan uses a Fetch Operator instead
of a Map stage, the query is not vectorized.
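A possible way to check this explanation is sketched below. It was not run on
the setup above, so no output is shown. It assumes that the
hive.fetch.task.conversion property is available in this Hive version and that
setting it to none makes Hive compile the query into a Map stage rather than a
Fetch Operator. If the Map stage is then vectorized, the explain output would be
expected to contain a line such as "Execution mode: vectorized".
hive> set hive.vectorized.execution.enabled = true;
hive> set hive.fetch.task.conversion = none;
hive> explain select name from Addresses where zip>1;
If the plan still shows a Fetch Operator after this, the cause is probably
something other than fetch task conversion.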
was:
I tried vectorized execution in Hive by following the example on the Hive cwiki,
but the example does not seem to work.
Step 1: created an ORC table
hive> create table Addresses (
> name string,
> street string,
> city string,
> state string,
> zip int
> ) stored as orc tblproperties ("orc.compress"="NONE");
Step 2: inserted a row into the table
hive> insert into Addresses values('anubhav','ggn','ggn','haryana','122001');
Query ID = hduser_20170716093152_14774003-d2c4-4620-b773-ca17cafd902b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Listening for transport dt_socket at address: 5005
Job running in-process (local Hadoop)
2017-07-16 09:31:59,689 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local1858411694_0004
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:54310/user/hive/warehouse/addresses/.hive-staging_hive_2017-07-16_09-31-52_428_7861150459629073282-1/-ext-10000
Loading data to table default.addresses
Table default.addresses stats: [numFiles=1, numRows=1, totalSize=713, rawDataSize=360]
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 778 HDFS Write: 818 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
Step 3: ran the query with the explain command
hive> set hive.vectorized.execution.enabled = true;
hive> explain select name from Addresses where zip>1;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: addresses
          Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
            predicate: (zip > 1) (type: boolean)
            Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: name (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
              ListSink
Time taken: 0.081 seconds, Fetched: 20 row(s)
Note: the explain output shows that no vectorized reader is applied.
I updated the Hive cwiki page accordingly:
https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution
> Example For Vectorized Execution in Hive in Cwiki not Seems to Work
> -------------------------------------------------------------------
>
> Key: HIVE-17102
> URL: https://issues.apache.org/jira/browse/HIVE-17102
> Project: Hive
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.2.0
> Reporter: anubhav tarar
> Assignee: anubhav tarar