[ https://issues.apache.org/jira/browse/HIVE-17102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
anubhav tarar updated HIVE-17102:
---------------------------------
    Description:
I tried vectorized execution in Hive by following the Hive cwiki, but the example does not seem to work.

Step 1: create an ORC table

hive> create table Addresses (
    >   name string,
    >   street string,
    >   city string,
    >   state string,
    >   zip int
    > ) stored as orc tblproperties ("orc.compress"="NONE");

Step 2: insert values into the table

hive> insert into Addresses values('anubhav','ggn','ggn','haryana','122001');
Query ID = hduser_20170716093152_14774003-d2c4-4620-b773-ca17cafd902b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Listening for transport dt_socket at address: 5005
Job running in-process (local Hadoop)
2017-07-16 09:31:59,689 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local1858411694_0004
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:54310/user/hive/warehouse/addresses/.hive-staging_hive_2017-07-16_09-31-52_428_7861150459629073282-1/-ext-10000
Loading data to table default.addresses
Table default.addresses stats: [numFiles=1, numRows=1, totalSize=713, rawDataSize=360]
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 778 HDFS Write: 818 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

Step 3: query the table with the explain command

hive> set hive.vectorized.execution.enabled = true;
hive> explain select name from Addresses where zip>1;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: addresses
          Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
            predicate: (zip > 1) (type: boolean)
            Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: name (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
              ListSink

Time taken: 0.081 seconds, Fetched: 20 row(s)

Note: the explain output shows no vectorized reader being applied. The reason for the failure is that when Fetch is used in the plan instead of Map, the query is not vectorized.

  was:
I tried vectorized execution in Hive by following the Hive cwiki, but the example does not seem to work.

Step 1: create an ORC table

hive> create table Addresses (
    >   name string,
    >   street string,
    >   city string,
    >   state string,
    >   zip int
    > ) stored as orc tblproperties ("orc.compress"="NONE");

Step 2: insert values into the table

hive> insert into Addresses values('anubhav','ggn','ggn','haryana','122001');
Query ID = hduser_20170716093152_14774003-d2c4-4620-b773-ca17cafd902b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Listening for transport dt_socket at address: 5005
Job running in-process (local Hadoop)
2017-07-16 09:31:59,689 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local1858411694_0004
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:54310/user/hive/warehouse/addresses/.hive-staging_hive_2017-07-16_09-31-52_428_7861150459629073282-1/-ext-10000
Loading data to table default.addresses
Table default.addresses stats: [numFiles=1, numRows=1, totalSize=713, rawDataSize=360]
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 778 HDFS Write: 818 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

Step 3: query the table with the explain command

hive> set hive.vectorized.execution.enabled = true;
hive> explain select name from Addresses where zip>1;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: addresses
          Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
            predicate: (zip > 1) (type: boolean)
            Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: name (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 360 Basic stats: COMPLETE Column stats: NONE
              ListSink

Time taken: 0.081 seconds, Fetched: 20 row(s)

Note: the explain output shows no vectorized reader being applied. I updated the Hive cwiki for the same:
https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution


> Example For Vectorized Execution in Hive in Cwiki not Seems to Work
> -------------------------------------------------------------------
>
>                 Key: HIVE-17102
>                 URL: https://issues.apache.org/jira/browse/HIVE-17102
>             Project: Hive
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.2.0
>            Reporter: anubhav tarar
>            Assignee: anubhav tarar
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
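The note in the description attributes the missing vectorization to the plan using a Fetch operator instead of a Map stage. A minimal sketch of one way to check that explanation, assuming the standard Hive property hive.fetch.task.conversion (not mentioned in the original report) is set to none so that this simple select is planned as a map task rather than a fetch task:

```sql
-- Assumption: disabling fetch-task conversion makes Hive plan a Map stage
-- for this simple select instead of a Fetch Operator.
set hive.fetch.task.conversion = none;
set hive.vectorized.execution.enabled = true;

-- Same query as in step 3 of the report.
explain select name from Addresses where zip > 1;
```

If vectorization applies, the map stage of the resulting plan would be expected to include a line reading "Execution mode: vectorized". This was not verified here against the affected version (1.2.0).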