Hari Sekhon created HIVE-7782:
---------------------------------

             Summary: tez default engine not overridden by 
hive.execution.engine=mr in hive cli session
                 Key: HIVE-7782
                 URL: https://issues.apache.org/jira/browse/HIVE-7782
             Project: Hive
          Issue Type: Bug
          Components: CLI, Tez
         Environment: HDP2.1
            Reporter: Hari Sekhon
            Priority: Minor


I've deployed hive.execution.engine=tez as the default on my secondary HDP 
cluster. I find that hive cli interactive sessions where I do
{code}
set hive.execution.engine=mr;
{code}
still execute with Tez, as shown in the Resource Manager applications view. Now 
this may make sense, since the cli is already connected to a Tez session by that 
point, but it's also misleading because the job progress output in the cli 
changes to look like MapReduce rather than Tez, and the query time increases, 
although only to 15-16 secs rather than the 25-30+ secs I usually see with MR. 
The Resource Manager shows both of these jobs as TEZ application type. Is this 
a bug in the way Hive is submitting the job (Tez vs MR) or a bug in the way the 
RM is reporting it?
{code}
hive

Logging initialized using configuration in 
file:/etc/hive/conf.dist/hive-log4j.properties
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_c03824c7-0e76-4507-b619-6a22cb0fbc4c
Total jobs = 1
Launching Job 1 out of 1


Status: Running (application id: application_1408444369445_0031)

Map 1: -/-      Reducer 2: 0/1
Map 1: 0/1      Reducer 2: 0/1
Map 1: 0/1      Reducer 2: 0/1
Map 1: 1/1      Reducer 2: 0/1
Map 1: 1/1      Reducer 2: 1/1
Status: Finished successfully
OK
823
Time taken: 8.492 seconds, Fetched: 1 row(s)
hive> set hive.execution.engine=mr;
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_b620d990-b405-479c-be5b-d9616527cefe
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1408444369445_0032, Tracking URL = 
http://lonsl1101827-data.uk.net.intra:8088/proxy/application_1408444369445_0032/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1408444369445_0032
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-08-19 16:48:35,242 Stage-1 map = 0%,  reduce = 0%
2014-08-19 16:48:40,539 Stage-1 map = 100%,  reduce = 0%
2014-08-19 16:48:44,676 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1408444369445_0032
MapReduce Jobs Launched:
Job 0:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
823
Time taken: 16.579 seconds, Fetched: 1 row(s)
{code}
If I exit the hive shell and restart it using {code}--hiveconf 
hive.execution.engine=mr{code} so that the engine is set before the session is 
established, then it runs a proper MapReduce job according to the RM, and it 
takes the longer, expected 25 secs instead of the 8 secs with Tez or the 15 
secs when switching to MR inside an existing Tez session.
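
To make the comparison against the Resource Manager easier to repeat, here is a minimal sketch that pulls the YARN application ID out of each style of progress output shown in the session above. The function name and the "tez-style" / "mr-style" labels are made up for illustration; the two patterns are taken only from the lines pasted in this report, so other Hive versions may print something different.

```python
import re

# Patterns copied from the two kinds of progress output in the pasted session.
TEZ_STATUS = re.compile(r"Status: Running \(application id: (application_\d+_\d+)\)")
MR_START = re.compile(r"Starting Job = job_(\d+_\d+)")


def yarn_application_ids(cli_output):
    """Return (output_style, application_id) pairs found in Hive CLI output.

    Hypothetical helper: the style only describes what the CLI printed,
    not which engine actually ran the query.
    """
    found = []
    for line in cli_output.splitlines():
        match = TEZ_STATUS.search(line)
        if match:
            found.append(("tez-style", match.group(1)))
            continue
        match = MR_START.search(line)
        if match:
            # job_<cluster>_<seq> corresponds to application_<cluster>_<seq>,
            # as the tracking URL in the pasted output shows.
            found.append(("mr-style", "application_" + match.group(1)))
    return found
```

Each returned ID can then be checked by hand in the Resource Manager applications view. An "mr-style" entry whose application type is listed there as TEZ is the mismatch described in this issue.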



--
This message was sent by Atlassian JIRA
(v6.2#6252)
