Hi, I am new to Hive. I just created a simple table in Hive and inserted two records; the first insert took 16.4 seconds and the second took 14.3 seconds. Why is a single INSERT so slow? Is this the normal performance you get from INSERT in Hive, and is there a way to improve it? Any help would be really appreciated. Thanks!
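For context, one workaround I have read about (I have not tried it yet, so take the exact rows below as my own made-up sample data) is batching several rows into a single INSERT ... VALUES statement, so that only one MapReduce job is launched for the whole batch instead of one job per row:

```sql
-- Hypothetical batch insert: rows 2 and 3 are invented sample data.
-- All three rows go through a single MapReduce job, so the ~15 s
-- job-launch overhead is paid once rather than once per row.
insert into table people values
  (1, 'Tom A', 20),
  (2, 'Bob B', 25),
  (3, 'Ann C', 30);
```

My understanding is that this does not make any one job faster, it just amortizes the fixed job-startup cost, but I would appreciate confirmation.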
Here is the record from a terminal in the Hive shell (the first insert statement below originally read "insert into table people values(1,'Tom A', 20);" — the "values" keyword was lost when the text was wrapped):
=========================
hive> show tables;
OK
Time taken: 2.758 seconds
hive> create table people(id int, name string, age int);
OK
Time taken: 0.283 seconds
hive> insert into table people values(1,'Tom A', 20);
Query ID = hive_20170911134052_04680c79-432a-43e0-827b-29a4212fbbc0
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1505146047428_0098, Tracking URL = http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0098/
Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job -kill job_1505146047428_0098
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-11 13:41:01,492 Stage-1 map = 0%, reduce = 0%
2017-09-11 13:41:06,940 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.7 sec
MapReduce Total cumulative CPU time: 2 seconds 700 msec
Ended Job = job_1505146047428_0098
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-40-52_106_4621567581104615441-1/-ext-10000
Loading data to table default.people
Table default.people stats: [numFiles=1, numRows=1, totalSize=11, rawDataSize=10]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.7 sec   HDFS Read: 3836 HDFS Write: 81 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 700 msec
OK
Time taken: 16.417 seconds
hive> insert into table people values(1,'Tom A', 20);
Query ID = hive_20170911134128_c8f46977-7718-4496-9a98-cce0f89ced79
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1505146047428_0099, Tracking URL = http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0099/
Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job -kill job_1505146047428_0099
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-11 13:41:36,289 Stage-1 map = 0%, reduce = 0%
2017-09-11 13:41:40,721 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.28 sec
MapReduce Total cumulative CPU time: 2 seconds 280 msec
Ended Job = job_1505146047428_0099
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-41-28_757_4458472522071240567-1/-ext-10000
Loading data to table default.people
Table default.people stats: [numFiles=2, numRows=2, totalSize=22, rawDataSize=20]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.28 sec   HDFS Read: 3924 HDFS Write: 81 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 280 msec
OK
Time taken: 14.288 seconds
hive> exit;
=================
Jinhui