Why do you want to do single inserts? Hive is designed for bulk loads rather than single-row writes. That said, newer versions of Hive 2 using Tez + LLAP improve performance significantly (also for bulk analysis). Nevertheless, it is good practice not to use single inserts in an analytics system; try to combine the rows and bulk-load them instead.
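For example (a sketch only; the extra rows and the file path are hypothetical), you can batch several rows into one multi-row INSERT, or stage them in a file and bulk-load it:

```sql
-- Batch the rows into a single INSERT: one job in total,
-- instead of one job per row as in your session below.
INSERT INTO TABLE people VALUES
  (1, 'Tom A', 20),
  (2, 'Ann B', 31),
  (3, 'Joe C', 45);

-- Or stage the rows in a delimited file on HDFS and bulk-load it;
-- this is essentially a file move, with no job launched at all.
-- '/tmp/people.csv' is a hypothetical path.
LOAD DATA INPATH '/tmp/people.csv' INTO TABLE people;
```

Note that LOAD DATA assumes the file's format and delimiters match the table definition (e.g. a textfile table with a matching field separator).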
> On 11. Sep 2017, at 21:01, Jinhui Qin <qin.jin...@gmail.com> wrote:
>
> Hi,
> I am new to Hive. I just created a simple table in hive and inserted two
> records, the first insertion took 16.4 sec, while the second took 14.3 sec.
> Why is that very slow? is this the normal performance you get in Hive using
> INSERT ? Is there a way to improve the performance of a single "insert" in
> Hive? Any help would be really appreciated. Thanks!
>
> Here is the record from a terminal in Hive shell:
>
> =========================
>
> hive> show tables;
> OK
> Time taken: 2.758 seconds
> hive> create table people(id int, name string, age int);
> OK
> Time taken: 0.283 seconds
> hive> insert into table people(1,'Tom A', 20);
> Query ID = hive_20170911134052_04680c79-432a-43e0-827b-29a4212fbbc0
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1505146047428_0098, Tracking URL =
> http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0098/
> Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job -kill
> job_1505146047428_0098
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2017-09-11 13:41:01,492 Stage-1 map = 0%, reduce = 0%
> 2017-09-11 13:41:06,940 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.7 sec
> MapReduce Total cumulative CPU time: 2 seconds 700 msec
> Ended Job = job_1505146047428_0098
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to:
> hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-40-52_106_4621567581104615441-1/-ext-10000
> Loading data to table default.people
> Table default.people stats: [numFiles=1, numRows=1, totalSize=11, rawDataSize=10]
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1  Cumulative CPU: 2.7 sec  HDFS Read: 3836  HDFS Write: 81  SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 700 msec
> OK
> Time taken: 16.417 seconds
> hive> insert into table people values(1,'Tom A', 20);
> Query ID = hive_20170911134128_c8f46977-7718-4496-9a98-cce0f89ced79
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1505146047428_0099, Tracking URL =
> http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0099/
> Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job -kill
> job_1505146047428_0099
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2017-09-11 13:41:36,289 Stage-1 map = 0%, reduce = 0%
> 2017-09-11 13:41:40,721 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.28 sec
> MapReduce Total cumulative CPU time: 2 seconds 280 msec
> Ended Job = job_1505146047428_0099
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to:
> hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-41-28_757_4458472522071240567-1/-ext-10000
> Loading data to table default.people
> Table default.people stats: [numFiles=2, numRows=2, totalSize=22, rawDataSize=20]
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1  Cumulative CPU: 2.28 sec  HDFS Read: 3924  HDFS Write: 81  SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 280 msec
> OK
> Time taken: 14.288 seconds
> hive> exit;
> =================
>
>
> Jinhui