Re: only one mapper

Sanjay Subramanian Wed, 21 Aug 2013 12:15:09 -0700

Hi

Try this setting in your Hive query:


SET mapreduce.input.fileinputformat.split.maxsize=<some bytes>;

If you set this value low, the MR job will use this size to split the input 
LZO files and you will get multiple mappers. Make sure the input LZO files 
are indexed, i.e. the .lzo.index files have been created.
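
As an illustration only, here is a sketch with a concrete value. The 64 MB cap is an assumption, not a figure recommended anywhere in this thread. With the 2304560827-byte input reported in the original post, a 67108864-byte cap would give on the order of 35 splits, provided the LZO index lets the file be split near those boundaries.

```sql
-- Assumed value: 64 MB = 64 * 1024 * 1024 = 67108864 bytes (illustrative only).
SET mapreduce.input.fileinputformat.split.maxsize=67108864;

-- On older (MR1-era) clusters the same limit goes by its deprecated name:
-- SET mapred.max.split.size=67108864;

-- SET with a key and no value prints the current setting for this session.
SET mapreduce.input.fileinputformat.split.maxsize;
```

The setting applies per session, so it has to be issued before the query whose mapper count you want to change.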

sanjay


From: Edward Capriolo <edlinuxg...@gmail.com>
Reply-To: user@hive.apache.org
Date: Wednesday, August 21, 2013 10:43 AM
To: user@hive.apache.org
Subject: Re: only one mapper

LZO files are only splittable if you index them. Sequence files compressed with 
LZO are splittable without being indexed.
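
One thing worth checking, offered as a sketch rather than a diagnosis of this particular job: the .lzo.index file is only consulted by an LZO-aware input format. Assuming the hadoop-lzo library is installed on the cluster, a table over plain-text LZO files is commonly declared along these lines. The table name, column, and location are hypothetical.

```sql
-- Hypothetical table over indexed .lzo text files; name, column, and path are placeholders.
CREATE EXTERNAL TABLE logs_lzo (line STRING)
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/path/to/lzo/files';
```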

Snappy + SequenceFile is a better option than LZO.
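
A minimal sketch of that alternative, assuming the Snappy codec is available on the cluster. The table names logs_lzo and logs_seq are hypothetical, and the property names are the older mapred.* spellings, which Hadoop 2 still accepts as deprecated aliases.

```sql
-- Compress query output and write it block-compressed with Snappy.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

-- Rewrite the existing data as a SequenceFile table (hypothetical table names).
CREATE TABLE logs_seq
STORED AS SEQUENCEFILE
AS SELECT * FROM logs_lzo;
```

Block compression is what keeps the resulting SequenceFile splittable without a separate index.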


On Wed, Aug 21, 2013 at 1:39 PM, Igor Tatarinov <i...@decide.com> wrote:
LZO files are combinable, so check your max split setting.
http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%3c4e328964.7000...@gmail.com%3E

igor
decide.com



On Wed, Aug 21, 2013 at 2:17 AM, 闫昆 <yankunhad...@gmail.com> wrote:
Hi all,
When I run a Hive query, the job launches only one mapper, even though my file 
is split into 18 blocks. My block size is 128 MB and the data size is 2 GB.
I use LZO compression: I created file.lzo and built its index, file.lzo.index.
I am using Hive 0.10.0.

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 2304560827) is larger than hive.exec.mode.local.auto.inputbytes.max (= 134217728)
Starting Job = job_1377071515613_0003, Tracking URL = http://hydra0001:8088/proxy/application_1377071515613_0003/
Kill Command = /opt/module/hadoop-2.0.0-cdh4.3.0/bin/hadoop job  -kill job_1377071515613_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-08-21 16:44:30,237 Stage-1 map = 0%,  reduce = 0%
2013-08-21 16:44:40,495 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:41,710 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:42,919 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:44,117 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 9.95 sec
2013-08-21 16:44:45,333 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 9.95 sec
2013-08-21 16:44:46,530 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 13.0 sec

--

In the Hadoop world I am just a novice, exploring the entire Hadoop ecosystem. 
I hope one day I can contribute my own code.

YanBit
yankunhad...@gmail.com



