Hello,
Thank you for your answers; this solves the issue.
I have set mapred.max.split.size to 1024000000 in hive-site.xml, and jobs are now using an appropriate number of mappers.

I have experimented a little with different configurations, and CombineHiveInputFormat gives better performance than HiveInputFormat in my case.
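
For anyone finding this thread later, the two settings mentioned above might look roughly like this in hive-site.xml. This is only a sketch: the split size is the value quoted above (in bytes, roughly 1 GB), the description texts are illustrative wording rather than copied from any shipped file, and the fully qualified class name is assumed to be the standard org.apache.hadoop.hive.ql.io one.

```xml
<!-- Sketch of a hive-site.xml fragment; place inside <configuration>. -->
<property>
   <name>hive.input.format</name>
   <!-- Assumed package name for CombineHiveInputFormat. -->
   <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
   <description>Input format that combines small files into larger splits.</description>
</property>
<property>
   <name>mapred.max.split.size</name>
   <!-- Upper bound on a combined split, in bytes; without it all input may end up in one split. -->
   <value>1024000000</value>
   <description>The maximum size chunk that map input should be split into.</description>
</property>
```

A smaller mapred.max.split.size should yield more splits and therefore more map tasks, so the right value depends on the input sizes and the cluster.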

Thanks again.
--
Wojciech Langiewicz

On 29.07.2011 05:43, Carl Steinbach wrote:
Hi Wojciech,

Vaibhav is correct. There's a configuration problem in the copy of
hive-default.xml that ships with CDH3u1 which sets
hive.input.format=CombineHiveInputFormat, but leaves mapred.max.split.size
undefined. You can fix this problem by setting mapred.max.split.size in
hive-default.xml to some reasonable value (it currently defaults
to 256000000 on trunk).

Sorry for the inconvenience.

Carl

On Thu, Jul 28, 2011 at 11:28 AM, Aggarwal, Vaibhav <vagg...@amazon.com> wrote:

If you are using CombineHiveInputFormat, it might be the case that all files
are being combined into one large split and hence 1 mapper gets created.

If that is the case, you can set the max split size in the hive-default.xml
config file to create more splits and hence more map tasks:

<property>
   <name>mapred.max.split.size</name>
   <value>134217728</value>
   <description>The maximum size chunk that map input should be split
   into.</description>
</property>

Thanks
Vaibhav

*From:* Edward Capriolo [mailto:edlinuxg...@gmail.com]
*Sent:* Thursday, July 28, 2011 7:10 AM
*To:* user@hive.apache.org
*Subject:* Re: Hive 0.7 using only one mapper

On Thu, Jul 28, 2011 at 9:23 AM, Wojciech Langiewicz <wlangiew...@gmail.com> wrote:

Hello,
I'm having an issue running Hive jobs after updating from Hive 0.5 to Hive
0.7 (from CDHb4 to CDHu1).

No matter what query I run, Hive always uses one mapper.
I have tried different queries with various input sizes, including ones with
many reducers and ones with no reducers.

For version 0.5 everything worked correctly.
I'm attaching my hive-site.xml: https://gist.github.com/1111531
I have also tested jobs with Pig, and those jobs use multiple mappers, so
I guess this is a Hive issue.

Thank you for all your help.

--
Wojciech Langiewicz


You should also check that your hive-default.xml and other conf/ files are
up to date for 0.7.x. Having older versions of those files can lead to problems.

Edward


