Try setting the number of mappers based on your cluster size.

set mapred.map.tasks=XX;

Also, make sure to configure Hive to use your full ZooKeeper quorum, not just a single ZooKeeper node.
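For reference, pointing Hive's HBase storage handler at the quorum is usually done with the hbase.zookeeper.quorum property. A minimal sketch of a hive-site.xml snippet (the hostnames here are placeholders, and you should use your own quorum members):

```xml
<!-- hypothetical hive-site.xml fragment: list every ZooKeeper quorum
     member, not just one node (hostnames are placeholders) -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```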

-ray

On Mon, Jun 14, 2010 at 9:00 AM, Martin Fiala <[email protected]> wrote:

> Hello,
>
> I am a newbie to Hive, but I'm already quite familiar with Hadoop/HBase.
> I really appreciate the whole project and especially the new integration
> with HBase, which is what we really need. :)
>
> So back to the problem: I got Hive running with HBase, and it works really
> nicely. It gets data from HBase, computes something, and returns results. But
> even when I run it on a large table with hundreds of regions, it splits the
> data into only 2 maps, which means only 2 task trackers running on a 10-node
> cluster. When I run a similar task written in Java+MapReduce, it splits the
> input into hundreds of maps and the computation is nicely distributed.
>
> Is this a misconfiguration, or why does Hive's InputSplit give me only 2
> maps?
>
> Regards,
> Martin Fiala
>
