Strange number of maps when using with HBase

Martin Fiala Mon, 14 Jun 2010 09:01:12 -0700

Hello,

I am a newbie to Hive, but I'm already quite familiar with Hadoop/HBase.
I really appreciate the whole project, and especially the new
integration with HBase, which is just what we need. :)


So, back to the problem: I got Hive running with HBase and it works
really nicely. It gets data from HBase, computes something, and returns
results. But even when I run it on a large table with hundreds of
regions, it splits the data into only 2 maps, which means only 2 task
trackers are busy on a 10-node cluster. When I run a similar task
written in Java+MapReduce, it splits the input into hundreds of maps
and the computation is nicely distributed.

Is this a misconfiguration on my side, or is there another reason why
Hive's InputSplit gives me only 2 maps?

Regards,
Martin Fiala
