ah, so simple :), that works, thanks
m.

On 14.6.2010 18:53, Ray Duong wrote:
> Try setting the number of mappers based on your cluster size:
>
> set mapred.map.tasks=XX;
>
> Also, make sure to configure Hive to hit multiple zookeepers.
>
> -ray
>
> On Mon, Jun 14, 2010 at 9:00 AM, Martin Fiala <[email protected]> wrote:
>
>     Hello,
>
>     I am a newbie to Hive, but I'm already quite familiar with
>     Hadoop/HBase. I really appreciate the whole project and especially
>     the new integration with HBase, which is exactly what we need. :)
>
>     So back to the problem: I got Hive running with HBase, and it works
>     really nicely — it reads data from HBase, computes something, and
>     returns results. But even when I run it on a large table with
>     hundreds of regions, it splits the data into only 2 maps, which
>     means only 2 task trackers are running on a 10-node cluster. When I
>     run a similar task written in Java+MapReduce, it splits the input
>     into hundreds of maps and the computation is nicely distributed.
>
>     Is it some misconfiguration, or why does Hive's InputSplit give me
>     only 2 maps?
>
>     Regards,
>     Martin Fiala
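For anyone hitting the same thing, Ray's suggestion can be sketched as a Hive session like the one below. The table name and ZooKeeper hostnames are illustrative, not from the thread, and note that mapred.map.tasks is only a hint to Hadoop — the actual split count still depends on the InputFormat:

```sql
-- Hint the desired number of map tasks (a hint, not a hard limit;
-- the InputFormat ultimately decides how the input is split).
set mapred.map.tasks=100;

-- Point Hive at the full ZooKeeper quorum used by HBase, so region
-- lookups are not funneled through a single server (hosts illustrative).
set hbase.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;

-- Example query over an HBase-backed table (table name illustrative).
SELECT COUNT(*) FROM hbase_table;
```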
