Hi Ramesh,

I'm not sure it is really meaningful to try to draw conclusions about performance when running on only one node, as you don't gain any of the benefits of parallelisation. You might be better off trying a small cluster of, say, 4 nodes on Amazon EC2, then repeating the same test with, say, 8 nodes, and seeing whether the larger cluster yields better performance. That is presumably the proof you are really looking for, i.e. that you can grow in data volume and performance by adding hardware.
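To make that kind of comparison concrete, here is a rough sketch of the arithmetic in Java. It is only an illustration and not part of your MRTest.java: the class name ThroughputSketch and its method names are made up for this example. The only inputs are the file size and the three timings from your message; the scalingEfficiency helper is there for when you have measurements from two cluster sizes, and is not called with any invented numbers.

```java
public class ThroughputSketch {

    // Convert a "minutes, seconds" wall-clock time into total seconds.
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    // Average load rate in MB per second for a run.
    static double throughputMBps(double sizeMB, int seconds) {
        return sizeMB / seconds;
    }

    // How many times faster a run is than the baseline (values > 1 mean faster).
    static double speedup(int baselineSeconds, int runSeconds) {
        return (double) baselineSeconds / runSeconds;
    }

    // Fraction of ideal linear scaling achieved when moving from a small
    // cluster to a large one; 1.0 means time halves when nodes double.
    static double scalingEfficiency(int smallNodes, int smallSeconds,
                                    int largeNodes, int largeSeconds) {
        double observed = (double) smallSeconds / largeSeconds;
        double ideal = (double) largeNodes / smallNodes;
        return observed / ideal;
    }

    public static void main(String[] args) {
        double sizeMB = 685.0;            // input file size from the original message
        int mrRun1 = toSeconds(13, 46);   // 826 s, first MR sample
        int mrRun2 = toSeconds(15, 3);    // 903 s, second MR sample
        int noMr = toSeconds(18, 4);      // 1084 s, same load without MR

        System.out.printf("MR run 1: %.2f MB/s, speedup %.2fx%n",
                throughputMBps(sizeMB, mrRun1), speedup(noMr, mrRun1));
        System.out.printf("MR run 2: %.2f MB/s, speedup %.2fx%n",
                throughputMBps(sizeMB, mrRun2), speedup(noMr, mrRun2));
        System.out.printf("No MR:    %.2f MB/s%n",
                throughputMBps(sizeMB, noMr));
    }
}
```

Once you have timings for the 4-node and 8-node runs, passing them to scalingEfficiency gives a single number you can compare across cluster sizes.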
I think the MR job will work much better with more nodes, as you will have more clients doing inserts into HBase in parallel, so insert throughput should increase as you scale out.

Just my 2 cents...

Tim

On Tue, Jun 23, 2009 at 3:38 PM, peterramesh<[email protected]> wrote:
>
> Hi,
>
> I playing with a sample program using Map Reduce (MR). All I have a text
> file(685 MB), and using it to create a HTable.
>
> The testing environment is,
> 1. single node cluster
> 2. 2 MB RAM
> 3. Hadoop and Hbase version, both are 0.19.1
>
> Here is the program attached,
> http://www.nabble.com/file/p24166190/MRTest.java MRTest.java
>
> and the hadoop-site.xml
> http://www.nabble.com/file/p24166190/hadoop-site.xml hadoop-site.xml
>
> and fair scheduler allocation file
> http://www.nabble.com/file/p24166190/mapred_fairseheduler_allocation_file.xml
> mapred_fairseheduler_allocation_file.xml
> (I had used the FairScheduler, since the mapred.map.tasks were not getting
> applied in the cluster instance, If I use JobQueueTaskScheduler (default),
> which always run 2 tasks at a time).
>
> On running the above program with the given configurations, it takes
> (13mins, 46sec and 15mins, 3sec respectively - 2 samples) to create the
> table.
>
> If the do the same stuffs without MR, it takes 18mins, 04sec. So, the MR
> gives me substantial gain. But, I would like to know, if there is better
> optimization to improve the performance and also am I doing the right?
>
> TIA,
> Ramesh
>
>
> --
> View this message in context:
> http://www.nabble.com/Map-Reduce-performance-tp24166190p24166190.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>
