Hi,

I am playing with a sample program that uses Map Reduce (MR). I have a
text file (685 MB) and am using it to create an HTable.

The testing environment is, 
1. single node cluster
2. 2 MB RAM 
3. Hadoop and HBase, both version 0.19.1

Here is the program attached, 
http://www.nabble.com/file/p24166190/MRTest.java MRTest.java 

and the hadoop-site.xml
http://www.nabble.com/file/p24166190/hadoop-site.xml hadoop-site.xml 

and fair scheduler allocation file
http://www.nabble.com/file/p24166190/mapred_fairseheduler_allocation_file.xml
mapred_fairseheduler_allocation_file.xml 
(I used the FairScheduler because the mapred.map.tasks setting was not
being applied on the cluster: with the default JobQueueTaskScheduler, only
2 tasks ever run at a time.)

Running the above program with the given configuration, it takes
13 min 46 sec and 15 min 3 sec (two sample runs) to create the table.

If I do the same thing without MR, it takes 18 min 4 sec. So MR gives me
a substantial gain. But I would like to know whether there are further
optimizations to improve the performance, and whether I am doing this the
right way.
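
To put numbers on that gain, here is a minimal sketch of the arithmetic
using only the timings and the 685 MB input size quoted above. The helper
name to_seconds and the variable names are just for illustration, and
averaging the two MR runs is an assumption about how to summarise them.

```python
def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) timing into total seconds."""
    return minutes * 60 + seconds

# Timings quoted in this message.
mr_runs = [to_seconds(13, 46), to_seconds(15, 3)]  # two MR sample runs
plain = to_seconds(18, 4)                          # same load without MR
input_mb = 685                                     # input file size in MB

# Assumption: summarise the two MR runs by their arithmetic mean.
mr_mean = sum(mr_runs) / len(mr_runs)

speedup = plain / mr_mean                        # how many times faster MR is
time_saved_pct = 100 * (plain - mr_mean) / plain # wall-clock time saved, in %
mr_throughput = input_mb / mr_mean               # MB per second with MR
plain_throughput = input_mb / plain              # MB per second without MR

print(f"MR mean time:      {mr_mean:.1f} s")
print(f"Speedup:           {speedup:.2f}x")
print(f"Time saved:        {time_saved_pct:.1f} %")
print(f"Throughput (MR):   {mr_throughput:.2f} MB/s")
print(f"Throughput (no MR): {plain_throughput:.2f} MB/s")
```

With only two MR samples, the result is a rough indication rather than a
proper benchmark.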

TIA,
Ramesh


-- 
View this message in context: 
http://www.nabble.com/Map-Reduce-performance-tp24166190p24166190.html
Sent from the HBase User mailing list archive at Nabble.com.
