Hi, I am playing with a sample program that uses Map Reduce (MR). I have a text file (685 MB) and am using it to create an HTable.
The testing environment is:

1. Single-node cluster
2. 2 MB RAM
3. Hadoop and HBase, both version 0.19.1

Attached are the program, MRTest.java (http://www.nabble.com/file/p24166190/MRTest.java), the hadoop-site.xml (http://www.nabble.com/file/p24166190/hadoop-site.xml), and the fair scheduler allocation file, mapred_fairseheduler_allocation_file.xml (http://www.nabble.com/file/p24166190/mapred_fairseheduler_allocation_file.xml).

I used the FairScheduler because mapred.map.tasks was not being applied in the cluster instance when I used the default JobQueueTaskScheduler, which always runs 2 tasks at a time.

Running the above program with the given configuration takes 13 min 46 sec and 15 min 3 sec (two samples) to create the table. Doing the same work without MR takes 18 min 4 sec. So MR saves me roughly 3 to 4 minutes per run. But I would like to know whether there are further optimizations that would improve the performance, and whether I am doing this the right way.

TIA,
Ramesh
