Re: How to make call to an external program in Hadoop?

2013-11-13 Thread inelu nagamallikarjuna
Hi, we can use ready-made classes from third-party NLP, text-mining, and other libraries in Java MapReduce, or we can use Python with Hadoop streaming to write more complex parallel code. This link has code for computing Pearson correlation:
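A minimal sketch of the Python plus Hadoop streaming approach, independent of the linked code, might look like this. It assumes each input line is a comma-separated "x,y" pair; the function names, the "pearson" key, and the "map"/"reduce" command-line argument are hypothetical choices for illustration. Each mapper emits partial sums for its split, and a single reducer combines them into the correlation coefficient.

```python
import math
import sys


def mapper(lines):
    """Emit one tab-separated record of partial sums for this input split."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x_str, y_str = line.split(",")  # assumed input format: "x,y"
        x, y = float(x_str), float(y_str)
        n += 1
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
    # A single shared key sends every partial result to the same reducer.
    yield "pearson\t" + ",".join(repr(v) for v in (n, sx, sy, sxx, syy, sxy))


def reducer(lines):
    """Add up the partial sums and return the Pearson correlation."""
    totals = [0.0] * 6
    for line in lines:
        _key, value = line.rstrip("\n").split("\t", 1)
        for i, part in enumerate(value.split(",")):
            totals[i] += float(part)
    n, sx, sy, sxx, syy, sxy = totals
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx * sx) * math.sqrt(n * syy - sy * sy)
    if denominator == 0:
        return float("nan")  # undefined when either variable has no variance
    return numerator / denominator


# Streaming runs the script as an external program that reads stdin and
# writes stdout; the role argument is an assumption of this sketch.
if len(sys.argv) > 1 and sys.argv[1] == "map":
    for record in mapper(sys.stdin):
        print(record)
elif len(sys.argv) > 1 and sys.argv[1] == "reduce":
    print(reducer(sys.stdin))
```

With Hadoop streaming, a script like this would be passed through the -mapper and -reducer options of the streaming jar, along with -input and -output paths. The same mechanism works for any external executable that reads records from stdin and writes them to stdout.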

Re: Unable to perform terasort for 50GB of data

2013-11-07 Thread inelu nagamallikarjuna
Hi, Check the disk usage of the individual data nodes with: hadoop dfsadmin -report. Also, override the config parameter mapred.local.dir so that intermediate data is stored in some path other than the /tmp directory. Don't use a single reducer; increase the number of reducers and use TotalOrderPartitioner. Thanks
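To see why a total-order partitioner lets several reducers share a sort, here is a small Python sketch of the underlying idea. It is not Hadoop's TotalOrderPartitioner implementation, and the function names are hypothetical. Split points are chosen from a sample of the keys, and each key is routed to the reducer that owns its range, so the sorted outputs of reducers 0, 1, 2, ... concatenate into one globally sorted result.

```python
import bisect


def choose_split_points(sample_keys, num_reducers):
    """Pick num_reducers - 1 boundary keys from a non-empty sample."""
    ordered = sorted(sample_keys)
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def partition(key, split_points):
    """Return the index of the reducer whose key range contains key."""
    # Keys below the first split point go to reducer 0, keys at or above
    # the last split point go to the last reducer.
    return bisect.bisect_right(split_points, key)
```

For example, with split points [250, 500, 750] and four reducers, key 100 goes to reducer 0 and key 900 to reducer 3. A representative sample keeps the ranges roughly balanced, which is what spreads the 50GB across reducers instead of funnelling it through one.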

Re: Hadoop Scheduling Algorithm

2013-04-21 Thread inelu nagamallikarjuna
Hi, In addition to the schedulers Sandy mentioned, there is one more scheduler called HOD (Hadoop on Demand). Please go through the following links for more details on the schedulers. HOD - http://hadoop.apache.org/docs/r1.1.2/hod_scheduler.html Fair - http://hadoop.apache.org/docs/r1.1.2/fair_scheduler.html Capacity -