Hi,

You can call classes from third-party libraries (NLP, text mining, and others) inside a Java MapReduce job, or you can use Python with Hadoop Streaming to write more complex parallel code.
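To give an idea of the Hadoop Streaming route, here is a minimal sketch of my own. It is not taken from any existing repository. It assumes each input line holds two tab-separated numbers, x and y; the file name pearson.py and the "pearson" key are just names I made up for illustration. The mapper emits partial sums and the reducer adds them up and computes the Pearson correlation.

```python
#!/usr/bin/env python
"""Sketch of a Hadoop Streaming job: Pearson correlation of two columns.

Assumption: every input record looks like "x<TAB>y" with numeric x and y.
"""
import math
import sys


def mapper(lines):
    """Emit one record of partial sums per valid input line, all under one key."""
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) != 2:
            continue  # skip malformed records
        try:
            x, y = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # skip non-numeric records
        # Columns after the key: n, sum(x), sum(y), sum(x^2), sum(y^2), sum(xy)
        yield "pearson\t1\t%r\t%r\t%r\t%r\t%r" % (x, y, x * x, y * y, x * y)


def reducer(lines):
    """Add up the partial sums and return r, or None if r is undefined."""
    totals = [0.0] * 6
    for line in lines:
        parts = line.strip().split("\t")
        for i, value in enumerate(parts[1:7]):
            totals[i] += float(value)
    n, sx, sy, sxx, syy, sxy = totals
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    if var_x <= 0 or var_y <= 0:
        return None  # no data, or one column is constant
    return (n * sxy - sx * sy) / (math.sqrt(var_x) * math.sqrt(var_y))


# Run as "pearson.py map" or "pearson.py reduce"; Hadoop Streaming feeds stdin.
if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "map":
        for record in mapper(sys.stdin):
            print(record)
    elif sys.argv[1] == "reduce":
        print(reducer(sys.stdin))
```

You would pass this to the streaming jar with the usual -input, -output, -mapper "python pearson.py map", -reducer "python pearson.py reduce" and -file pearson.py options; the location of the streaming jar depends on your installation. Because every record shares one key, all the partial sums go to a single reducer, which is fine here since each mapper record is tiny. The mapper and reducer are ordinary processes reading stdin and writing stdout, so the same pattern works for any external program.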
This link has code for computing Pearson correlation:
https://github.com/malli3131/HadoopTutorial/tree/master/Mapreduce/Programs/Pearson

Thanks

On Sat, Nov 9, 2013 at 12:40 AM, Tony Wang <ivyt...@gmail.com> wrote:

> So far, I only know that Hadoop can do counting. I am wondering if there's
> any way to make calls to an external program for more complex processing
> than counting in hadoop. Is there any example? thanks
>
> tony
>

--
Thanks and Regards
Nagamallikarjuna