Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by DevarajDas:
http://wiki.apache.org/hadoop/PoweredBy

------------------------------------------------------------------------------
   * Using Hadoop on EC2 to process documents from a continuous web crawl and distributed training of support vector machines
   * Using HDFS for large archival data storage
+  * [http://www.psgtech.edu/department/cse/newcse/index.htm] - PSG Tech, Coimbatore, India
+   * Multiple alignment of protein sequences helps to determine evolutionary linkages and to predict molecular structures. The dynamic nature of the algorithm, coupled with the data and compute parallelism of Hadoop data grids, improves the accuracy and speed of sequence alignment. Parallelism at the sequence and block level reduces the time complexity of MSA problems. The scalable nature of Hadoop makes it apt for solving large-scale alignment problems.
+   * Our cluster size varies from 5 to 10 nodes. Cluster nodes range from 2950 Quad Core Rack Servers with 2x6MB cache and 4 x 500 GB SATA hard drives to E7200 / E7400 processors with 4 GB RAM and 160 GB HDD.
+
  * [http://www.quantcast.com/ Quantcast]
   * 3000 cores, 3500TB. 1PB+ processing each day.
   * Hadoop scheduler with fully custom data path / sorter
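The PSG Tech entry above describes splitting a multiple sequence alignment across a Hadoop cluster at the block level. As a rough illustration of that idea only (not code from the PoweredBy page or from PSG Tech), the sketch below lays out block-level parallelism as a minimal MapReduce job: each mapper aligns one block of sequences locally, and a single reduce step merges the per-block results. All names in it (BlockAlignmentJob, alignBlock, the tab-separated input record layout) are hypothetical placeholders.

// A minimal sketch, assuming one input record per block of sequences,
// formatted as "blockId<TAB>seq1,seq2,...". Each mapper aligns its block;
// the reducer merges the per-block alignments.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BlockAlignmentJob {

  public static class BlockMapper extends Mapper<Object, Text, Text, Text> {
    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      // Split the record into the block id and its sequences.
      String[] parts = value.toString().split("\t", 2);
      if (parts.length < 2) {
        return; // skip malformed records
      }
      // alignBlock is a hypothetical local aligner for one block.
      String aligned = alignBlock(parts[1]);
      // A constant key sends all per-block results to one reducer for merging.
      context.write(new Text("alignment"), new Text(parts[0] + ":" + aligned));
    }

    private String alignBlock(String sequences) {
      // Placeholder: a real implementation would run a progressive or
      // pairwise alignment over the sequences in this block.
      return sequences;
    }
  }

  public static class MergeReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      // Merge per-block alignments; here they are simply concatenated.
      StringBuilder merged = new StringBuilder();
      for (Text v : values) {
        merged.append(v.toString()).append(';');
      }
      context.write(key, new Text(merged.toString()));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "block multiple sequence alignment");
    job.setJarByClass(BlockAlignmentJob.class);
    job.setMapperClass(BlockMapper.class);
    job.setReducerClass(MergeReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The only point of the sketch is the partitioning: the alignment work scales out with the number of blocks handled in parallel by mappers, while the final merge step stays comparatively small.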
