Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by TedDunning:
http://wiki.apache.org/hadoop/PoweredBy

------------------------------------------------------------------------------
   * [http://www.weblab.infosci.cornell.edu/ Cornell University Web Lab]
    * Generating web graphs on 100 nodes (dual 2.4GHz Xeon Processor, 2 GB RAM, 72GB Hard Drive)
+ 
+  * [http://www.deepdyve.com Deepdyve]
+   * Elastic cluster with 5-80 nodes
+   * We use hadoop to create our indexes of deep web content and to provide a high availability and high bandwidth storage service for index shards for our search cluster.
 
   * [http://www.enormo.com/ Enormo]
    * 4 nodes cluster (32 cores, 1TB).

@@ -192, +196 @@

   * We use hadoop to process data relating to people on the web
   * We are also involved with Cascading to help simplify how our data flows through various processing stages
 
+  * [http://code.google.com/p/redpoll/ Redpoll]
+   * Hardware: 35 nodes (2*4cpu 10TB disk 16GB RAM each)
+   * We intend to parallelize some traditional classification and clustering algorithms like Naive Bayes, K-Means, and EM so that they can deal with large-scale data sets.
+ 
   * [http://alpha.search.wikia.com Search Wikia]
    * A project to help develop open source social search tools. We run a 125 node hadoop cluster.

@@ -265, +273 @@

   * 10 node cluster (Dual-Core AMD Opteron 2210, 4GB RAM, 1TB/node storage)
   * Run Naive Bayes classifiers in parallel over crawl data to discover event information
 
-  * [http://code.google.com/p/redpoll/ Redpoll]
-   * Hardware: 35 nodes (2*4cpu 10TB disk 16GB RAM each)
-   * We intent to parallelize some traditional classification, clustering algorithms like Naive Bayes, K-Means, EM so that can deal with large-scale data sets.
 
 ''When applicable, please include details about your cluster hardware and size.''
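For readers curious what the Redpoll entry in the diff above is describing, here is a minimal, illustrative sketch of one K-Means iteration expressed as a Hadoop MapReduce job. This is not Redpoll's code: the point format (comma-separated doubles), the Configuration key "kmeans.centroids", and all class names are assumptions made purely for the example.

{{{
// Hypothetical sketch of one K-Means iteration on Hadoop MapReduce.
// Assumes: input lines are comma-separated doubles, and the current centroids
// are passed in the job Configuration as "x1,y1;x2,y2;..." under the
// (made-up) key "kmeans.centroids".
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class KMeansIteration {

  /** Map step: assign each point to the nearest current centroid. */
  public static class AssignMapper extends Mapper<Object, Text, IntWritable, Text> {
    private double[][] centroids;

    @Override
    protected void setup(Context context) {
      String[] rows = context.getConfiguration().get("kmeans.centroids").split(";");
      centroids = new double[rows.length][];
      for (int i = 0; i < rows.length; i++) {
        String[] parts = rows[i].split(",");
        centroids[i] = new double[parts.length];
        for (int j = 0; j < parts.length; j++) {
          centroids[i][j] = Double.parseDouble(parts[j]);
        }
      }
    }

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] parts = value.toString().split(",");
      double[] point = new double[parts.length];
      for (int i = 0; i < parts.length; i++) {
        point[i] = Double.parseDouble(parts[i]);
      }
      // Find the centroid with the smallest squared Euclidean distance.
      int best = 0;
      double bestDist = Double.MAX_VALUE;
      for (int c = 0; c < centroids.length; c++) {
        double d = 0;
        for (int i = 0; i < point.length; i++) {
          double diff = point[i] - centroids[c][i];
          d += diff * diff;
        }
        if (d < bestDist) {
          bestDist = d;
          best = c;
        }
      }
      context.write(new IntWritable(best), value);
    }
  }

  /** Reduce step: recompute each centroid as the mean of its assigned points. */
  public static class RecomputeReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      double[] sum = null;
      long count = 0;
      for (Text v : values) {
        String[] parts = v.toString().split(",");
        if (sum == null) {
          sum = new double[parts.length];
        }
        for (int i = 0; i < parts.length; i++) {
          sum[i] += Double.parseDouble(parts[i]);
        }
        count++;
      }
      StringBuilder out = new StringBuilder();
      for (int i = 0; i < sum.length; i++) {
        if (i > 0) out.append(',');
        out.append(sum[i] / count);
      }
      context.write(key, new Text(out.toString()));
    }
  }
}
}}}

A driver program would run this job repeatedly, feeding each iteration's reducer output back in as the next iteration's centroids, until the centroids stop moving (or a fixed iteration limit is reached).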
