Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by GaryHelmling:
http://wiki.apache.org/hadoop/Hbase/PoweredBy

------------------------------------------------------------------------------
  [http://gumgum.com GumGum] is an analytics and monetization platform for 
online content. We've developed usage-based licensing models that make the best 
content in the world accessible to publishers of all sizes.  We use HBase 
0.20.0 on a 4-node Amazon EC2 cluster to record visits to advertisers in our ad 
network. Our production cluster has been running since July 2009.
  
  [http://www.mahalo.com Mahalo], "...the world's first human-powered search 
engine". All the markup that powers the wiki is stored in HBase; it's been in 
use for a few months now. !MediaWiki - the same software that powers Wikipedia 
- has version/revision control. Mahalo's in-house editors produce a lot of 
revisions per day, which did not work well in an RDBMS. An HBase-based 
solution was built and tested, and the data was migrated out of MySQL and into 
HBase. The table currently holds roughly 6 million items. The upload tool runs 
hourly from a shell script to back up that data; on 6 nodes it takes about 
5-10 minutes and does not slow down production at all.
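
  For illustration only, a minimal sketch of one way !MediaWiki-style revision 
storage maps onto HBase's built-in cell versioning. The "wiki" table, "content" 
family, and "markup" qualifier are assumptions for the example, not Mahalo's 
actual schema:

{{{#!java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class WikiRevisions {
  private static final byte[] CONTENT = Bytes.toBytes("content"); // assumed family
  private static final byte[] MARKUP = Bytes.toBytes("markup");   // assumed qualifier

  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "wiki"); // assumed table

    // Each save writes a new timestamped version of the markup cell; HBase
    // keeps multiple versions per cell, so revision history falls out of the
    // storage model rather than piling up revision rows in an RDBMS.
    Put put = new Put(Bytes.toBytes("Some_Page_Title"));
    put.add(CONTENT, MARKUP, Bytes.toBytes("== Latest revision of the page =="));
    table.put(put);

    // Fetch the last 10 revisions of the page in a single call.
    Get get = new Get(Bytes.toBytes("Some_Page_Title"));
    get.setMaxVersions(10);
    Result result = table.get(get);
    for (KeyValue kv : result.raw()) {
      System.out.println(kv.getTimestamp() + ": " + Bytes.toString(kv.getValue()));
    }
  }
}
}}}

  (This assumes the "content" family was created with a VERSIONS setting large 
enough to retain the history wanted.)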
+ 
+ [http://www.meetup.com Meetup] is on a mission to help the world’s people 
self-organize into local groups. We use Hadoop and HBase to power a site-wide, 
real-time activity feed system for all of our members and groups. Group 
activity is written directly to HBase and indexed per member, with each 
member's custom feed served directly from HBase for incoming requests. We're 
running HBase 0.20.0 on an 11-node cluster.
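
+ For illustration only, a minimal sketch of one way such a per-member feed 
could be laid out. The "member_feeds" table, "activity" family, and row-key 
layout are assumptions for the example, not Meetup's actual schema:

{{{#!java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class MemberFeed {
  private static final byte[] ACTIVITY = Bytes.toBytes("activity"); // assumed family

  // Row key: member id followed by a reversed timestamp, so a member's newest
  // activity sorts first and the head of the feed is a short scan.
  static byte[] feedKey(long memberId, long when) {
    return Bytes.add(Bytes.toBytes(memberId), Bytes.toBytes(Long.MAX_VALUE - when));
  }

  public static void main(String[] args) throws Exception {
    HTable feeds = new HTable(new HBaseConfiguration(), "member_feeds"); // assumed table

    // Write path: each group event is written once per interested member.
    Put put = new Put(feedKey(42L, System.currentTimeMillis()));
    put.add(ACTIVITY, Bytes.toBytes("event"), Bytes.toBytes("new RSVP in NY Tech Meetup"));
    feeds.put(put);

    // Read path: serving member 42's feed is a scan bounded by that member's
    // key range (start: newest entry for member 42; stop: exclusive bound at
    // the first possible key for member 43).
    Scan scan = new Scan(feedKey(42L, Long.MAX_VALUE), feedKey(43L, Long.MAX_VALUE));
    ResultScanner scanner = feeds.getScanner(scan);
    for (Result row : scanner) {
      System.out.println(Bytes.toString(row.getValue(ACTIVITY, Bytes.toBytes("event"))));
    }
    scanner.close();
  }
}
}}}

+ Reversing the timestamp in the row key means the newest activity has the 
smallest key within a member's range, so the feed head comes back first 
without any client-side sorting.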
  
  [http://www.openplaces.org Openplaces] is a search engine for travel that 
uses HBase to store terabytes of web pages and travel-related entity records 
(countries, cities, hotels, etc.). We have dozens of MapReduce jobs that 
crunch data daily. We use a 20-node cluster for development, a 40-node cluster 
for offline production processing, and an EC2 cluster for the live web site.
  
