[Hadoop Wiki] Trivial Update of "Hbase/PoweredBy" by stack

Apache Wiki Mon, 11 May 2009 14:53:04 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/PoweredBy

The comment on the change is:
Removed wikia -- no longer around.

------------------------------------------------------------------------------
  
  [http://www.videosurf.com/ VideoSurf] - "The video search engine that has 
taught computers to see". We're using HBase to persist various large graphs of 
data and other statistics. HBase was a real win for us because it let us store 
substantially larger datasets without the need to manually partition the 
data, and its column-oriented nature allowed us to create schemas that were 
substantially more efficient for storing and retrieving data.
  
- [http://www.wikia.com/wiki/Wikia Wikia] hosts its user and keyword databases 
on a cluster of 7 machines.
- 
  [http://www.worldlingo.com/ WorldLingo] - The !WorldLingo Multilingual 
Archive. We use HBase to store millions of documents that we scan using 
Map/Reduce jobs to machine translate them into all or selected target languages 
from our set of available machine translation languages. We currently store 12 
million documents but plan to eventually reach the 450 million mark. HBase 
allows us to scale out as we need to grow our storage capacities. Combined with 
Hadoop to keep the data replicated and therefore fail-safe, we have the backbone 
our service can rely on now and in the future. 
  
  [http://www.yahoo.com/ Yahoo!] uses HBase to store document fingerprints for 
detecting near-duplicates. We have a cluster of a few nodes that runs HDFS, 
MapReduce, and HBase. The table contains millions of rows. We use this for 
querying duplicated documents with realtime traffic.

