Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hbase/PoweredBy" page has been changed by BradfordStephens:
http://wiki.apache.org/hadoop/Hbase/PoweredBy?action=diff&rev1=33&rev2=34

  [[http://www.adobe.com|Adobe]] - We currently have about 30 nodes running 
HDFS, Hadoop and HBase in clusters ranging from 5 to 14 nodes on both 
production and development. We plan a deployment on an 80-node cluster. We are 
using HBase in several areas, from social services to structured data and 
processing for internal use. We constantly write data to HBase and run 
MapReduce jobs to process it, then store the results back in HBase or in 
external systems. Our production cluster has been running since Oct 2008.
  
  [[http://www.flurry.com|Flurry]] provides mobile application analytics.  We 
use HBase and Hadoop for all of our analytics processing, and serve all of our 
live requests directly out of HBase on our production cluster with billions of 
rows over several tables.
+ 
+ [[http://www.drawntoscaleconsulting.com|Drawn to Scale Consulting]] consults 
on HBase, Hadoop, distributed search, and scalable architectures.
  
  [[http://gumgum.com|GumGum]] is an analytics and monetization platform for 
online content. We've developed usage-based licensing models that make the best 
content in the world accessible to publishers of all sizes.  We use HBase 
0.20.0 on a 4-node Amazon EC2 cluster to record visits to advertisers in our ad 
network. Our production cluster has been running since July 2009.
  
@@ -30, +32 @@

  
  [[http://www.videosurf.com/|VideoSurf]] - "The video search engine that has 
taught computers to see". We're using HBase to persist various large graphs of 
data and other statistics. HBase was a real win for us because it let us store 
substantially larger datasets without the need to manually partition the data, 
and its column-oriented nature allowed us to create schemas that were 
substantially more efficient for storing and retrieving data.
  
+ [[http://www.visibletechnologies.com/|Visible Technologies]] - We use Hadoop, 
HBase, Katta, and more to collect, parse, store, and search hundreds of 
millions of pieces of social media content. We get incredibly fast throughput 
and very low latency on commodity hardware. HBase enables our business to exist.
+ 
  [[http://www.worldlingo.com/|WorldLingo]] - The !WorldLingo Multilingual 
Archive. We use HBase to store millions of documents that we scan with 
Map/Reduce jobs in order to machine-translate them into all or selected target 
languages from our set of available machine translation languages. We currently 
store 12 million documents but plan to eventually reach the 450 million mark. 
HBase allows us to scale out as we need to grow our storage capacity. Combined 
with Hadoop, which keeps the data replicated and therefore fail-safe, we have 
the backbone our service can rely on now and in the future. !WorldLingo has 
been using HBase since December 2007 and is, along with a few others, one of 
the longest-running HBase installations. Currently we are running the latest 
HBase 0.20 and serving directly from it: 
[[http://www.worldlingo.com/ma/enwiki/en/HBase|MultilingualArchive]].
  
  [[http://www.yahoo.com/|Yahoo!]] uses HBase to store document fingerprints 
for detecting near-duplicate documents. We have a cluster of a few nodes that 
runs HDFS, MapReduce, and HBase. The table contains millions of rows. We use 
this to query for duplicate documents with realtime traffic.
