No, Todd was not specifying some kind of minimum. The point was that the more spindles, the better for an I/O-parallel architecture like HDFS and BigTable. Have you read the BigTable paper?
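To put rough numbers on the spindle point (a back-of-the-envelope sketch; the ~90 random IOPS per 7200RPM SATA disk figure is an assumed rule of thumb, not something anyone specified in this thread):

// Aggregate random IOPS per node grows roughly linearly with spindle count.
// The per-disk figure below is an assumption, not a measured value.
public class SpindleMath {
    public static void main(String[] args) {
        final int assumedIopsPerDisk = 90; // rule of thumb for 7200RPM SATA
        for (int spindles = 4; spindles <= 12; spindles += 4) {
            System.out.println(spindles + " spindles ~= "
                + (spindles * assumedIopsPerDisk) + " random IOPS per node");
        }
    }
}

Four spindles works out to ~360 IOPS, which is where Robert's figure below comes from.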
- Andy

> From: Gibbon, Robert, VF-Group
> Subject: RE: Using HBase on other file systems
>
> Todd, thanks for replying. 4x 7200RPM spindles and no RAID =
> approx 360 IOPS to/from the backend storage, minimum and per
> node, to run an HBase cluster.
>
> Right?
>
> cheers
> Robert
>
> -----Original Message-----
> From: Todd Lipcon [mailto:t...@cloudera.com]
> Sent: Sat 5/15/2010 3:51 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Using HBase on other file systems
>
> On Fri, May 14, 2010 at 2:15 PM, Gibbon, Robert, VF-Group <
> robert.gib...@vodafone.com> wrote:
>
> > Hmm. What level of IOPS does HBase need in order to support a
> > reasonably responsive level of service? How much latency in
> > transfer times is acceptable before the nodes start to fail?
> > Do you use asynchronous I/O queueing? Write-through caching?
> > Prefetching?
>
> Hi Robert. Have you read the Bigtable paper? It's a good
> description of the general I/O architecture of BigTable. You can
> also read the original paper on log-structured merge tree
> storage from back in the 90s.
>
> To answer your questions in brief:
> - Typical clusters run on between 4 and 12x 7200RPM SATA disks.
>   Some people run on 10k disks to get more random reads per
>   second, but that's not necessary.
> - Latency in transfer times is a matter of what your application
>   needs, not a matter of what HBase needs.
> - No, we do not asynchronously queue reads - AIO support is
>   lacking in Java 6, and even in the current previews of Java 7
>   it is a thin wrapper around threadpools and synchronous I/O
>   APIs.
> - HBase uses log-structured storage, which is somewhat similar
>   to write-through caching. We never do random writes (in fact
>   they're impossible in HDFS).
>
> -Todd
>
> > On Fri, May 14, 2010 at 12:02 PM, Gibbon, Robert, VF-Group <
> > robert.gib...@vodafone.com> wrote:
> >
> > > My thinking is around separation of concerns - at an OU
> > > level, not just at a system integration level. Walrus gives
> > > me a consistent, usable abstraction layer to transparently
> > > substitute the storage implementation - for example from
> > > symmetrix <--> isilon or anything in between. Walrus is
> > > storage subsystem agnostic, so it need not be configured for
> > > inconsistency like the Amazon service it emulates.
> > >
> > > Tight coupling for lock-in is a great commercial technique
> > > often seen with suppliers. But it is a bad one. Very bad.
> >
> > However, reasonably tight coupling between a database (HBase)
> > and its storage layer (HDFS) is IMHO absolutely necessary to
> > achieve a certain level of correctness and performance. In
> > HBase's case we use the Hadoop FileSystem interface, so in
> > theory it will work on anything that implements said
> > interface, but I wouldn't run a production instance on
> > anything but HDFS.
> >
> > It's worth noting that most commercial databases operate on
> > direct block devices rather than on top of filesystems, so
> > that they don't have to deal with the varying
> > semantics/performance of ext3, ext4, xfs, ufs, and the myriad
> > other single-node filesystems that exist.
> >
> > -Todd
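To make Todd's FileSystem point just above concrete, a minimal sketch (the namenode URI below is hypothetical, and HBase's real wiring goes through hbase.rootdir and a lot more machinery than this):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// HBase reaches storage only through Hadoop's FileSystem abstraction, so in
// theory any implementation registered for a URI scheme (hdfs://, file://,
// s3://, ...) can sit underneath it.
public class FsAbstractionSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The scheme of the path selects the FileSystem implementation.
        Path root = new Path("hdfs://namenode:8020/hbase"); // hypothetical URI
        FileSystem fs = root.getFileSystem(conf);
        System.out.println("Backing FileSystem: " + fs.getUri());
    }
}

Swapping the URI scheme swaps the backend without touching HBase code, which is why it works "in theory" on anything, and also why only HDFS sees serious production testing.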
> >
> > > -----Original Message-----
> > > From: Andrew Purtell [mailto:apurt...@apache.org]
> > > Sent: Thu 5/13/2010 11:54 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: RE: Using HBase on other file systems
> > >
> > > You really want to run HBase backed by Eucalyptus' Walrus?
> > > What do you have behind that?
> > >
> > > > From: Gibbon, Robert, VF-Group
> > > > Subject: RE: Using HBase on other file systems
> > > [...]
> > > > NB. I checked out running HBase over Walrus (an AWS S3
> > > > clone): bork - you want me to file a Jira on that?
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
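As a footnote to Todd's log-structured storage point above, a toy sketch of the append-only idea (illustrative only, not HBase code):

import java.util.ArrayList;
import java.util.List;

// In a log-structured store every write is an append; an "update" is just a
// newer entry, and reads take the latest one. No random writes are ever
// needed, which is why an append-only filesystem like HDFS can host it.
public class ToyAppendOnlyStore {
    private final List<String[]> log = new ArrayList<>(); // append-only log

    void put(String key, String value) {
        log.add(new String[] { key, value }); // never overwrite in place
    }

    String get(String key) {
        String latest = null;
        for (String[] entry : log) {          // last append wins
            if (entry[0].equals(key)) latest = entry[1];
        }
        return latest;
    }

    public static void main(String[] args) {
        ToyAppendOnlyStore store = new ToyAppendOnlyStore();
        store.put("row1", "v1");
        store.put("row1", "v2"); // an overwrite is just another append
        System.out.println(store.get("row1")); // prints v2
    }
}

A real LSM tree adds sorted flushes and compactions on top, but the write path stays append-only all the way down.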