Some projects sacrifice stability and manageability for performance (see, e.g., http://gluster.org/pipermail/gluster-users/2009-October/003193.html).
On Wed, May 12, 2010 at 11:15 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Wed, May 12, 2010 at 1:30 PM, Andrew Purtell <apurt...@apache.org> wrote:
>
> > Before recommending Gluster I suggest you set up a test cluster and then
> > randomly kill bricks.
> >
> > Also as pointed out in another mail, you'll want to colocate TaskTrackers
> > on Gluster bricks to get I/O locality, yet there is no way for Gluster to
> > export stripe locations back to Hadoop.
> >
> > It seems a poor choice.
> >
> >   - Andy
> >
> > > From: Edward Capriolo
> > > Subject: Re: Using HBase on other file systems
> > > To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
> > > Date: Wednesday, May 12, 2010, 6:38 AM
> > >
> > > On Tuesday, May 11, 2010, Jeff Hammerbacher <ham...@cloudera.com> wrote:
> > >
> > > > Hey Edward,
> > > >
> > > >> I do think that if you compare GoogleFS to HDFS, GFS looks more full
> > > >> featured.
> > > >
> > > > What features are you missing? Multi-writer append was explicitly
> > > > called out by Sean Quinlan as a bad idea, and rolled back. From
> > > > internal conversations with Google engineers, erasure coding of blocks
> > > > suffered a similar fate. Native client access would certainly be nice,
> > > > but FUSE gets you most of the way there. Scalability/availability of
> > > > the NN, RPC QoS, and alternative block placement strategies are
> > > > second-order features which didn't exist in GFS until later in its
> > > > lifecycle of development as well. HDFS is following a similar path and
> > > > has JIRA tickets with active discussions. I'd love to hear your
> > > > feature requests, and I'll be sure to translate them into JIRA
> > > > tickets.
> > > >
> > > >> I do believe my logic is reasonable. HBase has a lot of code designed
> > > >> around HDFS. We know these tickets that get cited all the time, for
> > > >> better random reads, or for sync() support. HBase gets the benefits
> > > >> of HDFS and has to deal with its drawbacks. Other key value stores
> > > >> handle storage directly.
> > > >
> > > > Sync() works and will be in the next release, and its absence was
> > > > simply a result of the youth of the system. Now that that limitation
> > > > has been removed, please point to another place in the code where
> > > > using HDFS rather than the local file system is forcing HBase to make
> > > > compromises. Your initial attempts on this front (caching, HFile,
> > > > compactions) were, I hope, debunked by my previous email. It's also
> > > > worth noting that Cassandra does all three, despite managing its own
> > > > storage.
> > > >
> > > > I'm trying to learn from this exchange and always enjoy understanding
> > > > new systems. Here's what I have so far from your arguments:
> > > > 1) HBase inherits both the advantages and disadvantages of HDFS. I
> > > > clearly agree on the general point; I'm pressing you to name some
> > > > specific disadvantages, in hopes of helping prioritize our development
> > > > of HDFS. So far, you've named things which are either a) not actually
> > > > disadvantages or b) no longer true. If you can come up with the
> > > > disadvantages, we'll certainly take them into account. I've certainly
> > > > got a number of them on our roadmap.
> > > > 2) If you don't want to use HDFS, you won't want to use HBase. Also
> > > > certainly true, but I'm not sure there's much to learn from this
> > > > assertion. I'd once again ask: why would you not want to use HDFS, and
> > > > what is your choice in its stead?
> > > >
> > > > Thanks,
> > > > Jeff
> > >
> > > Jeff,
> > >
> > > Let me first mention that you have described some things as fixed that
> > > are only fixed in trunk. I consider trunk futureware and I do not like
> > > to have temporal conversations. Even when trunk becomes current there is
> > > no guarantee that the entire problem is solved. After all, appends were
> > > fixed in 0.19, or not, or again?
> > >
> > > I rescanned the GFS white paper to support my argument that HDFS is
> > > stripped down. Found:
> > > Writes at offset ARE supported
> > > Checkpoints
> > > Application level checkpoints
> > > Snapshot
> > > Shadow read-only master
> > >
> > > HDFS chose the features it wanted and ignored others; that is why I
> > > called it a pure map-reduce implementation.
> > >
> > > My main point is that HBase by nature needs high-speed random reads and
> > > random writes, and HDFS by nature is bad at these things. If you cannot
> > > keep a high cache hit rate via a large in-RAM block cache, HBase is
> > > going to slam HDFS doing large block reads for small parts of files.
> > >
> > > So you ask me what I would use instead. I do not think there is a viable
> > > alternative in the 100 TB and up range, but I do think for people in the
> > > 20 TB range something like Gluster that is very performance focused
> > > might deliver amazing results in some applications.
>
> I did not recommend anything.
>
> "people in the 20 TB range something like Gluster that is very performance
> focused might deliver amazing results in some applications."
>
> I used words like "something. like. might."
>
> It may just be an interesting avenue of research.
>
> And since you mentioned:
>
> "also as pointed out in another mail, you'll want to colocate TaskTrackers
> on Gluster bricks to get I/O locality, yet there is no way for Gluster to
> export stripe locations back to Hadoop."
>
> 1) I am sure that if someone was so inclined they could find a way to
> export that information from Gluster.
>
> 2) I think you meant DataNode, not TaskTracker. In any case, I remember
> reading on-list that a RegionServer is not guaranteed to be colocated with
> a DataNode, especially after a restart. Someone was going to open a ticket
> for it.
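
To make the "exporting stripe locations back to Hadoop" point concrete: Hadoop (and therefore HBase and the MapReduce schedulers) learns data locality through FileSystem#getFileBlockLocations, which HDFS answers from NameNode block metadata. A filesystem that cannot answer that call usefully falls back to a single "localhost" location, and locality-aware scheduling is lost. The sketch below only illustrates where such a hook would live; GlusterFileSystem and lookupStripeHosts are hypothetical names, not an existing Gluster integration.

    // Hypothetical sketch: how an alternative Hadoop FileSystem could surface
    // data locality. Only getFileBlockLocations matters here; the class name
    // and the lookupStripeHosts() hook are made up for illustration.
    import java.io.IOException;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public abstract class GlusterFileSystem extends FileSystem {

      @Override
      public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len)
          throws IOException {
        // The default FileSystem implementation reports one "localhost"
        // location, which gives the scheduler no real locality information.
        String[] hosts = lookupStripeHosts(file.getPath(), start, len);
        return new BlockLocation[] {
            new BlockLocation(new String[0], hosts, start, len)
        };
      }

      /** Hypothetical hook that would ask Gluster which bricks hold this byte range. */
      protected abstract String[] lookupStripeHosts(Path path, long start, long len)
          throws IOException;
    }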
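
Jeff's remark that "Sync() works and will be in the next release" is about the flush guarantee HBase's write-ahead log depends on for durability. Below is a minimal sketch of that filesystem-level call, not HBase code, assuming a stock Hadoop client on the classpath; the path and record are invented, and on the 0.20-append era branches discussed in this thread the method was spelled sync() rather than hflush().

    // Minimal sketch of the flush call a write-ahead log relies on.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WalFlushSketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/wal-flush-sketch"));
        out.writeBytes("put row1 cf:col value\n");
        // hflush() makes the bytes visible to new readers even though the
        // block is still open; it does not promise the data is on disk
        // (that stronger guarantee is hsync() on newer Hadoop).
        out.hflush();
        out.close();
        fs.close();
      }
    }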
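
Edward's cache-miss argument ("HBase is going to slam HDFS doing large block reads for small parts of files") can be put into rough numbers: a block-cache miss pulls an entire HFile block, 64 KB by default, to serve a value that may be only a few hundred bytes. The figures below are assumptions for illustration, not measurements.

    // Back-of-the-envelope read amplification on a block-cache miss.
    // 64 KB is the default HFile block size; the 200-byte cell is made up.
    public class ReadAmplificationSketch {
      public static void main(String[] args) {
        long hfileBlockBytes = 64 * 1024;
        long cellBytes = 200;
        System.out.printf(
            "A cache miss reads %d bytes from the filesystem to serve %d bytes"
                + " (~%dx amplification).%n",
            hfileBlockBytes, cellBytes, hfileBlockBytes / cellBytes);
      }
    }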