+1
On 1/22/10 3:24 AM, "Andrew Purtell" <apurt...@apache.org> wrote:
> +1
>
> This makes an observed, big difference to the stability of small/test
> clusters.
>
> I second Ryan's specific point about the stability of small clusters
> being important.
>
>   - Andy
>
>
> On Thu Jan 21st, 2010 2:46 PM PST Ryan Rawson wrote:
>
>> Scaling _down_ is a continual problem for us, and this is one of the
>> prime factors. It puts a bad taste in the mouth of new people, who then
>> run away from HBase and HDFS because they seem "unreliable and
>> unstable". It is perfectly within scope to support a cluster of about
>> 5-6 machines, which can have an aggregate capacity of 24TB (a fair
>> amount), and people expect to start small, prove the
>> concept/technology, then move up.
>>
>> I am also +1.
>>
>> On Thu, Jan 21, 2010 at 2:36 PM, Stack <st...@duboce.net> wrote:
>>> I'd like to propose a new vote on having hdfs-630 committed to 0.21.
>>> The first vote on this topic, initiated 12/14/2009, was sunk by
>>> improvements suggested by Tsz Wo (Nicholas) Sze. Those suggestions
>>> have since been folded into a new version of the hdfs-630 patch. It's
>>> this new version of the patch -- 0001-Fix-HDFS-630-0.21-svn-2.patch --
>>> that I'd like us to vote on. For background on why we -- the hbase
>>> community -- think hdfs-630 is important, see the notes below from the
>>> original call to vote.
>>>
>>> I'm obviously +1.
>>>
>>> Thanks for your consideration,
>>> St.Ack
>>>
>>> P.S. Regarding TRUNK: after chatting with Nicholas, TRUNK was cleaned
>>> of the previous versions of hdfs-630 and we'll likely apply
>>> 0001-Fix-HDFS-630-trunk-svn-4.patch, a version of
>>> 0001-Fix-HDFS-630-0.21-svn-2.patch that works for TRUNK and includes
>>> Nicholas's suggestions.
>>>
>>>
>>> On Mon, Dec 14, 2009 at 9:56 PM, stack <st...@duboce.net> wrote:
>>>> I'd like to propose a vote on having hdfs-630 committed to 0.21 (it's
>>>> already been committed to TRUNK).
>>>>
>>>> hdfs-630 has the dfsclient pass the namenode the names of datanodes
>>>> it has determined are dead, e.g. because it got a failed connection
>>>> when it tried to contact them. This is useful in the interval between
>>>> a datanode dying and the namenode noticing it is gone. Without this
>>>> fix, the namenode can often give out the dead datanode as a host for
>>>> a block. If the cluster is small -- fewer than 5 or 6 nodes -- it is
>>>> very likely the namenode will give out the dead datanode as a block
>>>> host.
>>>>
>>>> Small clusters are common in hbase, especially when folks are
>>>> starting out or evaluating hbase. They'll start with three or four
>>>> nodes carrying both datanodes and hbase regionservers. They'll
>>>> experiment with killing one of the slaves -- datanode and
>>>> regionserver -- and watch what happens. What follows is a struggling
>>>> dfsclient trying to create replicas when one of the datanodes passed
>>>> to it by the namenode is dead. The DFSClient will fail, go back to
>>>> the namenode again, and so on (see
>>>> https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
>>>> blow-by-blow). HBase operations are held up during this time, and
>>>> eventually a regionserver will shut itself down to protect against
>>>> data loss if it can't successfully write to HDFS.
>>>>
>>>> Thanks all,
>>>> St.Ack
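
For anyone who wants the shape of the fix without reading the patch, here is a
minimal, illustrative Java sketch. It is not the actual DFSClient/NameNode
code; the class and method names below are made up for illustration. It only
shows the idea behind hdfs-630: the client remembers datanodes it failed to
reach and hands that exclude list back to the namenode, so block targets are
chosen only from nodes the client can actually talk to, instead of waiting for
the namenode's own dead-node timeout.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: mimics the hdfs-630 behaviour, not the real Hadoop classes.
public class ExcludeDeadNodesSketch {

    // Stand-in for the namenode's block placement: pick up to `replication`
    // targets, skipping any node the client has reported as unreachable.
    static List<String> chooseTargets(List<String> nodesKnownToNamenode,
                                      Set<String> excludedByClient,
                                      int replication) {
        List<String> targets = new ArrayList<>();
        for (String node : nodesKnownToNamenode) {
            if (targets.size() == replication) {
                break;
            }
            if (!excludedByClient.contains(node)) {
                targets.add(node);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // A 4-node cluster; the namenode still believes all nodes are alive
        // because the killed node has not yet timed out on its side.
        List<String> nodesKnownToNamenode = List.of("dn1", "dn2", "dn3", "dn4");

        // The client tried to write to dn3, got a connect failure, and
        // recorded it as dead -- this is what hdfs-630 lets it report.
        Set<String> deadAccordingToClient = new HashSet<>();
        deadAccordingToClient.add("dn3");

        // Without the exclude list the namenode may keep returning dn3; with
        // it, the client gets a pipeline made only of reachable nodes.
        List<String> pipeline =
            chooseTargets(nodesKnownToNamenode, deadAccordingToClient, 3);
        System.out.println("Block pipeline: " + pipeline); // [dn1, dn2, dn4]
    }
}

On a real 3-4 node cluster the effect is the same as in the sketch: once the
killed node is in the client's exclude list, writes stop stalling on it during
the window before the namenode itself declares the node dead.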