Re: HBase region server failure issues

Kevin O'Dell Tue, 15 Apr 2014 09:27:06 -0700

In general I have never seen or heard of federated namespaces in the wild,
so I would be hesitant to go down that path.  But, you know, for "Science" I
would be interested in seeing how that worked out.  Would we be looking at
32 WALs per region?  On a large cluster with 1000 nodes, 100 regions per
node, and a WAL per region (I like easy math):


1000 * 100 * 32 = 3.2 million files for WALs.  This is not ideal, but it is
not horrible if we are using 128MB block sizes etc.
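
A minimal sketch of that arithmetic in Python, for anyone who wants to plug
in their own cluster numbers. The function names are hypothetical. The 32
WALs per region is the figure from the question above, the 12 spindles / 3x
replication case comes from Jon's message quoted below, and flooring the
division with a minimum of one WAL per node is my assumption.

```python
def wal_file_count(nodes, regions_per_node, wals_per_region):
    """Total WAL files cluster-wide if every region keeps its own WALs."""
    return nodes * regions_per_node * wals_per_region


def wals_per_node_by_spindles(spindles, replication_factor):
    """WALs per node sized as spindles / replication factor.

    Assumption: integer division, and never fewer than one WAL per node.
    """
    return max(1, spindles // replication_factor)


if __name__ == "__main__":
    nodes = 1000

    # WAL-per-region case: 1000 nodes, 100 regions per node, 32 WALs each.
    per_region_total = wal_file_count(nodes, 100, 32)

    # Spindle-based case: 12 spindles, replication factor 3.
    per_node = wals_per_node_by_spindles(12, 3)
    spindle_total = nodes * per_node

    print("per-region WALs, cluster-wide:", per_region_total)
    print("spindle-based WALs per node:", per_node)
    print("spindle-based WALs, cluster-wide:", spindle_total)
```

This counts WALs only. It says nothing about how many HDFS blocks or
replicas each one turns into.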

I feel like I am missing something above though.  Thoughts?


On Tue, Apr 15, 2014 at 12:20 PM, Andrew Purtell <[email protected]> wrote:

> # of WALs as roughly spindles / replication factor seems intuitive. Would
> be interesting to benchmark.
>
> As for one WAL per region, the BigTable paper IIRC says they didn't because
> of concerns about the number of seeks in the filesystems underlying GFS and
> because it would reduce the effectiveness of group commit throughput
> optimization. If WALs are backed by SSD certainly the first consideration
> no longer holds. We also had a global HDFS file limit to contend with. I
> know HDFS is incrementally improving the scalability of a namespace, but
> this is still an active consideration. (Or we could try partitioning a
> deploy over a federated namespace? Could be "interesting". Has anyone tried
> that? I haven't heard.)
>
>
>
> On Tue, Apr 15, 2014 at 7:11 AM, Jonathan Hsieh <[email protected]> wrote:
>
> > It makes sense to have as many wals as # of spindles / replication factor
> > per machine.  This should be decoupled from the number of regions on a
> > region server.  So for a cluster with 12 spindles we should likely have at
> > least 4 wals (12 spindles / 3 replication factor), and need to do
> > experiments to see if going to 8 or some higher number makes sense (new wal
> > uses a disruptor pattern which avoids much contention on individual
> > writes).   So with your example, your 1000 regions would get sharded into
> > the 4 wals which would maximize io throughput, disk utilization, and reduce
> > time for recovery in the face of failure.
> >
> > In the case of an SSD world, it makes more sense to have one wal per node
> > once we have decent HSM support in HDFS.  The key win here will be in
> > recovery time -- if any RS goes down we only have to replay a region's edits
> > and not have to split or demux different regions' edits.
> >
> > Jon.
> >
> >
> > On Mon, Apr 14, 2014 at 10:37 PM, Vladimir Rodionov
> > <[email protected]> wrote:
> >
> > > Todd, how about 300 regions with 3x replication?  Or 1000 regions? This is
> > > going to be 3000 files on HDFS, per RS. When I said that it does not
> > > scale, I meant exactly that.
> > >
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // HBase Tech Lead, Software Engineer, Cloudera
> > // [email protected] // @jmhsieh
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Kevin O'Dell
Systems Engineer, Cloudera
