Would the second WAL contain the same contents as the first?

We already have code that adds an interceptor on calls to
namenode#getBlockLocations so that block locations on the same DN as the
dead RS are placed at the end of the priority queue.
See addLocationsOrderInterceptor()
in hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java

This is for faster recovery in case the regionserver is deployed on the same
box as the datanode.
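
As a rough illustration of the reordering idea only (this is not the actual
HFileSystem code; the class name, method name and host names below are made
up for the example), a minimal Java sketch that moves the replica co-located
with the dead regionserver to the end of a block's location list:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the HBase implementation.
class LocationReorderSketch {

    // Returns a new list in which every entry equal to deadHost is moved to
    // the end; the relative order of the other hosts is preserved.
    static List<String> moveHostLast(List<String> replicaHosts, String deadHost) {
        List<String> preferred = new ArrayList<>();
        List<String> deferred = new ArrayList<>();
        for (String host : replicaHosts) {
            if (host.equals(deadHost)) {
                deferred.add(host);   // likely dead as well, try it last
            } else {
                preferred.add(host);
            }
        }
        preferred.addAll(deferred);
        return preferred;
    }

    public static void main(String[] args) {
        // Assumed example: the crashed regionserver ran on host "dn-b".
        List<String> hosts = Arrays.asList("dn-b", "dn-a", "dn-c");
        System.out.println(moveHostLast(hosts, "dn-b")); // prints [dn-a, dn-c, dn-b]
    }
}
```

The point of putting the co-located replica last is that a reader of the WAL
tries the live datanodes first instead of waiting on a box that probably went
down together with the regionserver.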


On Tue, Apr 15, 2014 at 1:43 PM, Claudiu Soroiu <[email protected]> wrote:

> First of all, thanks for the clarifications.
>
> **how about 300 regions with 3x replication?  Or 1000 regions? This
> is going to be 3000 files. on HDFS. per one RS.**
>
> Now I see that the trade-off is how to reduce the recovery time without
> affecting the overall performance of the cluster.
> Having too many WALs affects the write performance.
> Basically multiple WALs might improve the process but the number of WALs
> should be relatively small.
>
> Would it be feasible to know ahead of time where a region might activate in
> case of a failure, and to have for each region server a second WAL file
> containing backup edits?
> E.g. if machine B crashes then one region will be assigned to node A, one to
> node C, etc.
> Another view would be: server A will back up a region from server B in case
> B crashes, a region from server C, etc. Basically this second WAL will
> contain the data that is needed to quickly recover a crashed node.
> This adds additional redundancy and some degree of complexity to the
> solution but ensures data locality in case of a crash and faster recovery.
>
> **What did you do Claudiu to get the time down?**
>
>  Decreased the hdfs block size to 64 megs for now.
>  Enabled settings to avoid hdfs stale nodes.
>  The cluster I tested this on was relatively small: 10 computers.
>  I tuned the zookeeper sessions to keep the heartbeat at 5 seconds for
> the moment, and plan to decrease this value.
>  At this point dfs.heartbeat.interval is set at the default 3 seconds, but
> I also plan to decrease this and perform a more intensive test.
>  (Decreasing the times is based on the experience with our current system,
> which is configured at 1.2 seconds and didn't have any issues even under
> heavy loads; obviously stop-the-world GC times should be smaller than the
> heartbeat interval.)
>  And I remember I made some changes to the reconnect intervals of the
> client to allow it to reconnect to the region as fast as possible.
>  I am at an early stage of experimenting with hbase but there are a lot of
> things to test/check...
>
>
>
>
> On Tue, Apr 15, 2014 at 11:03 PM, Vladimir Rodionov
> <[email protected]> wrote:
>
> > *We also had a global HDFS file limit to contend with*
> >
> > Yes, we have been seeing this from time to time in our production
> > clusters.
> > Periodic purging of old files helps, but the issue is obvious.
> >
> > -Vladimir Rodionov
> >
> >
> > On Tue, Apr 15, 2014 at 11:58 AM, Stack <[email protected]> wrote:
> >
> > > On Mon, Apr 14, 2014 at 1:47 PM, Claudiu Soroiu <[email protected]>
> > > wrote:
> > >
> > > > ....
> > >
> > > > After some tuning I managed to
> > > > reduce it to 8 seconds in total and for the moment it fits the needs.
> > > >
> > >
> > > What did you do Claudiu to get the time down?
> > > Thanks,
> > > St.Ack
> > >
> >
>
