It makes sense to have as many WALs per machine as the number of spindles divided by the replication factor. This should be decoupled from the number of regions on a region server. So for a cluster with 12 spindles per machine we should likely have at least 4 WALs (12 spindles / 3 replication factor), and we need to run experiments to see whether going to 8 or some higher number makes sense (the new WAL uses a disruptor pattern, which avoids much of the contention on individual writes). So with your example, your 1000 regions would be sharded across the 4 WALs, which would maximize I/O throughput and disk utilization and reduce recovery time after a failure.
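To make the arithmetic concrete, here is a rough sketch of the sizing rule and one possible way to shard regions across a fixed set of WALs. This is only an illustration: the function names, the region names, and the CRC32-modulo assignment are assumptions made up for this example, not how HBase actually assigns regions to WALs.

```python
import zlib
from collections import Counter


def wal_count(spindles: int, replication: int) -> int:
    """Lower bound on WALs per machine: spindles / replication factor, at least 1."""
    if spindles < 1 or replication < 1:
        raise ValueError("spindles and replication must be positive")
    return max(1, spindles // replication)


def wal_for_region(region_name: str, num_wals: int) -> int:
    """Map a region to a WAL index with a stable hash (hypothetical scheme).

    zlib.crc32 is used instead of the built-in hash() because hash() of a str
    is randomized per process, and the mapping must be repeatable.
    """
    return zlib.crc32(region_name.encode("utf-8")) % num_wals


if __name__ == "__main__":
    num_wals = wal_count(spindles=12, replication=3)  # 12 / 3 = 4
    regions = [f"region-{i:04d}" for i in range(1000)]  # made-up region names
    load = Counter(wal_for_region(r, num_wals) for r in regions)
    print("WALs per machine:", num_wals)
    print("regions per WAL:", dict(sorted(load.items())))
```

The point of the sketch is that the number of WAL files depends only on the hardware (spindles and replication factor), so it stays at 4 whether the region server hosts 300 regions or 1000.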
In an SSD world, it makes more sense to have one wal per node once we have decent HSM support in HDFS. The key win here will be in recovery time -- if any RS goes down we only have to replay a region's edits, and do not have to split or demux different regions' edits.

Jon.

On Mon, Apr 14, 2014 at 10:37 PM, Vladimir Rodionov <[email protected]> wrote:

> Todd, how about 300 regions with 3x replication? Or 1000 regions? This is
> going to be 3000 files. on HDFS. per one RS. When I said that it does not
> scale, I meant that exactly that.
>

--
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// [email protected]
// @jmhsieh
