Duo Zhang created HBASE-18971:
---------------------------------

             Summary: Limit the concurrent opened wal writers when splitting
                 Key: HBASE-18971
                 URL: https://issues.apache.org/jira/browse/HBASE-18971
             Project: HBase
          Issue Type: Improvement
          Components: Recovery, wal
            Reporter: Duo Zhang


A whole-cluster restart can easily fail under the current architecture if 
there are many regions on a single region server.

On a small cluster, although a recovered edits file is very small, the NN 
reserves a full block for it when the file is opened, so the cluster can 
easily run out of space.

On a large cluster, even with the max xceiver count already set to 4096, it 
is still easy to exhaust that limit and cause the DNs to reject our requests 
if there are 1k+ regions on a single RS, because we write 3 copies of each 
block.
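
A rough back-of-the-envelope sketch of the two limits described above. The 
class and method names are hypothetical, the 128 MB block size is the usual 
HDFS default rather than a figure from this issue, and the model (one open 
writer per region per concurrent splitter, one reserved block per open file, 
one xceiver per replica in the write pipeline) is a simplifying assumption. 
How the xceiver demand is spread across DNs depends on block placement, so 
treat the numbers as an upper-bound estimate only.

```java
// Hypothetical helper for estimating resource pressure during WAL splitting.
// Not part of HBase or HDFS.
final class SplitResourceEstimate {
    private SplitResourceEstimate() {}

    /** Writers open at once, assuming one writer per region per concurrent splitter. */
    static long openWriters(long regions, long concurrentSplitters) {
        return regions * concurrentSplitters;
    }

    /** Space reserved while files are open, assuming one full block per open file. */
    static long reservedBytes(long openWriters, long blockSizeBytes) {
        return openWriters * blockSizeBytes;
    }

    /** Xceiver slots demanded across the DNs, assuming one per replica per writer. */
    static long xceiverSlots(long openWriters, int replication) {
        return openWriters * replication;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;   // assumed HDFS block size
        long writers = openWriters(1000, 2);   // 1k regions, 2 splitters (example values)
        System.out.println("open writers:   " + writers);
        System.out.println("reserved bytes: " + reservedBytes(writers, blockSize));
        System.out.println("xceiver slots:  " + xceiverSlots(writers, 3));
    }
}
```

With these example values the xceiver demand is already above 4096, which is 
why lowering the splitter concurrency is currently the only knob available.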

Under the current architecture we need to choose 
'hbase.regionserver.wal.max.splitters' and 
'hbase.master.executor.serverops.threads' carefully to limit the concurrency 
of WAL splitting. But this is only a compromise, as it also slows down 
failure recovery.

So here we want to limit the number of concurrently open WAL writers when 
splitting. It may work like a memstore, which buffers the WAL entries in 
memory and flushes some entries out when it is full.
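
To make the memstore-like idea concrete, here is a minimal sketch, not a 
proposed implementation. Everything in it is hypothetical: BoundedSplitBuffer, 
RegionWriter and WriterFactory are illustrative names, not HBase APIs. It 
assumes entries are buffered per region, and that once the total buffered 
size exceeds a limit the largest region buffer is written out through a 
writer that is opened and closed around the flush, so only one writer is 
open at any time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a bounded, memstore-like buffer for split WAL entries.
final class BoundedSplitBuffer {
    /** Hypothetical stand-in for a recovered-edits writer; not an HBase API. */
    interface RegionWriter extends AutoCloseable {
        void append(byte[] edit);
        @Override void close();
    }

    /** Hypothetical factory that opens a writer for one region. */
    interface WriterFactory {
        RegionWriter open(String region);
    }

    private final long maxBufferedBytes;
    private final WriterFactory factory;
    private final Map<String, List<byte[]>> buffers = new HashMap<>();
    private final Map<String, Long> sizes = new HashMap<>();
    private long bufferedBytes = 0;

    BoundedSplitBuffer(long maxBufferedBytes, WriterFactory factory) {
        if (maxBufferedBytes < 0) {
            throw new IllegalArgumentException("maxBufferedBytes must be >= 0");
        }
        this.maxBufferedBytes = maxBufferedBytes;
        this.factory = factory;
    }

    /** Buffers one edit; flushes the largest region buffers while over the limit. */
    void append(String region, byte[] edit) {
        buffers.computeIfAbsent(region, r -> new ArrayList<>()).add(edit);
        sizes.merge(region, (long) edit.length, Long::sum);
        bufferedBytes += edit.length;
        while (bufferedBytes > maxBufferedBytes) {
            flushLargest();
        }
    }

    /** Flushes everything that is still buffered, one region at a time. */
    void finish() {
        while (!buffers.isEmpty()) {
            flushLargest();
        }
    }

    private void flushLargest() {
        String largest = null;
        long largestSize = -1;
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            if (e.getValue() > largestSize) {
                largest = e.getKey();
                largestSize = e.getValue();
            }
        }
        if (largest == null) {
            return; // nothing buffered
        }
        List<byte[]> edits = buffers.remove(largest);
        sizes.remove(largest);
        // The writer is open only for the duration of this flush.
        try (RegionWriter writer = factory.open(largest)) {
            for (byte[] edit : edits) {
                writer.append(edit);
            }
        }
        bufferedBytes -= largestSize;
    }
}
```

One consequence of this design is that a region may be flushed more than 
once, so a real writer would have to either produce several recovered edits 
files per region or append to an existing one. Flushing the largest buffer 
first is just one possible policy.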

Suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
