HBase can get stuck if updates to META are blocked
--------------------------------------------------
Key: HBASE-2439
URL: https://issues.apache.org/jira/browse/HBASE-2439
Project: Hadoop HBase
Issue Type: Bug
Reporter: Kannan Muthukkaruppan
(We noticed this during an import-style test on a small test cluster.)
If compactions are running slowly and we are doing a lot of region splits, META
quickly accumulates store files, because it has a much smaller hard-coded
memstore flush size (16KB). Once the store file count exceeds
"hbase.hstore.blockingStoreFiles", flushes to META become no-ops. This causes
META's memstore footprint to grow. Once the footprint reaches
"hbase.hregion.memstore.block.multiplier * 16KB", we block further updates to
META.
In my test setup:
hbase.hregion.memstore.block.multiplier = 4.
and,
hbase.hstore.blockingStoreFiles = 15.
And we saw messages of the form:
{code}
2010-04-09 18:37:39,539 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 23 on 60020' on region .META.,,1:
memstore size 64.2k is >= than blocking 64.0k size
{code}
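The 64.0k blocking size in that message is consistent with the settings above
(4 * 16KB). A minimal sketch of the arithmetic follows; the class and method
names are made up for illustration and are not HBase APIs, and the 64.2k
memstore size from the log is approximated as 65741 bytes.

```java
public class MetaBlockingThreshold {
    // Hard-coded META memstore flush size cited in this report (16KB).
    static final long META_FLUSH_SIZE_BYTES = 16 * 1024;

    // Blocking size = block multiplier * flush size, as described above.
    static long blockingSize(long flushSizeBytes, int blockMultiplier) {
        return flushSizeBytes * blockMultiplier;
    }

    // The log message reports blocking when memstore size is >= the blocking size.
    static boolean updatesBlocked(long memstoreBytes, long flushSizeBytes, int blockMultiplier) {
        return memstoreBytes >= blockingSize(flushSizeBytes, blockMultiplier);
    }

    public static void main(String[] args) {
        long blocking = blockingSize(META_FLUSH_SIZE_BYTES, 4);
        System.out.println("blocking size: " + (blocking / 1024) + "k"); // prints "blocking size: 64k"
        // 65741 bytes is an approximation of the 64.2k seen in the log.
        System.out.println("blocked: " + updatesBlocked(65741L, META_FLUSH_SIZE_BYTES, 4)); // prints "blocked: true"
    }
}
```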
Now suppose that, around the same time, the CompactSplitThread does a
compaction and determines that it is going to split the region. As part of
finishing the split, it wants to update META about the daughter regions.
It'll end up waiting for the META to become unblocked. The single
CompactSplitThread is now held up, and no further compactions can proceed.
META's compaction request is itself blocked because the compaction queue will
never get cleared.
This essentially creates a deadlock, and the region server is unable to make
any further progress. Eventually, each region server's CompactSplitThread ends
up in the same state.
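The shape of the stall can be reproduced outside HBase with a plain executor.
The sketch below is only an analogy under stated assumptions: the class, method,
and variable names are hypothetical, the latch stands in for "META accepts
updates again", and a timeout is added so the program terminates (the real
thread waits indefinitely). The worker-count parameter exists only to show that
the stall depends on there being a single worker; it is not a proposed fix.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SingleThreadStallSketch {
    // Returns true if the simulated split managed to update META before the timeout.
    static boolean splitCompletes(int workerThreads, long timeoutMillis) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        // Stand-in for META becoming unblocked once its compaction has run.
        CountDownLatch metaUnblocked = new CountDownLatch(1);
        try {
            // Task 1: finishing a split has to wait until META accepts updates.
            Callable<Boolean> finishSplit =
                () -> metaUnblocked.await(timeoutMillis, TimeUnit.MILLISECONDS);
            // Task 2: the META compaction that would unblock updates.
            Runnable compactMeta = metaUnblocked::countDown;

            Future<Boolean> split = workers.submit(finishSplit);
            // With one worker, this task sits in the queue behind task 1.
            workers.submit(compactMeta);
            return split.get();
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 worker, split completes: " + splitCompletes(1, 200));
        System.out.println("2 workers, split completes: " + splitCompletes(2, 2000));
    }
}
```

With a single worker the first task can only end by timing out, because the
task that would release it is queued behind it; that is the situation the
CompactSplitThread is in when it waits on META.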