[jira] Commented: (HBASE-2439) HBase can get stuck if updates to META are blocked

Kannan Muthukkaruppan (JIRA) Tue, 13 Apr 2010 16:12:18 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856681#action_12856681
 ]


Kannan Muthukkaruppan commented on HBASE-2439:
----------------------------------------------

#1. >>> Was done in trunk.

In the branch, the HTableDescriptor constructor for META & ROOT does a 
"setMemStoreFlushSize(16 * 1024)" and I don't see that call in trunk.

{code}
  /**
   * Private constructor used internally creating table descriptors for 
   * catalog tables: e.g. .META. and -ROOT-.
   */
  protected HTableDescriptor(final byte [] name, HColumnDescriptor[] families) {
    this.name = name.clone();
    this.nameAsString = Bytes.toString(this.name);
    setMetaFlags(name);
    for(HColumnDescriptor descriptor : families) {
      this.families.put(descriptor.getName(), descriptor);
    }
    setMemStoreFlushSize(16 * 1024);
  }
{code}

So is my understanding correct that in trunk, ROOT & META simply use the 
default (64MB) memstore flush size? Is that the change we want for branch?
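
To make the difference concrete, here is a minimal sketch of the arithmetic, assuming the blocking threshold is the flush size multiplied by hbase.hregion.memstore.block.multiplier, as the issue description states. The class MemstoreThresholds and its method are hypothetical helpers for illustration and are not part of HBase; the multiplier of 4 is the value from the test setup in the issue description.

```java
// Sketch only: MemstoreThresholds is a hypothetical helper, not an HBase class.
final class MemstoreThresholds {
  static final long KB = 1024L;
  static final long MB = 1024L * KB;

  /**
   * Blocking threshold in bytes, assuming it is computed as
   * flush size * hbase.hregion.memstore.block.multiplier.
   */
  static long blockingSize(long flushSizeBytes, int blockMultiplier) {
    return flushSizeBytes * blockMultiplier;
  }

  public static void main(String[] args) {
    int multiplier = 4; // value from the test setup in the issue description

    // Branch: catalog tables hard-code a 16KB flush size.
    System.out.println(blockingSize(16 * KB, multiplier) / KB + " KB");

    // Trunk (if catalog tables use the 64MB default flush size).
    System.out.println(blockingSize(64 * MB, multiplier) / MB + " MB");
  }
}
```

With the 16KB setting the threshold works out to 64KB, which matches the "blocking 64.0k size" in the log message quoted in the issue description. With a 64MB flush size and the same multiplier it would be 256MB.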

#2. I'll also make the changes to never block updates for meta regions. I plan 
to do so by removing both of the following restrictions for meta regions:

a) blockingStoreFiles limit check in MemStoreFlusher.java:flushRegion():

{code}
  else if (isTooManyStoreFiles(region)) {
      LOG.warn("Region " + region.getRegionNameAsString() + " has too many " +
          "store files, putting it back at the end of the flush queue.");
      server.compactSplitThread.compactionRequested(region, getName());
      // If there's only this item in the queue or they are all in this
      // situation, we will loop at lot. Sleep a bit.
      try {
        Thread.sleep(1000);
      } catch (InterruptedException e) { } // just continue
      flushQueue.add(region);
      // Tell a lie, it's not flushed but it's ok
      return true;
    }
{code}

b) memstore size violation restriction in HRegion.java:checkResources():

{code}
 private void checkResources() {
    boolean blocked = false;
    while (this.memstoreSize.get() > this.blockingMemStoreSize) {
      requestFlush();
      if (!blocked) {
        LOG.info("Blocking updates for '" + Thread.currentThread().getName() +
{code}
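
The intended effect of removing the two restrictions could be sketched roughly as follows. This is a simplified, hypothetical model and not the actual patch: RegionSketch, shouldBlockUpdates() and shouldDeferFlush() are made-up names that stand in for the checks in HRegion.checkResources() and MemStoreFlusher.flushRegion(), and the metaRegion flag is an assumed stand-in for however the real code identifies catalog regions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model; not the actual HRegion / MemStoreFlusher code.
final class RegionSketch {
  private final boolean metaRegion;        // assumed flag for -ROOT- / .META. regions
  private final long blockingMemStoreSize; // flush size * block multiplier
  private final int blockingStoreFiles;    // hbase.hstore.blockingStoreFiles

  final AtomicLong memstoreSize = new AtomicLong();
  int storeFileCount;

  RegionSketch(boolean metaRegion, long blockingMemStoreSize, int blockingStoreFiles) {
    this.metaRegion = metaRegion;
    this.blockingMemStoreSize = blockingMemStoreSize;
    this.blockingStoreFiles = blockingStoreFiles;
  }

  /** Models restriction (b): would an update be blocked on memstore size? */
  boolean shouldBlockUpdates() {
    if (metaRegion) {
      return false; // proposed change: never block updates to catalog regions
    }
    return memstoreSize.get() > blockingMemStoreSize;
  }

  /** Models restriction (a): would a flush be requeued because of too many store files? */
  boolean shouldDeferFlush() {
    if (metaRegion) {
      return false; // proposed change: always let catalog regions flush
    }
    return storeFileCount > blockingStoreFiles;
  }
}
```

In this model a user region that is over either limit is still blocked or deferred as before, while a meta region in the same state is not, so the CompactSplitThread's update to META after a split would not have to wait.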

> HBase can get stuck if updates to META are blocked
> --------------------------------------------------
>
>                 Key: HBASE-2439
>                 URL: https://issues.apache.org/jira/browse/HBASE-2439
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>
> (We noticed this on an import-style test in a small test cluster.)
> If compactions are running slow, and we are doing a lot of region splits, 
> then, since META has a much smaller hard-coded memstore flush size (16KB), it 
> quickly accumulates lots of store files. Once this exceeds 
> "hbase.hstore.blockingStoreFiles", flushes to META become no-ops. This causes 
> META's memstore footprint to grow. Once this exceeds 
> "hbase.hregion.memstore.block.multiplier * 16KB", we block further updates to 
> META.
> In my test setup:
>   hbase.hregion.memstore.block.multiplier = 4.
> and,
>   hbase.hstore.blockingStoreFiles = 15.
> And we saw messages of the form:
> {code}
> 2010-04-09 18:37:39,539 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Blocking updates for 'IPC Server handler 23 on 60020' on region .META.,,1: 
> memstore size 64.2k is >= than blocking 64.0k size
> {code}
> Now, suppose that around the same time the CompactSplitThread does a 
> compaction and determines it is going to split the region. As part of 
> finishing the split, it wants to update META about the daughter regions. 
> It'll end up waiting for the META to become unblocked. The single 
> CompactSplitThread is now held up, and no further compactions can proceed.  
> META's compaction request is itself blocked because the compaction queue will 
> never get cleared.
> This essentially creates a deadlock and the region server is not able to 
> progress any further. Eventually, each region server's CompactSplitThread 
> ends up in the same state.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
