[ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017259#comment-13017259
 ] 

stack commented on HBASE-3744:
------------------------------

Thanks for looking into this one Ted.

The issue I ran into today is that the default handler for exceptions thrown 
while running BulkAssign executors was this:

{code}
  protected UncaughtExceptionHandler getUncaughtExceptionHandler() {
    return new UncaughtExceptionHandler() {
      @Override
      public void uncaughtException(Thread t, Throwable e) {
        // Abort if exception of any kind.
        server.abort("Uncaught exception in " + t.getName(), e);
      }
    };
  }
{code}

On the one hand, the above was what we decided made sense when doing bulk 
assign on startup.

But now bulk assign has been pulled around to do bulk assigning elsewhere.  
That's fine.  It's just that we should change the above default in subclasses 
that are not startup bulk assigns; i.e. make a new subclass so we can override 
the above default (What will you call it?  BulkTableAssigner?).  We should do 
this instead of try/catch (try/catch defeats the above mechanism, since the 
exception never reaches the handler).  I think what we want in the case where 
bulk assign is used on table create is just logging of the exception.  The 
region will be in the OFFLINE state waiting to be moved by the RS to OPENING, 
but that won't happen in the case above, so the OFFLINE state will time out 
and we'll retry the open.  We should add a test to prove this is actually what 
happens (I could have a go at that if you want -- just say).
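
To make the subclass idea concrete, here is a rough, self-contained sketch 
(not the actual HBase classes).  Abortable, runTask, logged and the outer 
BulkAssignerSketch class are hypothetical stand-ins; BulkTableAssigner is just 
the name floated above.  It uses a plain Thread rather than the real executor, 
and a list in place of LOG, so it only illustrates the override mechanism:

```java
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BulkAssignerSketch {
  /** Hypothetical stand-in for the master's abort hook. */
  public interface Abortable {
    void abort(String why, Throwable e);
  }

  /** Simplified base class: the default handler aborts the server. */
  public static class BulkAssigner {
    protected final Abortable server;

    public BulkAssigner(Abortable server) {
      this.server = server;
    }

    protected UncaughtExceptionHandler getUncaughtExceptionHandler() {
      // Startup behaviour: abort on an exception of any kind.
      return (t, e) -> server.abort("Uncaught exception in " + t.getName(), e);
    }

    /** Runs one task on a worker thread with this assigner's handler set. */
    public void runTask(Runnable task) {
      Thread t = new Thread(task, "bulk-assign-worker");
      t.setUncaughtExceptionHandler(getUncaughtExceptionHandler());
      t.start();
      try {
        t.join(); // the handler has finished by the time join() returns
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      }
    }
  }

  /** Table-create variant: only record the failure, never abort. */
  public static class BulkTableAssigner extends BulkAssigner {
    // Stand-in for LOG.error(...) so the behaviour can be checked.
    public final List<String> logged =
        Collections.synchronizedList(new ArrayList<String>());

    public BulkTableAssigner(Abortable server) {
      super(server);
    }

    @Override
    protected UncaughtExceptionHandler getUncaughtExceptionHandler() {
      return (t, e) -> logged.add("Failed assign in " + t.getName() + ": " + e);
    }
  }
}
```

With this shape, run() needs no try/catch: a failing assign in the table case 
ends up in the overridden handler and is only logged, while the startup 
assigner keeps aborting.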

So, I think this bit is incorrect:

{code}
     public void run() {
-      this.assignmentManager.assign(this.regionserver, this.regions);
+      try {
+        this.assignmentManager.assign(this.regionserver, this.regions);
+      } catch (Exception e) {
+        LOG.error("Error assigning " + this.regions.size() + " regions for "
+            + this.regionserver.getHostnamePort(), e);
+      }
     }
{code}

For this:

{code}
+        for (HRegionInfo region : regionList) regionSet.add(region);
{code}

... can you do this instead?

{code}
regionSet.addAll(regionList);
{code}
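
For what it's worth, a tiny sketch of why the two forms are interchangeable, 
with String standing in for HRegionInfo and AddAllSketch being a made-up class 
name for illustration only:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AddAllSketch {
  /** Element-by-element copy, as in the patch. */
  public static Set<String> viaLoop(List<String> regionList) {
    Set<String> regionSet = new HashSet<>();
    for (String region : regionList) regionSet.add(region);
    return regionSet;
  }

  /** Same result using Collection.addAll. */
  public static Set<String> viaAddAll(List<String> regionList) {
    Set<String> regionSet = new HashSet<>();
    regionSet.addAll(regionList);
    return regionSet;
  }
}
```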

I like the sync addition and the wait on table regions additions.

Good stuff Ted.

> createTable blocks until all regions are out of transition
> ----------------------------------------------------------
>
>                 Key: HBASE-3744
>                 URL: https://issues.apache.org/jira/browse/HBASE-3744
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.1
>            Reporter: Todd Lipcon
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 3744.txt, 3744.txt
>
>
> In HBASE-3305, the behavior of createTable was changed and introduced this 
> bug: createTable now blocks until all regions have been assigned, since it 
> uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
> assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
> regions, not just the regions of the table that has just been created.
> We saw an issue where one table had a region which was unable to be opened, 
> so it was stuck in RegionsInTransition permanently (every open was failing). 
> Since this was the case, waitUntilDone would always block indefinitely even 
> though the newly created table had been assigned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
