[jira] [Comment Edited] (HBASE-7871) HBase can be stuck in the shutdown

Anoop Sam John (JIRA) Fri, 22 Mar 2013 05:43:23 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610197#comment-13610197
 ]


Anoop Sam John edited comment on HBASE-7871 at 3/22/13 12:43 PM:
-----------------------------------------------------------------

I think I have found the issue with region close.
In this test, multiple regions are closed concurrently.
All the MetricsRegionSourceImpl objects (one per region) share the same 
MetricsRegionAggregateSourceImpl instance.
See the code in MetricsRegionServerSourceFactoryImpl:
{code}
private synchronized MetricsRegionAggregateSourceImpl getAggregate() {
  if (FactoryStorage.INSTANCE.aggImpl == null) {
    FactoryStorage.INSTANCE.aggImpl = new MetricsRegionAggregateSourceImpl();
  }
  return FactoryStorage.INSTANCE.aggImpl;
}

@Override
public MetricsRegionSource createRegion(MetricsRegionWrapper wrapper) {
  return new MetricsRegionSourceImpl(wrapper, getAggregate());
}
{code}
So closing regions concurrently leads to concurrent calls to 
MetricsRegionAggregateSourceImpl#deregister() on that shared instance.
Its TreeSet<MetricsRegionSourceImpl> regionSources is not thread safe, so 
these calls can corrupt it.
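
A minimal sketch of one way to guard such a shared set, assuming every access can go through the aggregate's own monitor. SynchronizedAggregateSketch and closeConcurrently are hypothetical names used for illustration only, and String stands in for MetricsRegionSourceImpl; this is not the HBase code or the committed fix.

```java
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the shared aggregate; not the real HBase class.
class SynchronizedAggregateSketch {
    // TreeSet is not thread safe, so every access below holds this object's monitor.
    private final TreeSet<String> regionSources = new TreeSet<>();

    synchronized void register(String source) { regionSources.add(source); }

    synchronized void deregister(String source) { regionSources.remove(source); }

    synchronized int size() { return regionSources.size(); }

    // Registers `regions` names, then deregisters them from `threads` threads at once.
    // Returns the number of sources still registered afterwards.
    static int closeConcurrently(int regions, int threads) {
        SynchronizedAggregateSketch agg = new SynchronizedAggregateSketch();
        for (int i = 0; i < regions; i++) {
            agg.register("region-" + i);
        }
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // release all workers together to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Each worker closes its own disjoint slice of the regions.
                for (int i = offset; i < regions; i += threads) {
                    agg.deregister("region-" + i);
                }
            });
            workers[t].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return agg.size();
    }

    public static void main(String[] args) {
        System.out.println("sources left: " + closeConcurrently(1000, 8));
    }
}
```

With the synchronized methods in place, concurrent removals are serialized and the set ends up empty. Removing the synchronized keywords would reintroduce the unsynchronized structural modifications described above.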

From the javadoc of TreeMap, which backs TreeSet:
{quote}
Note that this implementation is not synchronized. If multiple threads access a 
map concurrently, and at least one of the threads modifies the map 
structurally, it must be synchronized externally. (A structural modification is 
any operation that adds or deletes one or more mappings; merely changing the 
value associated with an existing key is not a structural modification.) 
{quote}
Concurrent structural modifications on non-thread-safe maps can cause endless 
loops, which would match the thread sitting RUNNABLE in 
TreeMap.fixAfterDeletion in the attached stack. Am I correct here? I have seen 
issues like this with maps in the past, with HashMap I think. Please correct 
me if I am wrong.
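
A concurrent sorted set would avoid this class of problem without explicit locking. The sketch below assumes the elements are comparable (they already live in a TreeSet) and uses java.util.concurrent.ConcurrentSkipListSet; SkipListAggregateSketch and closeConcurrently are hypothetical names, and String stands in for MetricsRegionSourceImpl. It is an illustration, not the change made in HBase.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the shared aggregate; not the real HBase class.
class SkipListAggregateSketch {
    // ConcurrentSkipListSet keeps sorted order and tolerates concurrent add/remove.
    private final Set<String> regionSources = new ConcurrentSkipListSet<>();

    void register(String source) { regionSources.add(source); }

    void deregister(String source) { regionSources.remove(source); }

    int size() { return regionSources.size(); }

    // Registers `regions` names, then deregisters each one from a pool of `threads` workers.
    // Returns the number of sources left, or -1 if the pool did not finish in time.
    static int closeConcurrently(int regions, int threads) {
        SkipListAggregateSketch agg = new SkipListAggregateSketch();
        for (int i = 0; i < regions; i++) {
            agg.register("region-" + i);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < regions; i++) {
            final String name = "region-" + i;
            pool.execute(() -> agg.deregister(name));
        }
        pool.shutdown(); // already submitted tasks still run to completion
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                return -1;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return agg.size();
    }

    public static void main(String[] args) {
        System.out.println("sources left: " + closeConcurrently(1000, 8));
    }
}
```

The trade-off is that size() on a ConcurrentSkipListSet is not a constant-time operation, which only matters if the aggregate queries it often.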
                
> HBase can be stuck in the shutdown
> ----------------------------------
>
>                 Key: HBASE-7871
>                 URL: https://issues.apache.org/jira/browse/HBASE-7871
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.0
>            Reporter: Nicolas Liochon
>         Attachments: s1.txt, TestStartStop.java
>
>
> The attached test fails ~1% of the time on 0.96. It seems it does not 
> fail on 0.94.5. It's simple: a table creation and some puts.
> I attach the stack. The logs say nothing, it seems.
> The suspicious part is:
> {noformat}
> "RS_CLOSE_REGION-localhost,57575,1361197489166-2" prio=10 tid=0x00007fb0c8775800 nid=0x61ac runnable [0x00007fb09f272000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2193)
>         at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
>         at java.util.TreeMap.remove(TreeMap.java:585)
>         at java.util.TreeSet.remove(TreeSet.java:259)
>         at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
>         at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
>         at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1063)
>         at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:969)
>         - locked <0x00000006944e2558> (a java.lang.Object)
>         at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:146)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:203)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}

