[
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610197#comment-13610197
]
Anoop Sam John edited comment on HBASE-7871 at 3/22/13 12:43 PM:
-----------------------------------------------------------------
I think I have found the issue with region close.
In this test, multiple regions are closed concurrently.
All the MetricsRegionSourceImpl objects (one per region) share the same
MetricsRegionAggregateSourceImpl instance.
See this code in MetricsRegionServerSourceFactoryImpl:
{code}
private synchronized MetricsRegionAggregateSourceImpl getAggregate() {
  if (FactoryStorage.INSTANCE.aggImpl == null) {
    FactoryStorage.INSTANCE.aggImpl = new MetricsRegionAggregateSourceImpl();
  }
  return FactoryStorage.INSTANCE.aggImpl;
}

@Override
public MetricsRegionSource createRegion(MetricsRegionWrapper wrapper) {
  return new MetricsRegionSourceImpl(wrapper, getAggregate());
}
{code}
So there can be concurrent calls to
MetricsRegionAggregateSourceImpl#deregister(). Its
TreeSet<MetricsRegionSourceImpl> regionSources is not thread safe, so
concurrent removals can cause problems.
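One possible way to guard the shared set is sketched below. This is only an illustration under assumptions: GuardedAggregate and GuardedAggregateDemo are hypothetical stand-ins, not the HBase classes, and region sources are modelled as plain strings. Every structural change goes through a synchronized method, so concurrent closes cannot interleave inside the TreeSet.

```java
import java.util.TreeSet;

// Hypothetical stand-in for the shared aggregate; NOT the real HBase class.
class GuardedAggregate {
    private final TreeSet<String> regionSources = new TreeSet<>();

    // All structural modifications take the same monitor (this).
    synchronized void register(String regionName) {
        regionSources.add(regionName);
    }

    synchronized void deregister(String regionName) {
        regionSources.remove(regionName);
    }

    synchronized int size() {
        return regionSources.size();
    }
}

// Hypothetical driver that mimics many regions closing at the same time.
class GuardedAggregateDemo {
    // Registers `count` regions, deregisters them from `threads` threads,
    // and returns how many entries are left in the set.
    static int closeConcurrently(int count, int threads) {
        GuardedAggregate agg = new GuardedAggregate();
        for (int i = 0; i < count; i++) {
            agg.register("region-" + i);
        }
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            // Each worker removes a disjoint slice of the regions.
            workers[t] = new Thread(() -> {
                for (int i = offset; i < count; i += threads) {
                    agg.deregister("region-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return agg.size();
    }

    public static void main(String[] args) {
        System.out.println(closeConcurrently(10_000, 8)); // prints 0
    }
}
```

With the synchronized keyword removed from the three methods, the same driver would become a data race on the TreeSet, which is the situation described above.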
From TreeMap's javadoc (TreeSet is backed by a TreeMap, as the stack trace shows):
{quote}
Note that this implementation is not synchronized. If multiple threads access a
map concurrently, and at least one of the threads modifies the map
structurally, it must be synchronized externally. (A structural modification is
any operation that adds or deletes one or more mappings; merely changing the
value associated with an existing key is not a structural modification.)
{quote}
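The same javadoc suggests wrapping the collection at creation time. A minimal sketch of that approach for a sorted set follows; SynchronizedSetSketch and its method names are hypothetical, and strings stand in for the region sources. Note that the wrapper only protects individual calls, so iteration still has to hold the set's lock manually.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class SynchronizedSetSketch {
    // Wrap at creation time so no unsynchronized reference to the TreeSet escapes.
    static SortedSet<String> newRegionSet() {
        return Collections.synchronizedSortedSet(new TreeSet<String>());
    }

    // add/remove are synchronized by the wrapper, but iteration is not:
    // the caller must synchronize on the wrapper while traversing it.
    static int totalNameLength(SortedSet<String> regions) {
        int total = 0;
        synchronized (regions) {
            for (String name : regions) {
                total += name.length();
            }
        }
        return total;
    }
}
```

Whether a wrapper, explicit locking, or a concurrent collection fits best depends on how regionSources is iterated when metrics are collected, which is not shown here.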
Concurrent structural modifications on non-thread-safe maps can corrupt the
internal structure and cause endless loops, which would match the RUNNABLE
thread in TreeMap.fixAfterDeletion. Am I correct here? I have seen similar
issues with maps in the past, with HashMap I think. Please correct me if I am
wrong.
> HBase can be stuck in the shutdown
> ----------------------------------
>
> Key: HBASE-7871
> URL: https://issues.apache.org/jira/browse/HBASE-7871
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.96.0
> Reporter: Nicolas Liochon
> Attachments: s1.txt, TestStartStop.java
>
>
> The attached test fails ~1% of the time on 0.96. It does not seem to
> fail on 0.94.5. It's simple: a table creation and some puts.
> I attach the stack. The logs seem to say nothing.
> The suspicious part is:
> {noformat}
> "RS_CLOSE_REGION-localhost,57575,1361197489166-2" prio=10 tid=0x00007fb0c8775800 nid=0x61ac runnable [0x00007fb09f272000]
> java.lang.Thread.State: RUNNABLE
> at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2193)
> at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
> at java.util.TreeMap.remove(TreeMap.java:585)
> at java.util.TreeSet.remove(TreeSet.java:259)
> at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
> at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
> at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
> at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1063)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:969)
> - locked <0x00000006944e2558> (a java.lang.Object)
> at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:146)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:203)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}