[
https://issues.apache.org/jira/browse/HBASE-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack resolved HBASE-1873.
--------------------------
Resolution: Invalid
Made invalid by HBASE-2692; messages are passed via zk.
> Race condition around HRegionServer -> HMaster communication
> ------------------------------------------------------------
>
> Key: HBASE-1873
> URL: https://issues.apache.org/jira/browse/HBASE-1873
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Eric Tschetter
>
> HRegionServer on lines 459 - 463 (part of run()) accesses outboundMsgs in a
> synchronized fashion, but other uses of the object are not synchronized.
> Specifically, the code is
> {noformat}
> 459 synchronized(this.outboundMsgs) {
> 460 outboundArray =
> 461 this.outboundMsgs.toArray(new HMsg[outboundMsgs.size()]);
> 462 this.outboundMsgs.clear();
> 463 }
> {noformat}
> Whereas things are added to this list from calls like
> {noformat}
> private void reportOpen(HRegionInfo region) {
> outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, region));
> }
> {noformat}
> And
> {noformat}
> void reportSplit(HRegionInfo oldRegion, HRegionInfo newRegionA,
> HRegionInfo newRegionB) {
> outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_SPLIT, oldRegion,
> ("Daughters; " +
> newRegionA.getRegionNameAsString() + ", " +
> newRegionB.getRegionNameAsString()).getBytes()));
> outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, newRegionA));
> outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, newRegionB));
> }
> {noformat}
> It looks like the object is initialized as
> {noformat}
> private final List<HMsg> outboundMsgs =
> Collections.synchronizedList(new ArrayList<HMsg>());
> {noformat}
> Which would appear to provide security, but it doesn't actually prevent an
> insert from happening between lines 461 and 462, which would subsequently get
> removed from the call to clear(). At least, from the Sun HotSpot source
> code, it looks like Collections.synchronizedList() does the right thing and
> synchronizes on an inner mutex object instead of synchronizing on the
> externally visible list object itself. That means, however, that the
> synchronized() on line 459 is largely meaningless.
> I'm not sure how often this race condition would occur in the wild, but every
> thread waiting on the mutex around the toArray() call increases the
> probability that the next person to get the mutex is someone who wants to add
> something to the List, rather than the thread calling clear().
> Simple fix would be to do external synchronization around all accesses to the
> List. Barring that, perhaps a SynchronizedList implementation with a
> "emptyToArray()" method that encapsulates the toArray() and subsequent
> clear().
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.