[ 
https://issues.apache.org/jira/browse/HBASE-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-1873.
--------------------------

    Resolution: Invalid

Made invalid by HBASE-2692; messages are passed via zk.

> Race condition around HRegionServer -> HMaster communication
> ------------------------------------------------------------
>
>                 Key: HBASE-1873
>                 URL: https://issues.apache.org/jira/browse/HBASE-1873
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Eric Tschetter
>
> HRegionServer on lines 459 - 463 (part of run()) accesses outboundMsgs in a 
> synchronized fashion, but other uses of the object are not synchronized.
> Specifically, the code is
> {noformat}
> 459          synchronized(this.outboundMsgs) {
> 460            outboundArray =
> 461              this.outboundMsgs.toArray(new HMsg[outboundMsgs.size()]); 
> 462            this.outboundMsgs.clear();
> 463          }
> {noformat}
> Whereas things are added to this list from calls like
> {noformat}
>   private void reportOpen(HRegionInfo region) {
>     outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, region));
>   }
> {noformat}
> And
> {noformat}
>   void reportSplit(HRegionInfo oldRegion, HRegionInfo newRegionA,
>       HRegionInfo newRegionB) {
>     outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_SPLIT, oldRegion,
>       ("Daughters; " +
>         newRegionA.getRegionNameAsString() + ", " +
>         newRegionB.getRegionNameAsString()).getBytes()));
>     outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, newRegionA));
>     outboundMsgs.add(new HMsg(HMsg.Type.MSG_REPORT_OPEN, newRegionB));
>   }
> {noformat}
> It looks like the object is initialized as
> {noformat}
>   private final List<HMsg> outboundMsgs =
>     Collections.synchronizedList(new ArrayList<HMsg>());
> {noformat}
> Which would appear to provide security, but it doesn't actually prevent an 
> insert from happening between lines 461 and 462, which would subsequently get 
> removed from the call to clear().  At least, from the Sun HotSpot source 
> code, it looks like Collections.synchronizedList() does the right thing and 
> synchronizes on an inner mutex object instead of synchronizing on the 
> externally visible list object itself.  That means, however, that the 
> synchronized() on line 459 is largely meaningless.
> I'm not sure how often this race condition would occur in the wild, but every 
> thread waiting on the mutex around the toArray() call increases the 
> probability that the next person to get the mutex is someone who wants to add 
> something to the List, rather than the thread calling clear().
> Simple fix would be to do external synchronization around all accesses to the 
> List.  Barring that, perhaps a SynchronizedList implementation with a 
> "emptyToArray()" method that encapsulates the toArray() and subsequent 
> clear().

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to