[ 
https://issues.apache.org/jira/browse/HDFS-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011753#comment-13011753
 ] 

dhruba borthakur commented on HDFS-1790:
----------------------------------------

incVolumeFailure(), isInSafeMode() and isPopulatingReplQueues() should use the 
FSNamesystem read/write lock.

> review use of synchronized methods in FSNamesystem
> --------------------------------------------------
>
>                 Key: HDFS-1790
>                 URL: https://issues.apache.org/jira/browse/HDFS-1790
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Matt Foley
>             Fix For: 0.23.0
>
>
> While analyzing SafeMode semantics for HDFS-1726, I noticed the following 
> usage of synchronized methods in FSNamesystem that raised questions:
> 1. incVolumeFailure() is synchronized on the FSNamesystem instance (monitor 
> lock), for unclear reasons.  Is this a left-over from the conversion to 
> read/write lock in FSNamesystem?
> 2. isInSafeMode() and isPopulatingReplQueues() are also synchronized on the 
> FSNamesystem instance (monitor lock), but the SafeModeInfo methods they call 
> (safeMode.isInSafeMode() and safeMode.isPopulatingReplQueues() are 
> synchronized on the SafeModeInfo instance.  Is synchronizing on the 
> FSNamesystem instance necessary?  What does it add?  If it is necessary, 
> should it be using the read/write lock?
> 3. In SafeModeInfo, these methods are synchronized on the SafeModeInfo 
> instance:
> - isOn
> - isPopulatingReplQueues
> - leave
> - initialize
> - canInitializeReplQueues
> - canLeave
> - setBlock
> - incrementSafeBlockCount
> - decrementSafeBlockCount
> - setManual
> but these are not:
> - enter
> - needEnter
> - checkMode
> - isManual
> - isConsistent
> Regarding these:
> - isOn() asserts isConsistent(), but is otherwise an atomic read operation.
> - isPopulatingReplQueues() is an atomic read.  Does it need synchronization?
> - leave() is complex, but shouldn't it be synchronized with enter(), which is 
> also a write operation?  Yet enter() is unsynchronized.
> - initializeReplQueues() calls blockManager.processMisReplicatedBlocks(), 
> which can take minutes to run.  By synchronizing on the SafeModeInfo 
> instance, it prevents essentially all of the other safeMode methods from 
> running for the duration!  Is this desirable or needed?
> - canInitializeReplQueues() is a read-only operation, although it does 
> compare two read values.  needEnter() compares four read values, and is 
> unsynchronized.  Does either need synchronization?
> - canLeave() is a compound operation, it's good that it is synchronized.
> - checkMode() is a big compound read/write operation.  Doesn't it need 
> synchronization if the other methods do?
> and so on.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to