[ 
https://issues.apache.org/jira/browse/HDFS-7915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359377#comment-14359377
 ] 

Colin Patrick McCabe commented on HDFS-7915:
--------------------------------------------

bq. cnauroth asked: Thanks for the patch, Colin. The change looks good. In the 
test, is the Visitor indirection necessary, or would it be easier to add 2 
VisibleForTesting getters that return the segments and slots directly to the 
test code?

The problem is locking.  If there is a getter for these hash tables, is the 
caller going to take the appropriate locks when accessing them?  If not, we get 
findbugs warnings and possibly actual test bugs.  If so, it adds a lot of 
coupling between the unit test and the registry code.  In contrast, the visitor 
interface lets the unit test see a single consistent snapshot of what is going 
on in the {{ShortCircuitRegistry}}.
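
To illustrate the trade-off, here is a minimal sketch of the visitor approach. It is not the code from the patch: the class {{RegistrySketch}}, its {{registerSlot}} method, and the use of plain string maps are all simplifying assumptions made for illustration. The point it shows is that the callback runs while the registry's own lock is held, so the test never has to know which lock protects the tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a registry that guards two tables
// with a single lock. Names and types are assumptions, not the Hadoop API.
public class RegistrySketch {
    /** Callback that is handed both tables while the registry lock is held. */
    public interface Visitor {
        void accept(Map<String, String> segments, Map<String, String> slots);
    }

    private final Map<String, String> segments = new HashMap<>();
    private final Map<String, String> slots = new HashMap<>();

    /** Records a segment and a slot together, under the registry lock. */
    public synchronized void registerSlot(String shmId, String slotId, String blockId) {
        segments.put(shmId, "registered");
        slots.put(shmId + ":" + slotId, blockId);
    }

    /**
     * Runs the visitor under the same lock that protects both tables, so the
     * visitor sees one consistent snapshot and the caller takes no locks itself.
     */
    public synchronized void visit(Visitor visitor) {
        visitor.accept(segments, slots);
    }
}
```

A test written against this sketch would call {{visit}} with a lambda that checks the sizes or contents of both maps. With plain getters, the same test would have to synchronize on the registry itself, which is the coupling described above.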

> The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell 
> the DFSClient about it because of a network error
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7915
>                 URL: https://issues.apache.org/jira/browse/HDFS-7915
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7915.001.patch, HDFS-7915.002.patch, 
> HDFS-7915.004.patch
>
>
> The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell 
> the DFSClient about it because of a network error.  In 
> {{DataXceiver#requestShortCircuitFds}}, the DataNode can succeed at the first 
> part (mark the slot as used) and fail at the second part (tell the DFSClient 
> what it did). The "try" block for unregistering the slot only covers a 
> failure in the first part, not the second part. As a result, the DFSClient's 
> and the DataNode's views of which slots are allocated can diverge.
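
The failure mode in the description above can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: {{SlotCleanupSketch}}, {{Responder}}, {{requestSlot}}, and the string slot IDs are hypothetical names. The sketch shows one way to make the cleanup cover both parts, by tracking success in a flag and unregistering the slot in a {{finally}} block if either the marking step or the reply to the client fails.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: cleanup that covers both "mark the slot as used"
// and "tell the client". Names are assumptions, not the Hadoop API.
public class SlotCleanupSketch {
    /** Stand-in for the network reply sent to the client. */
    public interface Responder {
        void sendResponse(String slotId) throws IOException;
    }

    private final Set<String> usedSlots = new HashSet<>();

    public synchronized boolean isUsed(String slotId) {
        return usedSlots.contains(slotId);
    }

    private synchronized void markUsed(String slotId) {
        usedSlots.add(slotId);
    }

    private synchronized void unregister(String slotId) {
        usedSlots.remove(slotId);
    }

    public void requestSlot(String slotId, Responder responder) throws IOException {
        boolean success = false;
        try {
            markUsed(slotId);               // first part: server-side bookkeeping
            responder.sendResponse(slotId); // second part: may fail on a network error
            success = true;
        } finally {
            if (!success) {
                // Undo the allocation so both sides agree the slot is free.
                unregister(slotId);
            }
        }
    }
}
```

If the {{try}} block ended before {{sendResponse}}, a network error in the second part would leave the slot marked as used on the server while the client never learned of it.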



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
