[ https://issues.apache.org/jira/browse/HDFS-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rakesh R updated HDFS-13165: ---------------------------- Attachment: HDFS-13165-HDFS-10285-06.patch > [SPS]: Collects successfully moved block details via IBR > -------------------------------------------------------- > > Key: HDFS-13165 > URL: https://issues.apache.org/jira/browse/HDFS-13165 > Project: Hadoop HDFS > Issue Type: Sub-task > Reporter: Rakesh R > Assignee: Rakesh R > Priority: Major > Attachments: HDFS-13165-HDFS-10285-00.patch, > HDFS-13165-HDFS-10285-01.patch, HDFS-13165-HDFS-10285-02.patch, > HDFS-13165-HDFS-10285-03.patch, HDFS-13165-HDFS-10285-04.patch, > HDFS-13165-HDFS-10285-05.patch, HDFS-13165-HDFS-10285-06.patch > > > This task to make use of the existing IBR to get moved block details and > remove unwanted future tracking logic exists in BlockStorageMovementTracker > code, this is no more needed as the file level tracking maintained at NN > itself. > Following comments taken from HDFS-10285, > [here|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16347472&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16347472] > Comment-3) > {quote}BPServiceActor > Is it actually sending back the moved blocks? Aren’t IBRs sufficient?{quote} > Comment-21) > {quote} > BlockStorageMovementTracker > Many data structures are riddled with non-threadsafe race conditions and risk > of CMEs. > Ex. The moverTaskFutures map. Adding new blocks and/or adding to a block's > list of futures is synchronized. However the run loop does an unsynchronized > block get, unsynchronized future remove, unsynchronized isEmpty, possibly > another unsynchronized get, only then does it do a synchronized remove of the > block. The whole chunk of code should be synchronized. > Is the problematic moverTaskFutures even needed? It's aggregating futures > per-block for seemingly no reason. Why track all the futures at all instead > of just relying on the completion service? As best I can tell: > It's only used to determine if a future from the completion service should be > ignored during shutdown. Shutdown sets the running boolean to false and > clears the entire datastructure so why not use the running boolean like a > check just a little further down? > As synchronization to sleep up to 2 seconds before performing a blocking > moverCompletionService.take, but only when it thinks there are no active > futures. I'll ignore the missed notify race that the bounded wait masks, but > the real question is why not just do the blocking take? > Why all the complexity? Am I missing something? > BlocksMovementsStatusHandler > Suffers same type of thread safety issues as StoragePolicySatisfyWorker. Ex. > blockIdVsMovementStatus is inconsistent synchronized. Does synchronize to > return an unmodifiable list which sadly does nothing to protect the caller > from CME. > handle is iterating over a non-thread safe list. > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org