[
https://issues.apache.org/jira/browse/HDFS-14862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937794#comment-16937794
]
David Mollitor commented on HDFS-14862:
---------------------------------------
[~elgoiri] Thank you for taking a look.
I'll take a look at this again.
This class is a bit confusing, the {{getLocations()}} method included. There
is no reason that this method needs to be synchronized at all because the
field {{locations}} is declared {{final}}, so the reference it returns will
never change. Since the reference never changes, there's no need to
synchronize the getter.
There are no comments in the code, so it's a bit hard to understand how it's
being used, but it may be a life-cycle thing. That is, multiple threads may be
used to add new locations to the block, but then at the end, only a single
thread accesses the results (through {{getLocations()}}).
However, I just realized that the blocks are also being synchronized
externally:
https://github.com/apache/hadoop/blob/1de25d134f64d815f9b43606fa426ece5ddbc430/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java#L831
I think I'll drop the external synchronization in {{Dispatcher.java}}, keep the
synchronization on the collection (because the field is {{protected}}) and put
a comment on the {{getLocations()}} method that warns users against modifying
the returned List: changing its contents is not thread safe, and iterating over
it may throw a {{ConcurrentModificationException}} if the underlying collection
is modified.
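
To illustrate the hazard that the comment would warn about, here is a minimal,
single-threaded sketch. It assumes a simplified stand-in class (the names
{{LiveListSketch}}, {{addLocation}} and the {{String}} locations are
hypothetical, not the actual Hadoop code). The collection is wrapped with
{{Collections.synchronizedList}}, which guards individual calls but not
iteration, so a change to the list in the middle of a loop over the returned
List fails fast.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

/** Hypothetical stand-in for the Locations class; not the Hadoop code. */
class LiveListSketch {

  static class Locations<L> {
    // Synchronizing the collection itself protects single calls such as
    // add() and clear(), even when a subclass touches the field directly.
    protected final List<L> locations =
        Collections.synchronizedList(new ArrayList<L>(3));

    public void addLocation(L loc) {
      locations.add(loc);
    }

    /**
     * @return the live internal list. Iterating over it is not thread safe:
     *         a concurrent change may cause ConcurrentModificationException.
     */
    public List<L> getLocations() {
      return locations;
    }
  }

  /** @return true if iterating over the live list failed fast. */
  static boolean iterationFailsOnModification() {
    Locations<String> block = new Locations<>();
    block.addLocation("dn1");
    block.addLocation("dn2");
    try {
      for (String loc : block.getLocations()) {
        // Stands in for another writer changing the list mid-iteration.
        block.addLocation(loc + "-copy");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(iterationFailsOnModification());
  }
}
```

The same failure can occur when the writer is a different thread, which is why
callers that iterate would need to hold the list's lock themselves or work on
a copy.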
> Review of MovedBlocks
> ---------------------
>
> Key: HDFS-14862
> URL: https://issues.apache.org/jira/browse/HDFS-14862
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Affects Versions: 3.2.0
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
> Attachments: HDFS-14862.1.patch
>
>
> Internal data structure needs to be protected (synchronized) but is scoped as
> {{protected}} so any sub-class could modify without a lock. Synchronize the
> collection itself for protection. It also returns the internal data
> structure in {{getLocations}} so the structure could be modified outside of
> the lock. Create a copy instead.
> {code:java}
> /** The locations of the replicas of the block. */
> protected final List<L> locations = new ArrayList<L>(3);
>
> public Locations(Block block) {
>   this.block = block;
> }
>
> /** clean block locations */
> public synchronized void clearLocations() {
>   locations.clear();
> }
> ...
> /** @return its locations */
> public synchronized List<L> getLocations() {
>   return locations;
> }
> {code}
>
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java#L43]
> Also, remove a bunch of superfluous and complicated code.
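
The two changes proposed in the description (synchronize the collection itself
and return a copy from {{getLocations}}) could look roughly like the sketch
below. This is an illustration under assumptions, not the attached patch: the
class name {{CopyOnReadSketch}}, the {{addLocation}} method and the {{String}}
locations are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the Locations class; not the attached patch. */
class CopyOnReadSketch {

  static class Locations<L> {
    // The wrapper locks every single call, so a subclass that reaches the
    // protected field directly still goes through the same lock.
    protected final List<L> locations =
        Collections.synchronizedList(new ArrayList<L>(3));

    public void addLocation(L loc) {
      locations.add(loc);
    }

    public void clearLocations() {
      locations.clear();
    }

    /** @return a snapshot of the locations, copied while holding the lock. */
    public List<L> getLocations() {
      // Hold the wrapper's lock explicitly so the copy is taken atomically.
      synchronized (locations) {
        return new ArrayList<>(locations);
      }
    }
  }

  public static void main(String[] args) {
    Locations<String> block = new Locations<>();
    block.addLocation("dn1");
    block.addLocation("dn2");
    List<String> snapshot = block.getLocations();
    block.clearLocations();
    // The snapshot is unaffected by the later clear.
    System.out.println(snapshot.size());
  }
}
```

The trade-off is one allocation per call to {{getLocations}}; in exchange,
callers can iterate over or modify the returned List without affecting the
internal state.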