[
https://issues.apache.org/jira/browse/HDFS-15150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033592#comment-17033592
]
Stephen O'Donnell commented on HDFS-15150:
------------------------------------------
Thanks for the review.
Nice benchmarks at the link above. It's interesting that the unfair lock
performs much better, but probably at the cost of long tail latency in the
worst cases. It is also interesting that other locking methods perform better,
but we know the Reentrant RW lock does well in the Namenode, so I feel it
should be good for the DN too.
We will probably need a series of small Jiras to move various code paths to use
the read lock. To start with, I have created HDFS-15160 to address the
ReplicaMap, which is called by many other methods. I have a patch ready, but I
will hold off posting it until we commit this one, as it depends on this
change. Then I will create a few more Jiras to tackle other code paths.
> Introduce read write lock to Datanode
> -------------------------------------
>
> Key: HDFS-15150
> URL: https://issues.apache.org/jira/browse/HDFS-15150
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15150.001.patch, HDFS-15150.002.patch,
> HDFS-15150.003.patch
>
>
> HDFS-9668 pointed out the issues around the DN lock being a point of
> contention some time ago, but that Jira went in a direction of creating a new
> FSDataset implementation which is very risky, and activity on the Jira has
> stalled for a few years now. Edit: Looks like HDFS-9668 eventually went in a
> similar direction to what I was thinking, so I will review that Jira in more
> detail to see if this one is necessary.
> I feel there could be significant gains by moving to a ReentrantReadWrite
> lock within the DN. The current implementation is simply a ReentrantLock so
> any locker blocks all others.
> One place I think a read lock would benefit us significantly is when the DN
> is serving a lot of small blocks and there are jobs which perform a lot of
> reads. Starting to read any block currently takes the lock, but if we
> moved this to a read lock, many reads could do this at the same time.
> The first conservative step would be to change the current lock and then
> make all accesses to it obtain the write lock. That way, we should keep the
> current behaviour and then we can selectively move some lock accesses to the
> read lock in separate Jiras.
> I would appreciate any thoughts on this, and also if anyone has attempted it
> before and found any blockers.
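
The two-step approach described above can be sketched roughly as follows. This
is a minimal illustration, not the code from the attached patches: the class
DatasetLockSketch, its methods, and the block path used in the usage note are
all hypothetical, and a plain HashMap stands in for the real replica tracking.
It only assumes the standard java.util.concurrent.locks.ReentrantReadWriteLock
API, including its fair/unfair constructor flag.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a dataset guarded by one lock (not Hadoop code). */
public class DatasetLockSketch {
  // A read/write lock replaces the single exclusive lock.
  private final ReentrantReadWriteLock lock;
  // Assumption: a simple map from block id to a path stands in for replica state.
  private final Map<Long, String> replicas = new HashMap<>();

  /** @param fair true for a fair lock, false for the (default) unfair mode. */
  public DatasetLockSketch(boolean fair) {
    this.lock = new ReentrantReadWriteLock(fair);
  }

  /** Mutations always take the write lock, so they remain exclusive. */
  public void addReplica(long blockId, String path) {
    lock.writeLock().lock();
    try {
      replicas.put(blockId, path);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /**
   * Step 1: a lookup that still takes the write lock. Every caller excludes
   * every other caller, which matches the behaviour of a plain ReentrantLock.
   */
  public String getReplicaExclusive(long blockId) {
    lock.writeLock().lock();
    try {
      return replicas.get(blockId);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /**
   * Later step: the same lookup moved to the read lock. Many threads may hold
   * the read lock at once, but none while a writer holds the write lock.
   */
  public String getReplicaShared(long blockId) {
    lock.readLock().lock();
    try {
      return replicas.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

For example, after new DatasetLockSketch(false).addReplica(1L, "/data/blk_1"),
both getReplicaExclusive(1L) and getReplicaShared(1L) return the same path; the
only difference is whether concurrent callers of the lookup block each other.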