[
https://issues.apache.org/jira/browse/HDFS-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tomasz Nykiel updated HDFS-2490:
--------------------------------
Attachment: FSNamesystemLock.java
> Upgradable lock to allow simultaneous read operation while reportDiff is in
> progress in processing block reports
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-2490
> URL: https://issues.apache.org/jira/browse/HDFS-2490
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Reporter: Tomasz Nykiel
> Assignee: Tomasz Nykiel
> Attachments: FSNamesystemLock.java
>
>
> Currently, FSNamesystem operations are protected by a single
> ReentrantReadWriteLock, which allows either multiple concurrent readers
> or a single writer. There are, however, operations that mostly read but
> occasionally write.
> The best example is processing block reports: currently the entire
> processing is done under writeLock(). With HDFS-395 (explicit deletion acks),
> processing a block report is primarily a read operation (reportDiff()), after
> which only very few blocks need to be updated. In fact, we noticed this
> number to be very low, or even zero.
> It would be desirable to have an upgradeable read lock, which would allow
> other reads to proceed during the first "read" part of reportDiff() (and
> possibly of other operations).
> We implemented such a mechanism, which provides writeLock(), readLock(),
> upgradeableReadLock(), upgradeLock(), and downgradeLock(). I achieved this by
> employing two ReentrantReadWriteLocks: one protects writes (lock1), the
> other one reads (lock2).
> Hence, we have:
> writeLock()
>     lock1.writeLock().lock()
>     lock2.writeLock().lock()
> readLock()
>     lock2.readLock().lock()
> upgradeableReadLock()
>     lock1.writeLock().lock()
> upgrade()
>     lock2.writeLock().lock()
> --------------------------
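The scheme listed above can be sketched in Java roughly as follows. This is a minimal illustration written from the description only, not the attached FSNamesystemLock.java. The class name, the unlock methods, the body of downgrade(), and the readerAdmitted() probe are assumptions added for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the two-lock scheme described in the issue. */
class UpgradeableLockSketch {
    // lock1 serializes writers and upgradeable readers against each other.
    private final ReentrantReadWriteLock lock1 = new ReentrantReadWriteLock();
    // lock2 separates plain readers from anyone who is actually writing.
    private final ReentrantReadWriteLock lock2 = new ReentrantReadWriteLock();

    public void writeLock() {
        lock1.writeLock().lock();
        lock2.writeLock().lock();
    }

    // Assumed release order: reverse of acquisition.
    public void writeUnlock() {
        lock2.writeLock().unlock();
        lock1.writeLock().unlock();
    }

    public void readLock() { lock2.readLock().lock(); }

    public void readUnlock() { lock2.readLock().unlock(); }

    public void upgradeableReadLock() { lock1.writeLock().lock(); }

    public void upgradeableReadUnlock() { lock1.writeLock().unlock(); }

    // Excludes plain readers; only valid while holding the upgradeable read lock.
    public void upgrade() { lock2.writeLock().lock(); }

    // Assumed inverse of upgrade(): lets plain readers back in.
    public void downgrade() { lock2.writeLock().unlock(); }

    /** Probe for the demo: can a plain reader on another thread get in right now? */
    public boolean readerAdmitted() {
        AtomicBoolean admitted = new AtomicBoolean(false);
        Thread reader = new Thread(() -> {
            if (lock2.readLock().tryLock()) {
                admitted.set(true);
                lock2.readLock().unlock();
            }
        });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return admitted.get();
    }

    public static void main(String[] args) {
        UpgradeableLockSketch lock = new UpgradeableLockSketch();
        lock.upgradeableReadLock();   // "read" phase, e.g. reportDiff()
        System.out.println("reader admitted during read phase: " + lock.readerAdmitted());
        lock.upgrade();               // a few blocks need updating
        System.out.println("reader admitted after upgrade: " + lock.readerAdmitted());
        lock.downgrade();
        lock.upgradeableReadUnlock();
    }
}
```

In this sketch a plain reader only ever touches lock2, so it is blocked only between upgrade() and downgrade(), while every write path pays for two lock acquisitions.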
> Hence a writeLock() is essentially equivalent to
> upgradeableReadLock()+upgrade():
> - two writeLocks are mutually exclusive because of lock1.writeLock
> - a writeLock and an upgradeableReadLock are mutually exclusive for the
> same reason
> - readLock is mutually exclusive with upgradeableReadLock()+upgrade() OR
> writeLock because of lock2.writeLock
> - readLock() followed by writeLock() in the same thread causes a deadlock,
> the same as currently
> - writeLock() followed by readLock() does not cause a deadlock
> --------------------------
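The last two points mirror how a single java.util.concurrent ReentrantReadWriteLock behaves today: a thread holding the read lock cannot obtain the write lock, while a thread holding the write lock may also take the read lock. The small probe below illustrates this with tryLock() so that it reports the outcome instead of hanging. The class and method names are hypothetical and chosen only for this example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical probe of read->write and write->read on one ReentrantReadWriteLock. */
class UpgradeDowngradeProbe {

    /** True if the write lock can be taken while this thread holds the read lock. */
    static boolean canUpgrade() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            // A blocking lock() here would never return; tryLock() just reports failure.
            boolean acquired = rw.writeLock().tryLock();
            if (acquired) {
                rw.writeLock().unlock();
            }
            return acquired;
        } finally {
            rw.readLock().unlock();
        }
    }

    /** True if the read lock can be taken while this thread holds the write lock. */
    static boolean canDowngrade() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.writeLock().lock();
        try {
            boolean acquired = rw.readLock().tryLock();
            if (acquired) {
                rw.readLock().unlock();
            }
            return acquired;
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("read then write succeeds: " + canUpgrade());
        System.out.println("write then read succeeds: " + canDowngrade());
    }
}
```

In the two-lock scheme the same reasoning applies to lock2, since readLock() holds lock2's read lock and writeLock() eventually needs lock2's write lock.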
> I am convinced of the soundness of this mechanism.
> The overhead comes from having two locks, and in particular, writes need to
> acquire both of them.
> We deployed this feature and used upgradeableReadLock() ONLY for processing
> block reports.
> Our initial, non-exhaustive experiments showed that it had a very
> detrimental effect on the NN throughput: writes were taking up to twice as
> long.
> This is very unexpected, and hard to explain solely by the overhead of
> acquiring an additional lock for writes.
> I would like to ask for input, as maybe I am missing some fundamental problem
> here.
> I am attaching a Java class that implements this locking mechanism.