Upgradable lock to allow simultaneous read operations while reportDiff is in 
progress in processing block reports
---------------------------------------------------------------------------------------------------------------

                 Key: HDFS-2490
                 URL: https://issues.apache.org/jira/browse/HDFS-2490
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Tomasz Nykiel
            Assignee: Tomasz Nykiel
         Attachments: FSNamesystemLock.java

Currently, FSNamesystem operations are protected by a single 
ReentrantReadWriteLock, which allows multiple concurrent readers but only a 
single writer at a time. There are, however, operations that mostly read and 
only occasionally write.
The best example is processing block reports - currently the entire 
processing is done under writeLock(). With HDFS-395 (explicit deletion acks), 
processing a block report is primarily a read operation (reportDiff()), after 
which only very few blocks need to be updated. In fact, we observed this number 
to be very low, or even zero.

It would be desirable to have an upgradeable read lock, which would allow 
other reads to proceed during the first "read" part of reportDiff() (and 
possibly of other operations).
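
As background for why a separate mechanism is needed, the short sketch below 
shows the documented behaviour of a single ReentrantReadWriteLock: a thread 
holding the read lock cannot acquire the write lock, while the reverse 
(write, then read) is allowed. The class and method names are hypothetical and 
exist only for this illustration; tryLock() is used instead of lock() so the 
demo returns rather than hangs.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of HDFS.
class ReadToWriteUpgradeDemo {

    /** True if a thread that holds the read lock can also take the write lock. */
    static boolean canUpgradeReadToWrite() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // With lock() instead of tryLock() this call would block forever.
            boolean upgraded = lock.writeLock().tryLock();
            if (upgraded) {
                lock.writeLock().unlock();
            }
            return upgraded;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** True if a thread that holds the write lock can also take the read lock. */
    static boolean canTakeReadWhileHoldingWrite() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.writeLock().lock();
        try {
            boolean acquired = lock.readLock().tryLock();
            if (acquired) {
                lock.readLock().unlock();
            }
            return acquired;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("read -> write: " + canUpgradeReadToWrite());        // false
        System.out.println("write -> read: " + canTakeReadWhileHoldingWrite()); // true
    }
}
```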

We implemented such a mechanism, which provides writeLock(), readLock(), 
upgradeableReadLock(), upgradeLock(), and downgradeLock(). I achieved this by 
employing two ReentrantReadWriteLocks - one protects writes (lock1), the other 
one reads (lock2).

Hence, we have:

writeLock()
  lock1.writeLock().lock()
  lock2.writeLock().lock()

readLock()
  lock2.readLock().lock()

upgradeableReadLock()
  lock1.writeLock().lock()

upgrade()
  lock2.writeLock().lock()
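
The listing above can be written out as a minimal Java sketch. This is NOT the 
attached FSNamesystemLock.java; the class names, all unlock methods, and the 
body of downgradeLock() are assumptions (the description names downgradeLock() 
but does not define it; here it is taken to release lock2 while keeping 
lock1). The upgrade step is called upgradeLock(), following the method list in 
the description. BlockReportSketch is a hypothetical stand-in for the 
report-processing path, with plain collections in place of the real block map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the two-lock scheme described in this issue.
class TwoLockSketch {
    // lock1 serializes writers and upgradeable readers against each other.
    private final ReentrantReadWriteLock lock1 = new ReentrantReadWriteLock();
    // lock2 keeps plain readers out while a write (or upgraded) section runs.
    private final ReentrantReadWriteLock lock2 = new ReentrantReadWriteLock();

    void writeLock()   { lock1.writeLock().lock(); lock2.writeLock().lock(); }
    void writeUnlock() { lock2.writeLock().unlock(); lock1.writeLock().unlock(); } // assumed

    void readLock()    { lock2.readLock().lock(); }
    void readUnlock()  { lock2.readLock().unlock(); }                              // assumed

    void upgradeableReadLock()   { lock1.writeLock().lock(); }
    void upgradeableReadUnlock() { lock1.writeLock().unlock(); }                   // assumed

    void upgradeLock()   { lock2.writeLock().lock(); }
    void downgradeLock() { lock2.writeLock().unlock(); }                           // assumed
}

// Hypothetical usage: read phase under the upgradeable lock, upgrade only if needed.
class BlockReportSketch {
    /** Adds reported block ids missing from 'known'; returns how many were added. */
    static int processReport(TwoLockSketch lock, List<Long> reported, Set<Long> known) {
        lock.upgradeableReadLock();
        try {
            // "Read" part: plain readers may still hold readLock() concurrently.
            List<Long> toAdd = new ArrayList<>();
            for (Long id : reported) {
                if (!known.contains(id)) {
                    toAdd.add(id);
                }
            }
            if (toAdd.isEmpty()) {
                return 0; // common case in the description: nothing to update
            }
            lock.upgradeLock(); // waits until current plain readers have left
            try {
                known.addAll(toAdd);
            } finally {
                lock.downgradeLock();
            }
            return toAdd.size();
        } finally {
            lock.upgradeableReadUnlock();
        }
    }
}
```

In this sketch the unlock order mirrors the lock order in reverse, and a 
report that changes nothing never touches lock2 at all.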

--------------------------

Hence a writeLock() is essentially equivalent to upgradeableLock()+upgrade():
- two writeLocks are mutually exclusive because of lock1.writeLock
- a writeLock and an upgradeableLock are mutually exclusive for the same reason
- readLock is mutually exclusive with upgradeableLock()+upgrade() OR writeLock 
because of lock2.writeLock
- readLock() followed by writeLock() in the same thread causes a deadlock, the 
same as currently
- writeLock() followed by readLock() in the same thread does not cause a 
deadlock
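
The cross-thread properties in this list can be checked with a small 
experiment on two plain ReentrantReadWriteLock objects. The sketch below is 
only an illustration under the lock1/lock2 naming used above; the class and 
helper names are hypothetical, and a second thread probes the locks with 
tryLock() so that nothing blocks.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of HDFS.
class TwoLockExclusionDemo {

    /** Probes 'lock' from another thread; releases it again if it was acquired. */
    static boolean otherThreadCanTake(ExecutorService other, Lock lock) {
        try {
            return other.submit(() -> {
                boolean ok = lock.tryLock();
                if (ok) {
                    lock.unlock();
                }
                return ok;
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns {readerDuringUpgradeable, writerDuringUpgradeable, readerAfterUpgrade}. */
    static boolean[] run() {
        ReentrantReadWriteLock lock1 = new ReentrantReadWriteLock();
        ReentrantReadWriteLock lock2 = new ReentrantReadWriteLock();
        ExecutorService other = Executors.newSingleThreadExecutor();
        try {
            lock1.writeLock().lock(); // upgradeableReadLock()
            // A plain reader only needs lock2.readLock(), which is still free.
            boolean readerDuringUpgradeable = otherThreadCanTake(other, lock2.readLock());
            // A writer (or second upgradeable reader) needs lock1.writeLock() first.
            boolean writerDuringUpgradeable = otherThreadCanTake(other, lock1.writeLock());

            lock2.writeLock().lock(); // upgrade()
            // After the upgrade, plain readers are shut out by lock2.writeLock().
            boolean readerAfterUpgrade = otherThreadCanTake(other, lock2.readLock());

            lock2.writeLock().unlock();
            lock1.writeLock().unlock();
            return new boolean[] {
                readerDuringUpgradeable, writerDuringUpgradeable, readerAfterUpgrade
            };
        } finally {
            other.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("reader while upgradeable lock held: " + r[0]); // true
        System.out.println("writer while upgradeable lock held: " + r[1]); // false
        System.out.println("reader after upgrade:               " + r[2]); // false
    }
}
```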

--------------------------

I am convinced of the soundness of this mechanism.
The overhead comes from having two locks; in particular, writes need to 
acquire both of them.
We deployed this feature and used the upgradeableLock() ONLY for processing 
reports.
Our initial, though not exhaustive, experiments have shown that it had a very 
detrimental effect on the NN throughput - writes were taking up to twice as 
long.
This is very unexpected, and hard to explain by the overhead of acquiring an 
additional lock for writes alone.

I would like to ask for input, as maybe I am missing some fundamental problem 
here.
I am attaching a Java class which implements this locking mechanism.



        
