I have a cluster that writes/reads/deletes lots of small files.
I dumped the stack of one Datanode and found that it has 100+ active 
sessions reading/writing blocks, with 100+ DataXceiver threads waiting to lock 
<0x00007f9b26ce9530> (a 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)

I find that FsDatasetImpl.java and ReplicaMap.java use the `synchronized` 
keyword heavily for synchronization. It's horrible.
First, locking every read is unnecessary and decreases concurrency.
Second, Java monitors (synchronized/wait/notify/notifyAll) are non-fair 
(http://stackoverflow.com/questions/11275699/synchronized-release-order), 
which causes many DFSClient timeouts.

I’m thinking we can use ReentrantReadWriteLock for synchronization. What do you 
guys think?
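To sketch the idea: a minimal, hypothetical example (not the actual FsDatasetImpl/ReplicaMap code) of guarding a block-id-to-replica map with a fair ReentrantReadWriteLock, so concurrent readers don't serialize and waiting threads acquire the lock roughly in arrival order:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the proposed locking scheme. The class and map
// are stand-ins, not the real HDFS data structures.
public class ReplicaMapSketch {
    // 'true' requests the fair policy, avoiding the unfair hand-off
    // behavior of plain synchronized monitors.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final Map<Long, String> replicas = new HashMap<>();

    public String get(long blockId) {
        lock.readLock().lock();   // many readers may hold this concurrently
        try {
            return replicas.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    public void add(long blockId, String replica) {
        lock.writeLock().lock();  // exclusive: blocks both readers and writers
        try {
            replicas.put(blockId, replica);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Reads only contend with writes, not with each other, which is the case that matters for a read-heavy DataXceiver workload. Note that the fair policy trades some throughput for the FIFO ordering.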
