You might also be interested in https://issues.apache.org/jira/browse/HDFS-1148, which I worked on a bit a number of years ago. Per the last comment in that JIRA, I don't think it's very valuable anymore, given the predominance of short-circuit reads in high-performance workloads these days. If you've got some jstacks showing high contention on these locks under some workload, though, it would be interesting to see them.
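
For reference, here is a minimal sketch of the ReentrantReadWriteLock idea raised in the quoted message below. It is an illustration only, not the actual FsDatasetImpl or ReplicaMap code: the class name ReplicaMapSketch, its methods, and the block-ID-to-path map are all hypothetical. It assumes lookups vastly outnumber updates, and it constructs the lock in fair mode to address the fairness concern, which typically costs some throughput.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a replica map; NOT the real HDFS ReplicaMap.
class ReplicaMapSketch {
    // 'true' requests fair ordering: the longest-waiting thread is favored,
    // at the price of lower overall throughput than the default mode.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    // Block ID -> replica path (an assumed, simplified representation).
    private final Map<Long, String> replicas = new HashMap<>();

    // Lookups take the shared read lock, so many readers can proceed at once.
    String get(long blockId) {
        lock.readLock().lock();
        try {
            return replicas.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Mutations take the exclusive write lock, blocking readers and writers.
    void add(long blockId, String path) {
        lock.writeLock().lock();
        try {
            replicas.put(blockId, path);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns the removed path, or null if the block was not present.
    String remove(long blockId) {
        lock.writeLock().lock();
        try {
            return replicas.remove(blockId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Size is a read-only query, so the read lock is sufficient.
    int size() {
        lock.readLock().lock();
        try {
            return replicas.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Whether this helps in practice depends on how long the lock is actually held and on the read/write mix, which is why jstacks from a real workload would be the thing to look at first.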
-Todd

On Tue, Feb 17, 2015 at 2:18 PM, Colin P. McCabe <cmcc...@apache.org> wrote:

> In general, the DN does not perform reads from files under a big lock.
> We only need the lock for protecting the replica map and some of the
> block state. This lock hasn't really been a big problem in the past
> and I would hesitate to add complexity here (although I haven't
> thought about it that hard at all, so maybe I'm wrong!)
>
> Are you sure that you are not hitting HDFS-7489?
>
> In general, the client normally does some readahead of a few KB to
> avoid swamping the DN with tons of tiny requests. Tons of tiny
> requests are a bad idea for many other reasons (RPC overhead, seek
> overhead, etc.)
>
> You can also look into using short-circuit reads to avoid the DataNode
> overhead altogether for local reads, which a lot of high-performance
> systems do.
>
> regards,
> Colin
>
> On Sat, Feb 14, 2015 at 10:43 PM, Sukunhui (iEBP) <sukun...@huawei.com>
> wrote:
> > I have a cluster that writes, reads, and deletes lots of small files.
> > I dumped the stack of one DataNode and found that it has 100+
> > sessions for reading/writing blocks: 100+ DataXceiver threads
> > waiting to lock <0x00007f9b26ce9530> (a
> > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> >
> > I find that FsDatasetImpl.java and ReplicaMap.java use the
> > `synchronized` keyword heavily for synchronization. It’s horrible.
> > First, locking for every read is unnecessary and decreases
> > concurrency.
> > Second, Java monitors (synchronized/wait/notify/notifyAll) are
> > non-fair
> > (http://stackoverflow.com/questions/11275699/synchronized-release-order),
> > which will cause many DFSClient timeouts.
> >
> > I’m thinking we can use ReentrantReadWriteLock for synchronization.
> > What do you guys think?

--
Todd Lipcon
Software Engineer, Cloudera