In general, the DN does not perform reads from files under a big lock.
We only need the lock for protecting the replica map and some of the
block state.  This lock hasn't really been a big problem in the past
and I would hesitate to add complexity here (although I haven't
thought about it that hard at all, so maybe I'm wrong!)
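
Roughly, the read path pattern looks like the sketch below. This is a
deliberately simplified, hypothetical illustration (none of these names are
real FsDatasetImpl members): the monitor is held only for the in-memory
replica lookup, and the actual file I/O happens outside it.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

class DatasetSketch {
  private final Map<Long, File> replicaMap = new HashMap<>();

  InputStream openBlock(long blockId) throws IOException {
    File blockFile;
    synchronized (this) {                  // short critical section: in-memory lookup only
      blockFile = replicaMap.get(blockId);
    }
    if (blockFile == null) {
      throw new FileNotFoundException("no replica for block " + blockId);
    }
    return new FileInputStream(blockFile); // the disk read happens with no lock held
  }
}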

Are you sure that you are not hitting HDFS-7489?

In general, the client does a few KB of readahead to avoid swamping the
DN with tons of tiny requests, which are a bad idea for many other
reasons anyway (RPC overhead, seek overhead, etc.).
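
If you want to experiment with larger readahead, something along these lines
should work. Treat it as a sketch: the property name and the per-stream
setReadahead() call are worth verifying against your Hadoop version, and the
file path is just a placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadaheadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // assumed property name -- check DFSConfigKeys in your Hadoop version
    conf.setLong("dfs.client.cache.readahead", 1L << 20);

    FileSystem fs = FileSystem.get(conf);
    try (FSDataInputStream in = fs.open(new Path("/data/example-file"))) {
      in.setReadahead(1L << 20);          // per-stream override, if supported
      byte[] buf = new byte[64 * 1024];   // read in reasonably large chunks
      while (in.read(buf) != -1) {
        // process buf ...
      }
    }
  }
}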

You can also look into short-circuit reads, which a lot of
high-performance systems use to avoid the DataNode overhead altogether
for local reads.
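
Enabling short-circuit reads is mostly configuration. A rough client-side
sketch follows; the socket path is just an example, and the same two
properties normally go into hdfs-site.xml on the DataNodes as well.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ShortCircuitExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // example socket path; must match what the local DataNode is configured with
    conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");

    FileSystem fs = FileSystem.get(conf);
    // reads of blocks stored on this host can now bypass the DataXceiver path
  }
}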

regards,
Colin

On Sat, Feb 14, 2015 at 10:43 PM, Sukunhui (iEBP) <sukun...@huawei.com> wrote:
> I have a cluster that writes/reads/deletes lots of small files.
> I dumped the stack of one Datanode and found that it has more than 100
> sessions for reading/writing blocks, with 100+ DataXceiver threads waiting to
> lock <0x00007f9b26ce9530> (a
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>
> I find that FsDatasetImpl.java and ReplicaMap.java use the `synchronized`
> keyword heavily for synchronization. It’s horrible.
> First, locking for every read is unnecessary and decreases concurrency.
> Second, Java monitors (synchronized/wait/notify/notifyAll) are non-fair
> (http://stackoverflow.com/questions/11275699/synchronized-release-order),
> which causes many DFSClient timeouts.
>
> I’m thinking we can use ReentrantReadWriteLock for synchronization. What do 
> you guys think?
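
For reference, a minimal sketch of what that proposal amounts to: a fair
ReentrantReadWriteLock in place of synchronized, so lookups can proceed
concurrently and waiting threads are granted the lock in roughly FIFO order.
The class and member names are made up for illustration; this is not the
actual FsDatasetImpl code.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReplicaMapSketch {
  // "true" requests fair ordering, so waiting threads acquire roughly FIFO
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final Map<Long, String> replicas = new HashMap<>();

  String get(long blockId) {
    lock.readLock().lock();       // many readers can hold the read lock at once
    try {
      return replicas.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }

  void put(long blockId, String replica) {
    lock.writeLock().lock();      // writers are exclusive
    try {
      replicas.put(blockId, replica);
    } finally {
      lock.writeLock().unlock();
    }
  }
}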
