[
https://issues.apache.org/jira/browse/HDFS-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799001#comment-13799001
]
qus-jiawei commented on HDFS-5341:
----------------------------------
Hi Prakash, thanks for the help.
In our Hadoop cluster, some datanodes store nearly 800 thousand blocks across
16 disks.
When a disk is under high I/O pressure, the file-length call can be slow,
and the scan then holds the dataset lock for too long.
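The idea behind the patch is to capture each block file's length once, during the unlocked disk report, so the later diff phase (which runs under the dataset lock) compares cached values instead of issuing one disk I/O per block. A minimal sketch of that caching pattern, using illustrative names rather than the actual HDFS ScanInfo class:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Illustrative stand-in for the patched ScanInfo: the block file's length
// is read once at construction time, during the unlocked disk report.
class CachedScanInfo {
    private final File blockFile;
    private final long blockFileLength; // captured outside the dataset lock

    CachedScanInfo(File blockFile) {
        this.blockFile = blockFile;
        // File.length() touches the disk; doing it here keeps that I/O
        // out of the synchronized diff phase.
        this.blockFileLength = (blockFile != null) ? blockFile.length() : 0;
    }

    long getBlockFileLength() {
        return blockFileLength;
    }

    File getBlockFile() {
        return blockFile;
    }
}

public class ScanInfoDemo {
    public static void main(String[] args) throws IOException {
        File f = Files.createTempFile("blk_", ".data").toFile();
        Files.write(f.toPath(), new byte[]{1, 2, 3, 4});
        CachedScanInfo info = new CachedScanInfo(f);
        // Under the lock we would now compare the cached length against the
        // in-memory replica's getNumBytes() -- no disk access needed.
        System.out.println(info.getBlockFileLength());
        f.delete();
    }
}
```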
> DirectoryScanner holds the dataset object lock too long during a scan, which
> makes the datanode turn into a dead node and blocks all receiving threads
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5341
> URL: https://issues.apache.org/jira/browse/HDFS-5341
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.0.4-alpha
> Reporter: qus-jiawei
>
> When DirectoryScanner runs its scan, it holds the dataset lock while diffing
> the block info between memory and disk. But it performs a lot of disk
> operations in that phase, because it calls the file's length() function.
> Such disk operations should move into the disk report instead.
> I don't have a commit account, so I post the patch here.
> 163d162
> < private final long blockFileLength;
> 174,179d172
> < if( this.blockFile != null ){
> < this.blockFileLength = this.blockFile.length();
> < }
> < else{
> < this.blockFileLength = 0;
> < }
> 190,193d182
> <
> < long getBlockFileLength(){
> < return blockFileLength;
> < }
> 249,256c238
> < new Daemon.DaemonFactory(){
> < @Override
> < public Thread newThread(Runnable runnable) {
> < Thread t = super.newThread(runnable);
> < t.setPriority(Thread.NORM_PRIORITY);
> < return t;
> < }
> < });
> ---
> > new Daemon.DaemonFactory());
> 358d339
> < LOG.info("UCADD check and update finish");
> 368d348
> < long begin = System.currentTimeMillis();
> 370c350
> < LOG.info("UCADD finish diskReport using:"+(System.currentTimeMillis()-begin)+"ms");
> ---
> >
> 373,375d352
> < begin = System.currentTimeMillis();
> < int diskHit = 0;
> < LOG.info("UCADD begin to synchronized");
> 415,423c392,396
> < || info.getBlockFileLength() != memBlock.getNumBytes() ) {
> < //double check the block file length
> < diskHit++;
> < if(info.getBlockFile().length() != memBlock.getNumBytes()){
> < // Block metadata file is missing or has wrong generation stamp,
> < // or block file length is different than expected
> < statsRecord.mismatchBlocks++;
> < addDifference(diffRecord, statsRecord, info);
> < }
> ---
> > || info.getBlockFile().length() != memBlock.getNumBytes()) {
> > // Block metadata file is missing or has wrong generation stamp,
> > // or block file length is different than expected
> > statsRecord.mismatchBlocks++;
> > addDifference(diffRecord, statsRecord, info);
> 437d409
> < LOG.info("UCADD end synchronized using:"+(System.currentTimeMillis()-begin)+"ms diskHit:"+diskHit);
> 439d410
> <
--
This message was sent by Atlassian JIRA
(v6.1#6144)