caozhiqiang created HDFS-17818:
----------------------------------

             Summary: Fix serial fsimage transfer during checkpoint with multiple namenodes
                 Key: HDFS-17818
                 URL: https://issues.apache.org/jira/browse/HDFS-17818
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 3.5.0
            Reporter: caozhiqiang
            Assignee: caozhiqiang


In our cluster, each namespace has four NameNodes: one active, one standby, and 
two observers. When the standby NameNode performs a checkpoint, it transfers the 
fsimage to the other three NameNodes. However, we found that these transfers are 
performed serially.

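The cause is that corePoolSize in the ThreadPoolExecutor is 0. With a core size 
of 0, the executor starts a single worker thread and creates additional threads 
only once the LinkedBlockingQueue is full. Because the queue is sized to 
activeNNAddresses.size(), the transfer tasks never fill it, so only one thread 
transfers the fsimage at a time. This greatly increases the checkpoint time.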

 
{code:java}
    ExecutorService executor = new ThreadPoolExecutor(0, activeNNAddresses.size(), 100,
        TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(activeNNAddresses.size()),
        uploadThreadFactory);
{code}
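
The behavior described above can be reproduced outside HDFS with a small standalone sketch. The class name PoolSizeDemo, the method peakConcurrency, and the 200 ms sleep that stands in for an fsimage upload are all hypothetical and not taken from the Hadoop code. The sketch builds the pool with the same shape as the snippet above (maximum pool size and queue capacity both equal to the task count) and records how many tasks ran at the same time. Raising corePoolSize to the task count is shown only as one possible remedy, not necessarily the change adopted for this issue.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSizeDemo {

    /**
     * Submits `tasks` sleeping tasks to a pool whose maximum size and queue
     * capacity both equal `tasks`, and returns the peak number of tasks
     * observed running concurrently.
     */
    public static int peakConcurrency(int corePoolSize, int tasks) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize, tasks, 100,
            TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(tasks));
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                // Track how many tasks are in flight right now.
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(200); // stand-in for a slow transfer (assumption)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }

        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // corePoolSize = 0: tasks are queued and the queue never fills,
        // so the executor keeps a single worker thread.
        System.out.println("corePoolSize=0 peak: " + peakConcurrency(0, 3));
        // corePoolSize = task count: each submission can start its own thread.
        System.out.println("corePoolSize=3 peak: " + peakConcurrency(3, 3));
    }
}
```

With corePoolSize set to 0 the reported peak is 1, because every task fits in the queue and no extra thread is ever created. With corePoolSize equal to the task count, each submission starts its own worker, so the peak is normally the full task count.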

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
