[ 
https://issues.apache.org/jira/browse/HDFS-17818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caozhiqiang updated HDFS-17818:
-------------------------------
    Status: Patch Available  (was: Open)

> Fix serial fsimage transfer during checkpoint with multiple namenodes
> ---------------------------------------------------------------------
>
>                 Key: HDFS-17818
>                 URL: https://issues.apache.org/jira/browse/HDFS-17818
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.5.0
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>              Labels: pull-request-available
>
> In our cluster, each namespace has four NameNodes: one active, one standby, 
> and two observers. When the standby NameNode performs a checkpoint, it 
> transfers the fsimage to the other three NameNodes. However, we found that 
> these transfers are performed serially.
> The reason is that the corePoolSize of the ThreadPoolExecutor is 0. A 
> ThreadPoolExecutor starts threads beyond corePoolSize only when its work 
> queue is full, and the transfer tasks never fill the LinkedBlockingQueue, 
> so only one thread transfers the fsimage at a time. This greatly increases 
> the checkpoint time.
> {code:java}
>     ExecutorService executor = new ThreadPoolExecutor(0, activeNNAddresses.size(), 100,
>         TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(activeNNAddresses.size()),
>         uploadThreadFactory);
> {code}
>  
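
The queueing behaviour described in the report can be reproduced outside HDFS. The following is a minimal sketch, not code from the patch: the class name PoolSizingDemo, the method peakConcurrency, and the 200 ms sleep standing in for an fsimage upload are all hypothetical. It builds a pool with the same shape as the one quoted above (maximum size and queue capacity both equal to the task count) and records how many tasks run at the same time for a given corePoolSize.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSizingDemo {

    /**
     * Submits `tasks` sleeping tasks to a pool shaped like the one in the
     * report and returns the highest number observed running concurrently.
     */
    static int peakConcurrency(int corePoolSize, int tasks) throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize, tasks, 100,
            TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(tasks));
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    // Hypothetical stand-in for one fsimage upload.
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }

        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // corePoolSize = 0: tasks are queued behind a single worker thread.
        System.out.println("corePoolSize=0 peak: " + peakConcurrency(0, 3));
        // corePoolSize = task count: each submission starts its own thread.
        System.out.println("corePoolSize=3 peak: " + peakConcurrency(3, 3));
    }
}
```

With corePoolSize set to 0 the peak should be 1, because every task fits in the queue and the executor never has a reason to start a second thread. With corePoolSize equal to the task count the peak is normally 3. Raising corePoolSize is only one possible remedy and is an assumption here, not necessarily what the attached patch does; if core threads should still exit when idle, ThreadPoolExecutor.allowCoreThreadTimeOut(true) can be combined with it.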



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
