[
https://issues.apache.org/jira/browse/HDFS-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797476#comment-13797476
]
Suresh Srinivas commented on HDFS-5367:
---------------------------------------
I agree that when storage directories are being restored during rollEditLog,
saving an fsimage that will soon be replaced by a new checkpointed fsimage
seems unnecessary.
+1 for the patch. I will commit it soon to branch-1.
bq. John, could you please provide a patch for trunk as well?
Trunk is a lot different from branch-1. Let me know if you need help. Based on
the analysis, this change may not be needed on trunk.
> Restore fsimage locked NameNode too long when the size of fsimage are big
> -------------------------------------------------------------------------
>
> Key: HDFS-5367
> URL: https://issues.apache.org/jira/browse/HDFS-5367
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: zhaoyunjiong
> Assignee: zhaoyunjiong
> Attachments: HDFS-5367-branch-1.2.patch
>
>
> Our cluster has a 40G fsimage, and we write one copy of the edit log to NFS.
> After NFS failed temporarily, the NameNode tried to recover the storage
> directory during a checkpoint by saving the 40G fsimage to NFS. This took a
> long time (> 40G / 128MB/s = 320 seconds) while holding the FSNamesystem
> lock, which brought down our cluster.
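For reference, the 320-second figure in the description can be reproduced with the short sketch below. It assumes binary units (1G = 1024MB) and treats the reporter's 128MB/s as a sustained write rate to NFS; the function name is hypothetical and not part of Hadoop. Since the description gives the figure as a lower bound, the actual time spent holding the lock was presumably longer.

```python
def min_save_seconds(image_size_gb: float, write_rate_mb_per_s: float) -> float:
    """Estimate the minimum time to write an fsimage of the given size.

    Assumes binary units (1G = 1024MB) and a constant write rate;
    both are simplifications, not measurements.
    """
    image_size_mb = image_size_gb * 1024
    return image_size_mb / write_rate_mb_per_s


# Figures taken from the issue description: 40G fsimage, 128MB/s to NFS.
print(min_save_seconds(40, 128))  # 320.0
```

Under the same assumptions the estimate scales linearly with fsimage size, so a larger image stretches the stall proportionally.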