[ 
https://issues.apache.org/jira/browse/HDFS-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991358#comment-15991358
 ] 

Zhe Zhang commented on HDFS-11714:
----------------------------------

Thanks for the clarification [~kihwal]. +1 on the latest patch pending cosmetic 
comments #2~3. Comment #1 is up to you.

bq. The equivalent code for non-HA case (saveNamespace) also unconditionally 
overwrites existing VERSION. The reasoning is, regardless of previous state, 
now it has the up-to-date checkpoint, so it should have an accompanying VERSION 
file. So it is expected to overwrite if a VERSION already exists. I don't think 
we need to do anything here.
Agreed.

bq. At minimum, it already logs a WARN. What do you think should be done?
My question was more about the retention behavior: should the retention manager 
catch the exception, log it, and keep purging old fsimage files? In 
either case, I don't think it should block this JIRA -- this JIRA already fixes 
the most immediate issue.
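
To make the retention question concrete, here is a rough sketch of the "catch, log, and keep purging" behavior. It is not the actual retention manager code: RetentionSketch, StorageDir, and purgeAll are hypothetical names, and it assumes a per-directory purge can fail with an IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names come from Hadoop.
final class RetentionSketch {

    /** Hypothetical stand-in for a single NN storage directory. */
    interface StorageDir {
        String name();

        /** Assumed to throw if the directory cannot be purged (e.g. no VERSION file). */
        void purgeOldImages() throws IOException;
    }

    /**
     * Tries to purge every directory. A failure in one directory is logged
     * and recorded, but does not stop the purge of the remaining ones.
     *
     * @return names of the directories whose purge failed
     */
    static List<String> purgeAll(List<StorageDir> dirs) {
        List<String> failed = new ArrayList<>();
        for (StorageDir dir : dirs) {
            try {
                dir.purgeOldImages();
            } catch (IOException e) {
                // Log and move on instead of aborting the whole purge pass.
                System.err.println("WARN: could not purge " + dir.name()
                        + ": " + e.getMessage());
                failed.add(dir.name());
            }
        }
        return failed;
    }
}
```

Returning the failed directory names leaves it to the caller to decide whether a repeated failure should be escalated beyond a WARN.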

> Newly added NN storage directory won't get initialized and cause space 
> exhaustion
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-11714
>                 URL: https://issues.apache.org/jira/browse/HDFS-11714
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.3
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-11714.trunk.patch, HDFS-11714.v2.branch-2.patch, 
> HDFS-11714.v2.trunk.patch
>
>
> When an empty namenode storage directory is detected on normal NN startup, it 
> may not be fully initialized. The new directory is still part of "in-service" 
> NNStorage, and when a checkpoint image is uploaded, a copy will also be written 
> there.  However, the retention manager won't be able to purge old files since 
> the directory lacks a VERSION file.  This causes fsimages to pile up in the 
> directory.  With a big namespace, the disk will fill up in a matter of 
> days or weeks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

