[
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521002#comment-14521002
]
Zhe Zhang commented on HDFS-8178:
---------------------------------
Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, we
are actually trying to solve a different problem here. The objective of
HDFS-5919 is solely to save disk space (since FJM doesn't try to process those
corrupt/empty files anyway). It's a safe cleanup, making sure the tx IDs of
empty / corrupt files are old enough before purging. So I think we should do
the same in QJM.
Our main target here is _stale_ in-progress edit log files, which are not
necessarily empty/corrupt (so they won't be marked as such). As the updated
description states, we want to properly take care of those files so QJM doesn't
try to process them. I like your proposal of renaming / moving aside those files
and removing them when they are older than {{minTxIdToKeep}}. I'll update the
patch based on this idea.
I also propose we do the same for corrupt / empty files, for both FJM and QJM.
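
To make the rename-then-purge idea concrete, here is a minimal sketch. It is
not the patch itself. It assumes in-progress segments are named
edits_inprogress_<startTxId>, and it treats an in-progress file as stale when
its start tx ID is lower than that of the segment currently being written.
The class name, the method names, the ".stale" suffix and the
currentSegmentTxId parameter are all hypothetical and chosen only for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; none of these names come from the actual patch.
final class StaleEditsSketch {
    // Assumed naming of in-progress segments: edits_inprogress_<startTxId>.
    static final Pattern IN_PROGRESS = Pattern.compile("edits_inprogress_(\\d+)");
    // Hypothetical suffix used to mark a file as moved aside.
    static final String STALE_SUFFIX = ".stale";
    static final Pattern STALE =
        Pattern.compile("edits_inprogress_(\\d+)" + Pattern.quote(STALE_SUFFIX));

    /**
     * Returns the name a stale in-progress file should be renamed to, or null
     * if the file is not in-progress or is not older than the current segment.
     */
    static String moveAsideName(String fileName, long currentSegmentTxId) {
        Matcher m = IN_PROGRESS.matcher(fileName);
        if (!m.matches()) {
            return null; // finalized or unrelated file: leave it alone
        }
        long startTxId = Long.parseLong(m.group(1));
        return startTxId < currentSegmentTxId ? fileName + STALE_SUFFIX : null;
    }

    /** Returns the moved-aside files whose start tx ID is below minTxIdToKeep. */
    static List<String> purgeable(List<String> fileNames, long minTxIdToKeep) {
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = STALE.matcher(name);
            if (m.matches() && Long.parseLong(m.group(1)) < minTxIdToKeep) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String moved = moveAsideName("edits_inprogress_0000000000000000005", 100L);
        System.out.println("move aside as: " + moved);
        System.out.println("purgeable: " + purgeable(Arrays.asList(moved), 20L));
    }
}
```

Keeping the two steps separate means a moved-aside file is no longer picked up
as a regular in-progress segment, but it is still retained until it falls
below {{minTxIdToKeep}}.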
> QJM doesn't move aside stale inprogress edits files
> ---------------------------------------------------
>
> Key: HDFS-8178
> URL: https://issues.apache.org/jira/browse/HDFS-8178
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: qjm
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-8178.000.patch
>
>
> When a QJM crashes, the in-progress edit log file at that time remains in the
> file system. When the node comes back, it will accept new edit logs and those
> stale in-progress files are never cleaned up. QJM treats them as regular
> in-progress edit log files and tries to finalize them, which potentially
> causes high memory usage. This JIRA aims to move aside those stale edit log
> files to avoid this scenario.