[
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhe Zhang updated HDFS-8303:
----------------------------
Description:
As the first step of the consolidation effort, QJM should call its FJM to purge
the current directory.
The current QJM logic for purging the current dir is very similar to the FJM
purging logic.
QJM:
{code}
  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
      ImmutableList.of(
        Pattern.compile("edits_\\d+-(\\d+)"),
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
      long txid = Long.parseLong(matcher.group(1));
      if (txid < minTxIdToKeep) {
        LOG.info("Purging no-longer needed file " + txid);
        if (!f.delete()) {
...
{code}
FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
      NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
      NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
      NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
    List<EditLogFile> editLogs = matchEditLogs(files, true);
    for (EditLogFile log : editLogs) {
      if (log.getFirstTxId() < minTxIdToKeep &&
          log.getLastTxId() < minTxIdToKeep) {
        purger.purgeLog(log);
      }
    }
{code}
I can see 2 differences:
# When matching empty/corrupt in-progress files, QJM requires that the suffix
contain no blank spaces
# FJM verifies that both the start and end txIDs of a finalized edit file are
old enough
Both seem safer than the QJM logic.
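To make the two differences concrete, here is a minimal standalone sketch that runs the regexes quoted above against a few sample file names. It is not the actual QJM or FJM code. The class and method names are hypothetical, the literals "edits" and "edits_inprogress" are assumed to be what NameNodeFile.EDITS.getName() and NameNodeFile.EDITS_INPROGRESS.getName() return, and both code paths are assumed to use full-string matching via Matcher.matches().
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PurgeComparison {
    // Patterns copied from the snippets above. The "edits" and
    // "edits_inprogress" prefixes are assumptions standing in for
    // NameNodeFile.EDITS.getName() and NameNodeFile.EDITS_INPROGRESS.getName().
    static final Pattern QJM_FINALIZED =
        Pattern.compile("edits_\\d+-(\\d+)");
    static final Pattern QJM_INPROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?");
    static final Pattern FJM_FINALIZED =
        Pattern.compile("edits_(\\d+)-(\\d+)");
    static final Pattern FJM_INPROGRESS_STALE =
        Pattern.compile("edits_inprogress_(\\d+).*(\\S+)");

    // QJM-style decision: only the single captured txid (group 1) is compared.
    // For finalized files that captured group is the END txid.
    static boolean qjmWouldPurge(String name, long minTxIdToKeep) {
        for (Pattern p : new Pattern[] {QJM_FINALIZED, QJM_INPROGRESS}) {
            Matcher m = p.matcher(name);
            if (m.matches()) {
                return Long.parseLong(m.group(1)) < minTxIdToKeep;
            }
        }
        return false;
    }

    // FJM-style decision for finalized files: both the first and the last
    // txid must be below the threshold.
    static boolean fjmWouldPurgeFinalized(String name, long minTxIdToKeep) {
        Matcher m = FJM_FINALIZED.matcher(name);
        if (!m.matches()) {
            return false;
        }
        long first = Long.parseLong(m.group(1));
        long last = Long.parseLong(m.group(2));
        return first < minTxIdToKeep && last < minTxIdToKeep;
    }

    // Full-string match helper (assumes Matcher.matches() semantics).
    static boolean matchesFully(Pattern p, String name) {
        return p.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Malformed finalized name whose start txid is larger than its end txid.
        String odd = "edits_100-50";
        System.out.println("QJM purges " + odd + ": " + qjmWouldPurge(odd, 75));
        System.out.println("FJM purges " + odd + ": " + fjmWouldPurgeFinalized(odd, 75));

        // In-progress name whose suffix ends in a blank space.
        String blank = "edits_inprogress_5.empty ";
        System.out.println("QJM regex matches: " + matchesFully(QJM_INPROGRESS, blank));
        System.out.println("FJM stale regex matches: " + matchesFully(FJM_INPROGRESS_STALE, blank));
    }
}
```
For well-formed finalized names (start <= end) the two purge decisions agree. They can only diverge on a name such as edits_100-50, where checking the end txid alone is not sufficient.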
was:
As the first step of the consolidation effort, QJM should call its FJM to purge
the current directory.
The current QJM logic for purging the current dir is very similar to the FJM
purging logic.
QJM:
{code}
  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
      ImmutableList.of(
        Pattern.compile("edits_\\d+-(\\d+)"),
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
      long txid = Long.parseLong(matcher.group(1));
      if (txid < minTxIdToKeep) {
        LOG.info("Purging no-longer needed file " + txid);
        if (!f.delete()) {
...
{code}
FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
      NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
      NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
      NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
    List<EditLogFile> editLogs = matchEditLogs(files, true);
    for (EditLogFile log : editLogs) {
      if (log.getFirstTxId() < minTxIdToKeep &&
          log.getLastTxId() < minTxIdToKeep) {
        purger.purgeLog(log);
      }
    }
{code}
I can see 2 differences:
# FJM has a slightly stricter match for empty/corrupt in-progress files: the
suffix must not contain blank spaces
# FJM verifies that both the start and end txIDs of a finalized edit file are
old enough
Both seem safer than the QJM logic.
> QJM should purge old logs in the current directory through FJM
> --------------------------------------------------------------
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang