[
https://issues.apache.org/jira/browse/HDFS-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902285#comment-14902285
]
Vinayakumar B commented on HDFS-8771:
-------------------------------------
I think it's a good idea to make purge asynchronous to unblock write requests.
Some comments about the patch.
1. {{void purgeDataOlderThan(final long minTxIdToKeep) throws IOException {}}
This method no longer throws any exception, so the {{throws}} clause can be
removed.
2. {{setUncaughtExceptionHandler(UncaughtExceptionHandlers.systemExit())}}
I think shutting down the entire JN on an IOException during purge may not be
good. During purge, the only call that can result in an IOE is
{{FileUtil.listFiles(dir)}}, which might be due to a disk error. Since this
exception cannot be propagated back to the NN, I feel it would be better to
handle it inside {{call()}} and log a WARN. Let further synchronous write
requests handle the IOE as required. For any other exception, letting the JN
shut down is okay.
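
As a rough illustration of point 2 (not the patch's actual code), the sketch below shows a purge task whose {{call()}} catches only IOException and logs a warning, while any other exception still propagates out of {{call()}}. The names {{AsyncPurgeSketch}}, {{Purger}} and {{PurgeTask}} are hypothetical, and {{java.util.logging}} stands in for the JN's real logger so that the example is self-contained.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

public class AsyncPurgeSketch {
    private static final Logger LOG = Logger.getLogger("AsyncPurgeSketch");

    /** Hypothetical stand-in for the real purge work (e.g. listing and deleting files). */
    interface Purger {
        void purge(long minTxIdToKeep) throws IOException;
    }

    /** Purge task that logs an IOException instead of letting it escape. */
    static final class PurgeTask implements Callable<Boolean> {
        private final Purger purger;
        private final long minTxIdToKeep;

        PurgeTask(Purger purger, long minTxIdToKeep) {
            this.purger = purger;
            this.minTxIdToKeep = minTxIdToKeep;
        }

        @Override
        public Boolean call() {
            try {
                purger.purge(minTxIdToKeep);
                return true;
            } catch (IOException e) {
                // Likely a disk problem: warn and move on, so that later
                // synchronous writes can surface the error as required.
                LOG.log(Level.WARNING,
                        "Failed to purge data older than txid " + minTxIdToKeep, e);
                return false;
            }
            // RuntimeException and Error are deliberately not caught here.
        }
    }

    public static void main(String[] args) throws Exception {
        // Dedicated executor, so a slow purge does not block other requests.
        ExecutorService purgeExecutor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> ok = purgeExecutor.submit(new PurgeTask(txid -> { }, 100L));
            Future<Boolean> failed = purgeExecutor.submit(new PurgeTask(txid -> {
                throw new IOException("simulated disk error");
            }, 200L));
            System.out.println(ok.get());     // prints true
            System.out.println(failed.get()); // prints false, after a WARNING is logged
        } finally {
            purgeExecutor.shutdown();
        }
    }
}
```

With this shape the purge method itself has nothing checked left to throw, which is also why the {{throws}} clause from point 1 becomes unnecessary.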
[~andrew.wang] / [~jingzhao], do you want to take a look here?
> If IPCLoggerChannel#purgeLogsOlderThan takes too long, Namenode could not
> send another RPC calls to Journalnodes
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-8771
> URL: https://issues.apache.org/jira/browse/HDFS-8771
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Takuya Fukudome
> Assignee: Kanaka Kumar Avvaru
> Attachments: HDFS-8771-01.patch, HDFS-8771-02.patch,
> HDFS-8771-03.patch
>
>
> In our cluster, edits accidentally became huge (about 50GB) and our
> Journalnodes' disks were busy, therefore {{purgeLogsOlderThan}} took more than
> 30 secs. If {{IPCLoggerChannel#purgeLogsOlderThan}} takes too much time,
> Namenode can't send other RPC calls to Journalnodes because
> {{o.a.h.hdfs.qjournal.client.IPCLoggerChannel}}'s executor is single-threaded.
> This causes the Namenode to shut down.
> I think IPCLoggerChannel#purgeLogsOlderThan should not block other RPC calls
> like sendEdits.