[jira] [Commented] (HDFS-8771) If IPCLoggerChannel#purgeLogsOlderThan takes too long, Namenode could not send another RPC calls to Journalnodes

Andrew Wang (JIRA) Mon, 05 Oct 2015 17:28:01 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944311#comment-14944311
 ]


Andrew Wang commented on HDFS-8771:
-----------------------------------

Seems okay to me, a few quick comments:

* Can we name the thread "JNStorageEditLogPurger"? We normally do something 
camel cased and based on the classname if possible.
* Let's avoid having sleeps in tests, instead let's add some test hooks for 
notification.
* I like the new log messages, mind also adding the duration and # files 
deleted to the "end" log message? Would be good to do this separately for the 
CURRENT and PAXOS dirs too.
* Since purges are cumulative, there's no point running a purge for txid 100 if 
there's a queued purge for txid 200. Let's do some coalescing to avoid 
unnecessary repeated directory scans.
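
To make the coalescing suggestion concrete, here is a rough sketch of what I have in mind. It is not taken from the patch: the class name {{CoalescingPurger}}, the {{purgeAction}} callback, and {{shutdownAndWait}} are hypothetical, and it assumes txids are non-negative. The idea is to remember only the highest requested txid, so a queued purge for txid 100 is absorbed by a later request for txid 200, and the caller never blocks on the disk scan.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

// Hypothetical helper for illustration only; not the class from the patch.
class CoalescingPurger {
  // Sentinel meaning "no purge is waiting"; assumes real txids are >= 0.
  private static final long NONE = -1L;

  // Highest txid requested but not yet picked up by the purge thread.
  private final AtomicLong pendingTxId = new AtomicLong(NONE);

  // The actual (possibly slow) purge work, supplied by the caller.
  private final LongConsumer purgeAction;

  // Single background thread, named after the class as suggested above.
  private final ExecutorService executor =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "JNStorageEditLogPurger");
        t.setDaemon(true);
        return t;
      });

  CoalescingPurger(LongConsumer purgeAction) {
    this.purgeAction = purgeAction;
  }

  /** Records the request and returns immediately. */
  void purgeLogsOlderThan(long minTxIdToKeep) {
    // Keep only the maximum; purges are cumulative, so lower txids are covered.
    long prev = pendingTxId.getAndAccumulate(minTxIdToKeep, Math::max);
    if (prev == NONE) {
      // Nothing was waiting, so schedule exactly one task for this batch.
      executor.execute(this::runPending);
    }
  }

  private void runPending() {
    // Take whatever the highest pending txid is now and clear the slot.
    long txId = pendingTxId.getAndSet(NONE);
    if (txId != NONE) {
      purgeAction.accept(txId);
    }
  }

  /** Stops accepting work and waits for already queued purges to finish. */
  boolean shutdownAndWait(long timeout, TimeUnit unit) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A test can drive this without sleeps by passing a {{purgeAction}} that counts down one latch when it starts and waits on another before returning. That lets the test queue further requests while the first purge is known to be in flight, then release it and assert on which txids were actually purged.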

> If IPCLoggerChannel#purgeLogsOlderThan takes too long, Namenode could not 
> send another RPC calls to Journalnodes
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8771
>                 URL: https://issues.apache.org/jira/browse/HDFS-8771
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Takuya Fukudome
>            Assignee: Kanaka Kumar Avvaru
>         Attachments: HDFS-8771-01.patch, HDFS-8771-02.patch, 
> HDFS-8771-03.patch
>
>
> In our cluster, the edits accidentally became huge (about 50GB) and our 
> Journalnodes' disks were busy, so {{purgeLogsOlderThan}} took more than 
> 30 seconds. If {{IPCLoggerChannel#purgeLogsOlderThan}} takes too long, the 
> Namenode cannot send other RPC calls to the Journalnodes because 
> {{o.a.h.hdfs.qjournal.client.IPCLoggerChannel}}'s executor is single-threaded. 
> This causes the Namenode to shut down.
> I think IPCLoggerChannel#purgeLogsOlderThan should not block other RPC calls 
> like sendEdits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
