[
https://issues.apache.org/jira/browse/HDFS-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302616#comment-14302616
]
Zhe Zhang commented on HDFS-7726:
---------------------------------
Good catch [~tianyin]. It makes sense to catch config errors at initiation time.
# I think we should check {{rpcTimeout}} is positive as well
# Shouldn't we use {{Preconditions.checkArgument}}?
Nits:
# We usually should keep each line below 80 chars
# Lines 7~8 of the patch are unnecessary
# It'd be nice if the patch is suffixed '.patch' :)
> Parse and check the configuration settings of edit log to prevent runtime
> errors
> --------------------------------------------------------------------------------
>
> Key: HDFS-7726
> URL: https://issues.apache.org/jira/browse/HDFS-7726
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.0
> Reporter: Tianyin Xu
> Priority: Minor
> Attachments: check_config_val_EditLogTailer.patch.1
>
>
> ============================
> Problem
> -------------------------------------------------
> Similar as the following two issues addressed in 2.7.0,
> https://issues.apache.org/jira/browse/YARN-2165
> https://issues.apache.org/jira/browse/YARN-2166
> The edit log related configuration settings should be checked in the
> constructor rather than being applied directly at runtime. This would cause
> runtime failures if the values are wrong.
> Take "dfs.ha.tail-edits.period" as an example, currently in
> EditLogTailer.java, its value is not checked but directly used in doWork(),
> as the following code snippets. Any negative values would cause
> IllegalArgumentException (which is not caught) and impair the component.
> {code:title=EditLogTailer.java|borderStyle=solid}
> private void doWork() {
> {
> .....
> Thread.sleep(sleepTimeMs);
> ....
> }
> {code}
> Another example is "dfs.ha.log-roll.rpc.timeout". Right now, we use getInt()
> to parse the value at runtime in the getActiveNodeProxy() function which is
> called by doWork(), shown as below. Any erroneous settings (e.g.,
> ill-formatted integer) would cause exceptions.
> {code:title=EditLogTailer.java|borderStyle=solid}
> private NamenodeProtocol getActiveNodeProxy() throws IOException {
> {
> .....
> int rpcTimeout = conf.getInt(
> DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
> DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
> ....
> }
> {code}
> ============================
> Solution (the attached patch)
> -------------------------------------------------
> Basically, the idea of the attached patch is to move the parsing and checking
> logics into the constructor to expose the error at initialization, so that
> the errors won't be latent at the runtime (same as YARN-2165 and YARN-2166)
> I'm not aware of the implementation of 2.7.0. It seems there's checking
> utilities such as the validatePositiveNonZero function in YARN-2165. If so,
> we can use that one to make the checking more systematic.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)