[ 
https://issues.apache.org/jira/browse/HDFS-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyin Xu updated HDFS-7726:
-----------------------------
    Attachment: check_config_val_EditLogTailer.patch.1

> Parse and check the configuration settings of edit log to prevent runtime 
> errors
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-7726
>                 URL: https://issues.apache.org/jira/browse/HDFS-7726
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Tianyin Xu
>            Priority: Minor
>         Attachments: check_config_val_EditLogTailer.patch.1
>
>
> ============================
> Problem
> -------------------------------------------------
> Similar to the following two issues, which were addressed in 2.7.0:
> https://issues.apache.org/jira/browse/YARN-2165
> https://issues.apache.org/jira/browse/YARN-2166
> The edit-log-related configuration settings should be checked in the 
> constructor rather than being applied directly at runtime. Applying them 
> unchecked causes runtime failures if the values are wrong.
> Take "dfs.ha.tail-edits.period" as an example. Currently, in 
> EditLogTailer.java, its value is not checked but used directly in doWork(), 
> as in the following code snippet. Any negative value makes Thread.sleep() 
> throw an IllegalArgumentException (which is not caught) and impairs the 
> component. 
> {code:title=EditLogTailer.java|borderStyle=solid}
> private void doWork() {
>     .....
>     Thread.sleep(sleepTimeMs);
>     ....
> }
> {code}
> Another example is "dfs.ha.log-roll.rpc.timeout". Right now, getInt() parses 
> the value at runtime in the getActiveNodeProxy() function, which is called 
> by doWork(), as shown below. Any erroneous setting (e.g., an ill-formatted 
> integer) causes a NumberFormatException there.
> {code:title=EditLogTailer.java|borderStyle=solid}
> private NamenodeProtocol getActiveNodeProxy() throws IOException {
>     .....
>     int rpcTimeout = conf.getInt(
>           DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
>           DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
>     ....
> }
> {code}
> ============================
> Solution (the attached patch)
> -------------------------------------------------
> The idea of the attached patch is to move the parsing and checking logic 
> into the constructor so that errors are exposed at initialization rather 
> than remaining latent until runtime (same as YARN-2165 and YARN-2166).
> I'm not familiar with the 2.7.0 implementation. There seem to be checking 
> utilities such as the validatePositiveNonZero function in YARN-2165. If so, 
> we can use that to make the checking more systematic.
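
To illustrate the approach described above, here is a minimal, self-contained sketch of constructor-time validation. It is not the attached patch: the class EditLogTailerConfigSketch, the helper parseNonNegativeInt, the plain Map standing in for Hadoop's Configuration, and the default values are all hypothetical, chosen only so that the example compiles without Hadoop on the classpath. It also assumes that zero is an acceptable value for both settings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the tailer's settings; not the actual patch. */
class EditLogTailerConfigSketch {
    static final String TAIL_EDITS_PERIOD_KEY = "dfs.ha.tail-edits.period";
    static final String LOGROLL_RPC_TIMEOUT_KEY = "dfs.ha.log-roll.rpc.timeout";

    // Illustrative defaults (assumptions for this sketch).
    static final int TAIL_EDITS_PERIOD_DEFAULT_SEC = 60;
    static final int LOGROLL_RPC_TIMEOUT_DEFAULT_MS = 20000;

    final long sleepTimeMs;
    final int rpcTimeoutMs;

    /** Parses and checks both settings once, so bad values fail here. */
    EditLogTailerConfigSketch(Map<String, String> conf) {
        int periodSec = parseNonNegativeInt(
            conf, TAIL_EDITS_PERIOD_KEY, TAIL_EDITS_PERIOD_DEFAULT_SEC);
        this.sleepTimeMs = periodSec * 1000L;
        this.rpcTimeoutMs = parseNonNegativeInt(
            conf, LOGROLL_RPC_TIMEOUT_KEY, LOGROLL_RPC_TIMEOUT_DEFAULT_MS);
    }

    /** Returns the default if the key is unset; rejects non-integers and negatives. */
    static int parseNonNegativeInt(Map<String, String> conf, String key, int defaultValue) {
        String raw = conf.get(key);
        if (raw == null) {
            return defaultValue;
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Invalid value for " + key + ": '" + raw + "' is not an integer", e);
        }
        if (value < 0) {
            throw new IllegalArgumentException(
                "Invalid value for " + key + ": " + value + " must not be negative");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TAIL_EDITS_PERIOD_KEY, "-5");
        try {
            new EditLogTailerConfigSketch(conf);
        } catch (IllegalArgumentException e) {
            // The misconfiguration surfaces at construction, not inside doWork().
            System.out.println("Rejected at initialization: " + e.getMessage());
        }
    }
}
```

With this shape, doWork() and getActiveNodeProxy() would only read the already-validated fields, so a negative period or an ill-formatted timeout is reported when the component is created instead of when the tailer thread first runs.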



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
