Tianyin Xu created HDFS-7726:
--------------------------------

             Summary: Parse and check the configuration settings of edit log to 
prevent runtime errors
                 Key: HDFS-7726
                 URL: https://issues.apache.org/jira/browse/HDFS-7726
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.6.0
            Reporter: Tianyin Xu
            Priority: Minor


============================
Problem
-------------------------------------------------

Similar as the following two issues addressed in 2.7.0,
https://issues.apache.org/jira/browse/YARN-2165
https://issues.apache.org/jira/browse/YARN-2166

The edit log related configuration settings should be checked in the 
constructor rather than being applied directly at runtime. This would cause 
runtime failures if the values are wrong.

Take "dfs.ha.tail-edits.period" as an example, currently in EditLogTailer.java, 
its value is not checked but directly used in doWork(), as the following code 
snippets. Any negative values would cause IllegalArgumentException (which is 
not caught) and impair the component. 

{code:title=EditLogTailer.java|borderStyle=solid}
private void doWork() {
{
    .....
    Thread.sleep(sleepTimeMs);
    ....
}
{code}

Another example is "dfs.ha.log-roll.rpc.timeout". Right now, we use getInt() to 
parse the value at runtime in the getActiveNodeProxy() function which is called 
by doWork(), shown as below. Any erroneous settings (e.g., ill-formatted 
integer) would cause exceptions.

{code:title=EditLogTailer.java|borderStyle=solid}
private NamenodeProtocol getActiveNodeProxy() throws IOException {
{
    .....
    int rpcTimeout = conf.getInt(
          DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
          DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
    ....
}
{code}

============================
Solution (the attached patch)
-------------------------------------------------

Basically, the idea of the attached patch is to move the parsing and checking 
logics into the constructor to expose the error at initialization, so that the 
errors won't be latent at the runtime (same as YARN-2165 and YARN-2166)

I'm not aware of the implementation of 2.7.0. It seems there's checking 
utilities such as the validatePositiveNonZero function in YARN-2165. If so, we 
can use that one to make the checking more systematic.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to