Tianyin Xu created HDFS-7726:
--------------------------------
Summary: Parse and check the configuration settings of edit log to
prevent runtime errors
Key: HDFS-7726
URL: https://issues.apache.org/jira/browse/HDFS-7726
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.6.0
Reporter: Tianyin Xu
Priority: Minor
============================
Problem
-------------------------------------------------
Similar to the following two issues addressed in 2.7.0,
https://issues.apache.org/jira/browse/YARN-2165
https://issues.apache.org/jira/browse/YARN-2166
The edit log related configuration settings should be checked in the
constructor rather than being applied directly at runtime. Without such checks,
wrong values cause runtime failures.
Take "dfs.ha.tail-edits.period" as an example. Currently in EditLogTailer.java,
its value is not checked but used directly in doWork(), as in the following code
snippet. Any negative value would cause an IllegalArgumentException (which is
not caught) and impair the component.
{code:title=EditLogTailer.java|borderStyle=solid}
private void doWork() {
.....
Thread.sleep(sleepTimeMs);
....
}
{code}
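To make the failure mode concrete, here is a minimal standalone sketch (not taken from EditLogTailer.java; the class and method names are made up for illustration) showing that Thread.sleep() rejects a negative duration with IllegalArgumentException, which is what an unchecked negative period would run into:

```java
// Hypothetical demo class, not part of HDFS.
class NegativeSleepDemo {

    /**
     * Returns true if Thread.sleep() rejects the given duration with
     * IllegalArgumentException, false if the sleep completes or is interrupted.
     */
    static boolean sleepRejects(long sleepTimeMs) {
        try {
            Thread.sleep(sleepTimeMs);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown by Thread.sleep() when the duration is negative.
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag; the sleep itself was accepted.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("negative rejected: " + sleepRejects(-1000L));
        System.out.println("zero rejected: " + sleepRejects(0L));
    }
}
```

In the real tailer thread there is no such catch block around the sleep, so the exception propagates out of doWork().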
Another example is "dfs.ha.log-roll.rpc.timeout". Right now, we use getInt() to
parse the value at runtime in the getActiveNodeProxy() function, which is called
by doWork(), as shown below. Any erroneous setting (e.g., an ill-formatted
integer) would cause an exception.
{code:title=EditLogTailer.java|borderStyle=solid}
private NamenodeProtocol getActiveNodeProxy() throws IOException {
.....
int rpcTimeout = conf.getInt(
DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
....
}
{code}
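The latency of this kind of error can be illustrated with a small standalone sketch. It uses java.util.Properties as a stand-in for the Hadoop configuration object and Integer.parseInt() as a stand-in for getInt(); both substitutions, as well as the class and method names, are assumptions made only for illustration. The point is that construction succeeds and the bad value only surfaces on first use:

```java
import java.util.Properties;

// Hypothetical demo class; Properties stands in for the Hadoop configuration.
class LazyTimeoutDemo {
    // Key name taken from the issue text.
    static final String ROLL_TIMEOUT_KEY = "dfs.ha.log-roll.rpc.timeout";

    private final Properties conf;

    LazyTimeoutDemo(Properties conf) {
        // Nothing is parsed here, so a malformed value goes unnoticed.
        this.conf = conf;
    }

    /** Parses the timeout on demand; throws NumberFormatException if malformed. */
    int getRpcTimeout() {
        // 20000 is only a placeholder default for this sketch.
        String raw = conf.getProperty(ROLL_TIMEOUT_KEY, "20000");
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ROLL_TIMEOUT_KEY, "20s"); // ill-formatted integer
        LazyTimeoutDemo demo = new LazyTimeoutDemo(conf); // succeeds
        try {
            demo.getRpcTimeout();
        } catch (NumberFormatException e) {
            System.out.println("failed only at first use: " + e.getMessage());
        }
    }
}
```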
============================
Solution (the attached patch)
-------------------------------------------------
Basically, the idea of the attached patch is to move the parsing and checking
logic into the constructor to expose errors at initialization, so that they
won't stay latent until runtime (same as YARN-2165 and YARN-2166).
I'm not familiar with the implementation of 2.7.0. It seems there are checking
utilities such as the validatePositiveNonZero function in YARN-2165. If so, we
can use them to make the checking more systematic.