[
https://issues.apache.org/jira/browse/HADOOP-19838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhaobo Huang updated HADOOP-19838:
----------------------------------
Description:
#### Problem
Currently, the Hadoop shell startup scripts pass **"HADOOP_CLIENT_OPTS"** to
the JVM only as arguments placed before the main class. As a result, users
cannot transparently set generic client-side configuration (such as
fs.defaultFS, dfs.client.socket-timeout, dfs.replication) through environment
variables.
The examples in the Unix Shell Guide
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/UnixShellGuide.html#HADOOP_CLIENT_OPTS)
do not take effect in practice.
These generic options must be parsed by the GenericOptionsParser class, which
only processes arguments passed after the main class. Users therefore have to
add -D/-conf parameters manually to every command, which is inconvenient and
breaks the expected transparent configuration experience. For example:
```shell
hadoop fs -Dfs.defaultFS=hdfs://127.0.0.1:8020/ -ls -d /
```
#### Solution
Support parsing configuration items from **"HADOOP_CLIENT_OPTS"** in the
Configuration class. The keys are exactly the same as the standard Hadoop
configuration keys.
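To illustrate the idea, here is a minimal sketch of how -Dkey=value tokens
could be extracted from the value of HADOOP_CLIENT_OPTS. It is not the code
from the patch: the class name ClientOptsSketch and the helper parseDefines
are hypothetical, and the naive whitespace split is an assumption that does
not handle quoted values containing spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClientOptsSketch {

  /**
   * Hypothetical helper (not part of Hadoop): extracts -Dkey=value pairs
   * from an options string such as the value of HADOOP_CLIENT_OPTS.
   */
  public static Map<String, String> parseDefines(String opts) {
    Map<String, String> result = new LinkedHashMap<>();
    if (opts == null || opts.trim().isEmpty()) {
      return result;
    }
    // Assumption: tokens are separated by whitespace and are not quoted.
    for (String token : opts.trim().split("\\s+")) {
      if (!token.startsWith("-D")) {
        continue; // skip other JVM flags such as -Xmx512m
      }
      int eq = token.indexOf('=');
      if (eq > 2) { // require a non-empty key between "-D" and "="
        result.put(token.substring(2, eq), token.substring(eq + 1));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Print whatever configuration pairs the environment variable carries.
    parseDefines(System.getenv("HADOOP_CLIENT_OPTS"))
        .forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

In this sketch the extracted keys are used unchanged, matching the statement
that they are the same as ordinary Hadoop configuration keys.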
This patch also adds support for parsing Java system properties that carry a
specific prefix: **"hadoop.property."**
Only system properties with this prefix are scanned, and the substring after
the prefix is used as the actual configuration key. For example,
hadoop.property.dfs.replication maps to dfs.replication.
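The prefix rule can be sketched as follows. This is an illustration under
stated assumptions rather than the patch itself: the class
PrefixedPropsSketch and its scan method are hypothetical names, and only the
"hadoop.property." prefix comes from the description above.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class PrefixedPropsSketch {

  // Prefix named in the issue description.
  static final String PREFIX = "hadoop.property.";

  /**
   * Hypothetical helper (not part of Hadoop): returns the properties whose
   * names start with PREFIX, keyed by the substring after the prefix.
   */
  public static Map<String, String> scan(Properties props) {
    Map<String, String> result = new TreeMap<>();
    for (String name : props.stringPropertyNames()) {
      // Ignore unrelated properties and a bare prefix with no key after it.
      if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
        result.put(name.substring(PREFIX.length()), props.getProperty(name));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Scan the JVM's own system properties and print the derived keys.
    scan(System.getProperties())
        .forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

Once compiled, running the sketch with
java -Dhadoop.property.dfs.replication=2 PrefixedPropsSketch should list
dfs.replication with the value 2, while system properties without the prefix
are ignored.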
#### Benefits
In AI training scenarios, client parameters often need to be tuned to improve
training efficiency. With this feature, users can tune them without modifying
any related code.
> Support parsing environment variables and system properties in the
> Configuration class.
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-19838
> URL: https://issues.apache.org/jira/browse/HADOOP-19838
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Zhaobo Huang
> Assignee: Zhaobo Huang
> Priority: Major
> Labels: pull-request-available