[
https://issues.apache.org/jira/browse/HADOOP-19838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhaobo Huang updated HADOOP-19838:
----------------------------------
Description:
Problem
Currently, the Hadoop shell startup scripts pass *"HADOOP_CLIENT_OPTS"* only as
JVM arguments, placed before the main class. As a result, users cannot
transparently configure generic client-side settings (such as fs.defaultFS,
dfs.client.socket-timeout, dfs.replication) through environment variables.
The examples in the documentation
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/UnixShellGuide.html#HADOOP_CLIENT_OPTS)
do not take effect in practice.
These generic options must be parsed by the GenericOptionsParser class, which
only processes arguments passed after the main class. Users therefore have to
add -D/-conf parameters manually to every command, which is inconvenient and
breaks the expected transparent configuration experience. For example:
{code:bash}
hadoop fs -Dfs.defaultFS=hdfs://127.0.0.1:8020/ -ls -d /
{code}
Solution
Support parsing configuration items from *"HADOOP_CLIENT_OPTS"* in the
Configuration class. The keys used here are exactly the same as those in Hadoop
configuration.
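The description does not say how the option string is tokenized, so the following is only a minimal sketch of one possible approach, not the patch itself. It assumes that HADOOP_CLIENT_OPTS is split on whitespace and that every -Dkey=value token is treated as a configuration override. The class name ClientOptsSketch and the method parseDefines are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClientOptsSketch {
    // Hypothetical helper: extracts -Dkey=value tokens from an options string.
    // Other JVM flags (for example -Xmx512m) are ignored.
    public static Map<String, String> parseDefines(String opts) {
        Map<String, String> result = new LinkedHashMap<>();
        if (opts == null || opts.trim().isEmpty()) {
            return result;
        }
        for (String token : opts.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            // Require the "-D" marker, a non-empty key, and an '=' separator.
            if (token.startsWith("-D") && eq > 2) {
                result.put(token.substring(2, eq), token.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Reads the real environment variable; it may be unset.
        String opts = System.getenv("HADOOP_CLIENT_OPTS");
        System.out.println(parseDefines(opts));
    }
}
```

Splitting on whitespace keeps the sketch short but does not handle quoted values that contain spaces; a real implementation would need to decide how to treat them.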
This patch also adds support for parsing Java system properties that carry a
specific prefix: *"hadoop.property."*.
Only system properties with this prefix are scanned, and the substring after
the prefix is used as the actual configuration key. For example,
hadoop.property.fs.defaultFS maps to fs.defaultFS.
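The prefix rule can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in the patch: the class name PrefixedPropertyScan and the method extractOverrides are hypothetical, and the sketch returns a plain map instead of writing into a Hadoop Configuration object.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class PrefixedPropertyScan {
    // Prefix named in the issue description.
    static final String PREFIX = "hadoop.property.";

    // Hypothetical helper: keeps only properties whose name starts with PREFIX
    // and maps the remainder of the name to the property value.
    public static Map<String, String> extractOverrides(Properties props) {
        Map<String, String> overrides = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            // Skip unrelated properties and a bare prefix with no key after it.
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                overrides.put(name.substring(PREFIX.length()), props.getProperty(name));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        // Scans the JVM's own system properties, e.g. when started with
        // -Dhadoop.property.fs.defaultFS=hdfs://127.0.0.1:8020/
        System.out.println(extractOverrides(System.getProperties()));
    }
}
```

Restricting the scan to a dedicated prefix keeps unrelated JVM properties such as java.io.tmpdir out of the resulting configuration.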
Benefits
In AI training workloads, client parameters often need tuning to improve
training efficiency. With this feature, users can tune them without modifying
any application code.
> Support parsing environment variables and system properties in the
> Configuration class.
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-19838
> URL: https://issues.apache.org/jira/browse/HADOOP-19838
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Zhaobo Huang
> Assignee: Zhaobo Huang
> Priority: Major
> Labels: pull-request-available