[ 
https://issues.apache.org/jira/browse/HADOOP-19838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaobo Huang updated HADOOP-19838:
----------------------------------
    Description: 
Problem

Currently, Hadoop shell startup scripts only append *"HADOOP_CLIENT_OPTS"* as 
JVM arguments before the main class. This prevents users from transparently 
configuring client-side generic config (such as fs.defaultFS, 
dfs.client.socket-timeout, dfs.replication) via environment variables.

The examples in the Unix Shell Guide 
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/UnixShellGuide.html#HADOOP_CLIENT_OPTS)
 do not take effect in practice.

These generic options must be parsed by the GenericOptionsParser class, which 
only processes arguments passed after the main class. As a result, users have 
to add -D/-conf parameters manually to every command, which is inconvenient 
and breaks the expected transparent configuration experience. For example:

{code:java}
hadoop fs -Dfs.defaultFS=hdfs://127.0.0.1:8020/ -ls -d /
{code}
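
The distinction between the two argument positions can be seen with a plain JVM 
launch, independent of Hadoop. The sketch below is illustrative only: the class 
name LaunchArgsDemo and the describe helper are hypothetical and not part of 
Hadoop. A -D flag placed before the main class becomes a Java system property, 
while anything after the main class arrives in the args array, which is where 
the description above says GenericOptionsParser looks.

```java
import java.util.Arrays;

public class LaunchArgsDemo {
    /**
     * Reports where a value ended up: in the JVM system properties
     * (flags placed before the main class) or in the program arguments
     * (everything placed after the main class).
     */
    public static String describe(String key, String[] args) {
        return "system property " + key + "=" + System.getProperty(key)
            + "; program args=" + Arrays.toString(args);
    }

    public static void main(String[] args) {
        // fs.defaultFS is used only as an example key, taken from the issue text.
        System.out.println(describe("fs.defaultFS", args));
    }
}
```

Launching it as "java -Dfs.defaultFS=hdfs://127.0.0.1:8020/ LaunchArgsDemo -ls -d /" 
reports the value as a system property and leaves only "-ls -d /" in the 
program arguments, which mirrors where the contents of HADOOP_CLIENT_OPTS land.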

Solution
Support parsing configuration items from *"HADOOP_CLIENT_OPTS"* in the 
Configuration class. The keys used here are exactly the same as those in Hadoop 
configuration.
This patch also adds support for parsing Java system properties with a specific 
prefix: *"hadoop.property."*.
Only system properties prefixed with *hadoop.property.* are scanned. The 
substring after the prefix is used as the actual configuration key.
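
A minimal sketch of the prefix scan described above, not the code from the 
patch. The class name PrefixedPropertyScanner and the method extractOverrides 
are hypothetical; only the hadoop.property. prefix comes from this description. 
It collects matching system properties and strips the prefix to obtain the 
configuration key.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class PrefixedPropertyScanner {
    // Prefix named in the issue description.
    static final String PREFIX = "hadoop.property.";

    /**
     * Returns the properties whose names start with PREFIX, keyed by the
     * substring after the prefix. Names equal to the bare prefix are skipped
     * because they would produce an empty key (an assumption of this sketch).
     */
    public static Map<String, String> extractOverrides(Properties props) {
        Map<String, String> overrides = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                overrides.put(name.substring(PREFIX.length()), props.getProperty(name));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        // Print every override found in the current JVM's system properties.
        extractOverrides(System.getProperties())
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Under these assumptions, a launch such as 
"java -Dhadoop.property.fs.defaultFS=hdfs://127.0.0.1:8020/ PrefixedPropertyScanner" 
would yield the key fs.defaultFS with that value. How the actual patch merges 
such values into Configuration, and their precedence relative to XML resources, 
is not stated in this description.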

Benefits
In AI training workloads, client parameters often need tuning to improve 
training efficiency. With this feature, users can apply those settings without 
modifying any application code.

  was:
Problem

Currently, Hadoop shell startup scripts only append *"HADOOP_CLIENT_OPTS"* as 
JVM arguments before the main class. This prevents users from transparently 
configuring client-side generic config (such as fs.defaultFS, 
dfs.client.socket-timeout, dfs.replication) via environment variables.

The examples provided in the doc at this link 
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/UnixShellGuide.html#HADOOP_CLIENT_OPTS)
 fail to take effect in actual practice.

These generic options must be parsed by GenericOptionsParser class, which only 
processes arguments passed after the main class. As a result, users have to 
manually add -D/-conf parameters to every single command execution, which is 
inconvenient and breaks the expected transparent configuration experience, for 
example:

{code:java}
hadoop fs -Dfs.defaultFS=hdfs://127.0.0.1:8020/ -ls -d /
{code}

Solution
Support parsing configuration items from *"HADOOP_CLIENT_OPTS"* in the 
Configuration class. The keys used here are exactly the same as those in Hadoop 
configuration.
This patch also adds support for parsing Java system properties with a specific 
prefix: *"hadoop.property."*.
Only system properties prefixed with hadoop.property. are scanned. The 
substring after the prefix is used as the actual configuration key.

Benefits
For AI algorithm training scenarios, tuning parameters is required to address 
training efficiency. With this feature enabled, users no longer need to modify 
any related code.


> Support parsing environment variables and system properties in the 
> Configuration class.
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19838
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19838
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Zhaobo Huang
>            Assignee: Zhaobo Huang
>            Priority: Major
>              Labels: pull-request-available
>



