[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-------------------------------------

    Release Note: 
The Hadoop shell scripts have been rewritten to fix many long-standing bugs and 
include some new features.  While an eye has been kept towards compatibility, 
some changes may break existing installations.

INCOMPATIBLE CHANGES:

* The pid files for secure daemons have been renamed to include the appropriate 
$HADOOP_IDENT_STR.  This should allow, with proper configurations in place, for 
multiple versions of the same secure daemon to run on a host.
* All YARN_* and MAPRED_* environment variables act as overrides to their 
equivalent HADOOP_* environment variables when 'yarn', 'mapred' and related 
commands are executed. Previously, these were separate, which meant duplication 
of common settings.  
* All Hadoop shell script subsystems execute hadoop-env.sh, which allows for 
all of the environment variables to be in one location.  This was not the case 
previously.
* hdfs-config.sh and hdfs-config.cmd were inadvertently duplicated in two 
different locations.  The sbin version has been removed.
* The log4j settings forcibly set by some *-daemon.sh commands have been 
removed.  These are now configurable in the *-env.sh files.  Users who do not 
have these set will see logs going in odd places.
* Support for various undocumented YARN log4j.properties files has been 
removed.
* Support for $HADOOP_MASTER and the rsync code have been removed.
* yarn.id.str has been removed.
* We now require bash v3 or better.
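
The override behaviour described above for YARN_* and MAPRED_* variables can be pictured with a small sketch. This is only an illustration under stated assumptions: override_generic_var is a hypothetical helper, not a function taken from the rewritten scripts, and the directory values are made up.

```shell
#!/usr/bin/env bash
# Sketch only: override_generic_var is a hypothetical helper, not a function
# from the Hadoop shell scripts.

# If the subsystem-specific variable (e.g. YARN_LOG_DIR) is set and non-empty,
# copy its value over the generic HADOOP_* equivalent (e.g. HADOOP_LOG_DIR).
override_generic_var() {
  local prefix=$1                     # subsystem prefix, e.g. YARN or MAPRED
  local suffix=$2                     # shared suffix, e.g. LOG_DIR
  local specific="${prefix}_${suffix}"
  local generic="HADOOP_${suffix}"

  # ${!name} is bash indirect expansion: the value of the variable whose
  # name is stored in $name.
  if [[ -n "${!specific:-}" ]]; then
    export "${generic}=${!specific}"
  fi
}

# Illustrative values only.
HADOOP_LOG_DIR=/var/log/hadoop
YARN_LOG_DIR=/var/log/yarn
override_generic_var YARN LOG_DIR
echo "${HADOOP_LOG_DIR}"   # prints /var/log/yarn
```

When the subsystem-specific variable is unset or empty, the generic value is left untouched, so common settings only need to be written once.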


BUG FIXES:

* HADOOP_CONF_DIR is now properly honored.
* Documented hadoop-layout.sh.
* Shell commands should now work properly when called as a relative path.
* Operations which trigger ssh will now limit how many connections run in 
parallel to 10 to prevent memory and network exhaustion.
* HADOOP_CLIENT_OPTS support has been added to a few more commands.
* Various options on hadoop command lines were supported inconsistently.  These 
have been unified into hadoop-config.sh. --config still needs to come first, 
however.
* ulimit logging for secure daemons no longer assumes /bin/bash but does assume 
bash on the command line path.
* Removed references to some Yahoo!-specific paths.
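
As a rough illustration of capping parallel connections, the sketch below uses xargs with its -P option (available in GNU and BSD xargs, though not required by POSIX). It is an assumption about one way to get such a limit, not a description of how the rewritten scripts implement it; run_on_hosts and hosts.txt are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: run_on_hosts is a hypothetical helper, not part of Hadoop.

# Read host names from stdin (one per line) and run the given command once
# per host, with at most $1 copies running at the same time.  Every "{}" in
# the command is replaced by the host name.
run_on_hosts() {
  local max_parallel=$1
  shift
  xargs -P "${max_parallel}" -I '{}' "$@"
}

# Demo with echo instead of ssh so it runs anywhere.  Output order may vary
# because the commands run concurrently.
printf '%s\n' host1 host2 host3 | run_on_hosts 10 echo "would connect to {}"

# A real invocation might look like this (hosts.txt is a hypothetical file):
#   run_on_hosts 10 ssh '{}' uptime < hosts.txt
```

Bounding the number of simultaneous ssh processes keeps memory and network use on the launching host predictable, whatever the size of the host list.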


IMPROVEMENTS:

* Significant amounts of redundant code have been moved into a new file called 
hadoop-functions.sh.
* Improved information in *-env.sh on what can be set, ramifications of 
setting, etc.
* There is an attempt to do some trivial deduplication of the classpath and JVM 
options.  This allows, amongst other things, for custom settings in *_OPTS for 
Hadoop daemons to override defaults and other generic settings (i.e., 
$HADOOP_OPTS).  This is particularly relevant for Xmx settings, as one can now 
set them in _OPTS and ignore the heap-specific options for daemons which force 
the size in megabytes.
* Operations which trigger ssh connections can now use pdsh if installed.  
$HADOOP_SSH_OPTS still gets applied. 
* Subcommands have been alphabetized in both usage and in the code.
* Most, if not all, of the functionality provided by the sbin/* commands has been moved 
to either their bin/ equivalents or made into functions.  The rewritten 
versions of these commands are now wrappers to maintain backward compatibility. 
Of particular note is the new --daemon option present in some bin/ commands, 
which allows certain subcommands to be daemonized.
* It is now possible to override some of the shell code capabilities to provide 
site-specific functionality.
* A new option called --buildpaths will attempt to add developer build 
directories to the classpath to allow for in-source-tree testing.
* If a usage function is defined, -h, -help, and --help will all trigger a help 
message.
* Several generic environment variables have been added to provide a common 
configuration for pids, logs, and their security equivalents.  The older 
versions still act as overrides to these generic versions.
* Groundwork has been laid to allow for custom secure daemon setup using 
something other than jsvc.
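
The classpath deduplication mentioned above can be sketched as follows. This is a minimal illustration assuming a colon-separated classpath where the first occurrence of an entry wins; dedupe_classpath is a hypothetical name and this is not the code in hadoop-functions.sh.

```shell
#!/usr/bin/env bash
# Sketch only: dedupe_classpath is a hypothetical helper, not the function
# used by the Hadoop shell scripts.

# Print a colon-separated path list with later duplicates removed, keeping
# the first occurrence of each entry and the original order.
dedupe_classpath() {
  local input=$1
  local result=""
  local entry
  local -a entries

  # read -a splits on ':' without glob-expanding entries such as /lib/*.
  IFS=: read -r -a entries <<< "${input}"

  for entry in "${entries[@]}"; do
    # Skip empty entries produced by "::" or a leading/trailing ':'.
    [[ -z "${entry}" ]] && continue
    case ":${result}:" in
      *":${entry}:"*) ;;                                  # already present
      *) result="${result:+${result}:}${entry}" ;;        # append new entry
    esac
  done

  printf '%s\n' "${result}"
}

dedupe_classpath '/a:/b:/a:/c/*:/b'   # prints /a:/b:/c/*
```

Entries are compared as literal strings, so a wildcard entry such as /c/* is treated as a single item rather than expanded.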

  was:
The Hadoop shell scripts have been rewritten to fix many long standing bugs and 
include some new features.  While an eye has been kept towards compatibility, 
some changes may break existing installations.

INCOMPATIBLE CHANGES:

* The pid files for secure daemons have been renamed to include the appropriate 
$HADOOP_IDENT_STR.  This should allow, with proper configurations in place, for 
multiple versions of the same secure daemon to run on a host.
* All YARN_* and MAPRED_* environment variables act as overrides to their 
equivalent HADOOP_* environment variables when 'yarn', 'mapred' and related 
commands are executed. Previously, these were separated which meant duplication 
of common settings.  
* All Hadoop shell script subsystems execute hadoop-env.sh, which allows for 
all of the environment variables to be in one location.  This was not the case 
previously.
* hdfs-config.sh and hdfs-config.cmd were inadvertently duplicated in two 
different locations.  The sbin version has been removed.
* The log4j settings forcibly set by some *-daemon.sh commands has been 
removed.  This is now configurable in the *-env.sh files.  Users who do not 
have these set will see logs going in odd places.
* Support for various undocumentented YARN log4j.properties files has been 
removed.
* Support for $HADOOP_MASTER and the rsync code has been removed.
* yarn.id.str has been removed.
* We now require bash v3 or better.


BUG FIXES:

* HADOOP_CONF_DIR is now properly honored.
* Documented hadoop-layout.sh
* Shell commands should now work properly when called as a relative path.
* Operations which trigger ssh will now limit how many connections run in 
parallel to 10 to prevent memory and network exhaustion.
* HADOOP_CLIENT_OPTS support has been added to a few more commands.
* Various options on hadoop comamnd lines were supported inconsistently.  These 
have been unified into hadoop-config.sh. --config still needs to come first, 
however.
* ulimit logging for secure daemons no longer assumes /bin/bash but does assume 
bash on the command line path.
* Removed references to some Yahoo! specific paths.


IMPROVEMENTS:

* Significant amounts of redundant code has been moved into a new file called 
hadoop-functions.sh.
* Improved information in *-env.sh on what can be set, ramifications of 
setting, etc.
* There is an attempt to do some trivial deduplication of the classpath and JVM 
options.  This allows, amongst other things, for custom settings in *_OPTS for 
Hadoop daemons to override defaults and other generic settings (i.e., 
$HADOOP_OPTS).  This is particularly relevant for Xmx settings, as one can now 
set them in _OPTS and ignore the heap specific options for daemons which force 
the size in megabytes.
* Operations which trigger ssh connections can now use pdsh if installed.  
$HADOOP_SSH_OPTS still gets applied. 
* Subcommands have been alphabetized in both usage and in the code.
* All/most of the functionality provided by the sbin/* commands has been moved 
to either their bin/ equivalents or made into functions.  The rewritten 
versions of these commands are now wrappers to maintain backward compatibility. 
Of particular note is the new --daemon option present in some bin/ commands 
which allow certain subcommands to be daemonized.
* It is now possible to override some of the shell code capabilities to provide 
site specific functionality.
* A new option called --buildpaths will attempt to add developer build 
directories to the classpath to allow for in source tree testing.
* If a usage function is defined, -h, -help, and --help will all trigger a help 
message.
* Several generic environment variables have been added to provide a common 
configuration for pids, logs, and their security equivalents.  The older 
versions still act as overrides to these generic versions.
* Groundwork has been laid to allow for custom secure daemon setup using 
something other than jsvc.


> Shell script rewrite
> --------------------
>
>                 Key: HADOOP-9902
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9902
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>            Assignee: Allen Wittenauer
>              Labels: releasenotes
>         Attachments: HADOOP-9902-2.patch, HADOOP-9902.patch, HADOOP-9902.txt, 
> hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
