[
https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217705#comment-16217705
]
Allen Wittenauer edited comment on HADOOP-14976 at 10/24/17 9:10 PM:
---------------------------------------------------------------------
bq. since the calling script always knows what is necessary?
I'd need to be convinced this is true. A lot of the work done in the shell
script rewrite and follow-on work was to make the "front end" scripts as dumb
as possible in order to centralize the program logic. This gave huge benefits
in the form of script consistency, testability, and more.
Besides, EXECNAME is used for *very* specific things. For example:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20
are great examples where the execname is exactly what needs to be reported.
... and that's even before considering third-party add-ons that depend on
HADOOP_SHELL_EXECNAME behaving as it does today.
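A minimal sketch of the kind of guard the distcp shellprofile linked above uses (hadoop_add_subcommand is stubbed here; in a real shellprofile it is provided by hadoop-functions.sh):

{code}
#!/usr/bin/env bash
# Illustrative sketch only: hadoop_add_subcommand is a stub standing in for
# the real function that hadoop-functions.sh supplies to shellprofiles.
hadoop_add_subcommand() {
  echo "registered subcommand: $1 (type: $2)"
}

# The front-end script derives this from its own file name.
HADOOP_SHELL_EXECNAME="hadoop"

# Mirrors the guard in hadoop-distcp.sh: only register the subcommand when
# the user came in through the "hadoop" front end, not some other entry point.
if [[ "${HADOOP_SHELL_EXECNAME}" = "hadoop" ]]; then
  hadoop_add_subcommand "distcp" client
fi
{code}

Rename the front-end script and the guard silently stops matching, so the add-on never registers.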
If distributions really are renaming the scripts (which is extremely
problematic for lots of reasons), there isn't much of a reason they couldn't
just tuck them away in a non-PATH directory and use the same names or even just
rewrite the scripts directly. (See above about removing as much logic as
possible.)
I've had in my head a "vendor" version of hadoop-user-functions.sh, but I'm not
sure if even that would help here. It really depends upon why the bin scripts
are getting renamed, and whether the problem being solved is actually more
appropriate for hadoop-layout.sh, etc.
I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME
though.
> Allow overriding HADOOP_SHELL_EXECNAME
> --------------------------------------
>
> Key: HADOOP-14976
> URL: https://issues.apache.org/jira/browse/HADOOP-14976
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Arpit Agarwal
>
> Some Hadoop shell scripts infer their own name using this bit of shell magic:
> {code}
> 18 MYNAME="${BASH_SOURCE-$0}"
> 19 HADOOP_SHELL_EXECNAME="${MYNAME##*/}"
> {code}
> e.g. see the
> [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18]
> script.
> The inferred shell script name is later passed to _hadoop-functions.sh_ which
> uses it to construct the names of some environment variables. E.g. when
> invoking _hdfs datanode_, the options variable name is inferred as follows:
> {code}
> # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS
> {code}
> This works well if the calling script name is the standard {{hdfs}} or {{yarn}}.
> If a distribution renames the script to something like {{foo.bar}}, then the
> options variable name will be inferred as {{FOO.BAR_DATANODE_OPTS}}, which is
> not a valid bash variable name.
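To make the failure mode concrete, here is a self-contained sketch (a simplified stand-in for the name construction hadoop-functions.sh performs, not the actual implementation):

{code}
#!/usr/bin/env bash
# Simplified stand-in for the variable-name construction described above:
# uppercase the execname and subcommand, join with "_", append "_OPTS".
build_opts_var() {
  local execname="$1" subcommand="$2"
  printf '%s_%s_OPTS\n' "${execname^^}" "${subcommand^^}"
}

build_opts_var hdfs datanode     # HDFS_DATANODE_OPTS - a valid variable name
build_opts_var foo.bar datanode  # FOO.BAR_DATANODE_OPTS - "." is illegal in a name
{code}

Any attempt to {{export FOO.BAR_DATANODE_OPTS=...}} fails, since bash identifiers may contain only letters, digits, and underscores.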
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)