[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555404#comment-16555404 ]

Dawid Wysakowicz commented on FLINK-6222:
-----------------------------------------

Fixed in master via:
81ff3c45004cd6e3cc5b0f75dc567f5f57d1f1ed
4c4e3d66c9635e523c8ab1539f635ef1dbb24e61

Fixed in 1.6 via:
34641a8987b937a8bfeb7a5e1c0ffa918f9f88d7
2194f473f376e50e2ac354357475f984f120404f

> YARN: setting environment variables in an easier fashion
>
> Key: FLINK-6222
> URL: https://issues.apache.org/jira/browse/FLINK-6222
> Project: Flink
> Issue Type: Improvement
> Components: Startup Shell Scripts
> Affects Versions: 1.2.0
> Environment: YARN, EMR
> Reporter: Craig Foster
> Assignee: Dawid Wysakowicz
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.6.0, 1.7.0
>
> Attachments: patch0-add-yarn-hadoop-conf.diff
>
>
> Right now we require end-users to set YARN_CONF_DIR or HADOOP_CONF_DIR and
> sometimes FLINK_CONF_DIR.
> For example, in [1], it is stated:
> “Please note that the Client requires the YARN_CONF_DIR or HADOOP_CONF_DIR
> environment variable to be set to read the YARN and HDFS configuration.”
> In BigTop, we set this with /etc/flink/default and then a wrapper is created
> to source that. However, this is slightly cumbersome and we don't have a
> central place within the Flink project itself to source environment
> variables. config.sh could do this but it doesn't have information about
> FLINK_CONF_DIR. For YARN and Hadoop variables, I already have a solution that
> would add "env.yarn.confdir" and "env.hadoop.confdir" variables to the
> flink-conf.yaml file and then we just symlink /etc/lib/flink/conf/ and
> /etc/flink/conf.
> But we could also add a flink-env.sh file to set these variables and decouple
> them from config.sh entirely.
> I'd like to know the opinion/preference of others and what would be more
> amenable.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
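As an illustration of the "env.hadoop.confdir" idea in the issue description, a start script could look the key up in flink-conf.yaml whenever the environment variable is not already set. The following is only a sketch under stated assumptions: `read_conf_key` and `resolve_hadoop_conf_dir` are hypothetical helpers written for this example (not functions taken from Flink's config.sh), and letting an already exported HADOOP_CONF_DIR take precedence over the file is an assumption, not something the thread specifies.

```shell
# Hypothetical helper: print the value of a flat "key: value" entry from a
# YAML file, or the supplied default if the key is missing or the file is
# unreadable. If the key appears more than once, the last entry wins.
read_conf_key() {
    local key="$1" default="$2" file="$3" line value=""
    if [ ! -r "$file" ]; then
        printf '%s\n' "$default"
        return 0
    fi
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "$key:"*)
                value="${line#"$key:"}"
                # Trim leading and trailing whitespace around the value.
                value="${value#"${value%%[![:space:]]*}"}"
                value="${value%"${value##*[![:space:]]}"}"
                ;;
        esac
    done < "$file"
    printf '%s\n' "${value:-$default}"
}

# Hypothetical helper: prefer HADOOP_CONF_DIR from the environment (an
# assumption), otherwise fall back to env.hadoop.confdir in the config file.
resolve_hadoop_conf_dir() {
    local conf_file="$1"
    if [ -n "${HADOOP_CONF_DIR:-}" ]; then
        printf '%s\n' "$HADOOP_CONF_DIR"
    else
        read_conf_key "env.hadoop.confdir" "" "$conf_file"
    fi
}
```

A script could then run something like `HADOOP_CONF_DIR=$(resolve_hadoop_conf_dir "$FLINK_CONF_DIR/flink-conf.yaml")` before launching the client. The same lookup would work for "env.yarn.confdir".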
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555400#comment-16555400 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/6388
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555297#comment-16555297 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/6388

Thanks for the review @StefanRRichter and @zentol. Will fix the indentation while merging.
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554376#comment-16554376 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6388

LGTM, except the one remaining indentation problem mentioned by @zentol .
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553370#comment-16553370 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/6388

Initially I thought `flink-env.sh` might make providing the variables a bit more flexible. In the end it is just a bash script, but the longer I think about it, the more convinced I am that it was not the right choice. As you said, it does not go well with the other opts, plus we might confuse some users. I will close this PR and open a new one with `env.hadoop.home` tomorrow.
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553296#comment-16553296 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/6388

If a feature isn't visibly documented, chances are no one will use it ;)

I'm not sure if the configuration page is the right place to put it, as it so far deals exclusively with settings set in `flink-conf.yaml`. Most notably, this line in the introduction sticks out:
```
All configuration is done in conf/flink-conf.yaml, which is expected to be a flat collection of YAML key value pairs with format key: value.
```

You could name it `flink-client-env-sh`; that would make it more obvious that it only applies to the client.

However, I have to ask: why a separate file in the first place? We already have config options for setting environment variables (`env.java.opts`); couldn't we introduce a separate option for clients?
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553015#comment-16553015 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/6388

I added a small section. At first I was a little torn about whether we should make it highly visible in the docs, as it might be misunderstood to mean that those variables will be distributed to the cluster as well, which is not true. What do you think?
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552416#comment-16552416 ]

ASF GitHub Bot commented on FLINK-6222:
---------------------------------------

GitHub user dawidwys opened a pull request:

https://github.com/apache/flink/pull/6388

[FLINK-6222] Allow passing env variables to start scripts via

Added possibility to pass env variables (e.g. HADOOP_CONF_DIR) through flink-env.sh

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dawidwys/flink FLINK-6222

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6388.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6388

commit 7f700d36f2057b38a4ec8873444c3f488447e241
Author: Dawid Wysakowicz
Date: 2018-07-19T13:36:56Z

[FLINK-6222] Allow passing env variables to start scripts via flink-env.sh
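For readers unfamiliar with the pattern this pull request describes, passing variables such as HADOOP_CONF_DIR to the start scripts through flink-env.sh, a rough sketch follows. It is not the code from the pull request: `load_env_file` is a hypothetical function name, and looking for flink-env.sh directly inside the configuration directory is an assumption made for the example.

```shell
# Hypothetical helper: source an optional flink-env.sh from the given
# configuration directory so that any variables it exports become visible
# to the calling script. A missing file is silently ignored.
load_env_file() {
    local conf_dir="$1"
    local env_file="$conf_dir/flink-env.sh"
    if [ -f "$env_file" ]; then
        # shellcheck disable=SC1090
        . "$env_file"
    fi
}
```

A start script would call `load_env_file "$FLINK_CONF_DIR"` early on, and an operator would put a line such as `export HADOOP_CONF_DIR=/etc/hadoop/conf` into the file. As discussed elsewhere in this thread, variables set this way affect only the client-side scripts and are not distributed to the cluster, and the pull request was eventually closed in favour of a configuration option.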
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541513#comment-16541513 ]

Dawid Wysakowicz commented on FLINK-6222:
-----------------------------------------

Hi [~foscraig],

Personally, I would be in favor of adding the flink-env.sh script. Would you still like to work on this issue, or can I take over?
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525901#comment-16525901 ]

Hequn Cheng commented on FLINK-6222:
------------------------------------

Hi [~foscraig],

Thanks for your patch. It is recommended to open a pull request on GitHub. You can find more details [here|https://flink.apache.org/contribute-code.html]. A pull request is more convenient to review.

Thanks, Hequn.
[jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
[ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257610#comment-16257610 ]

Craig Foster commented on FLINK-6222:
-------------------------------------

I'm submitting a patch here to see if we can move this along.