[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592297#comment-16592297 ]

ASF GitHub Bot commented on FLINK-7477:
---

ruankd commented on issue #4566: [FLINK-7477] [FLINK-7480] Various improvements to Flink scripts
URL: https://github.com/apache/flink/pull/4566#issuecomment-415904348

Hey, I noticed that this [commit](https://github.com/apache/flink/commit/0a0f6ed6c3d6cff702e4322293340274bea5e7d9) is part of this PR, but it is not merged into branches 1.5 and 1.6, nor into master. I wonder whether it will be merged?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Use "hadoop classpath" to augment classpath when available
> --
>
> Key: FLINK-7477
> URL: https://issues.apache.org/jira/browse/FLINK-7477
> Project: Flink
> Issue Type: Bug
> Components: Startup Shell Scripts
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Major
> Fix For: 1.4.0
>
>
> Currently, some cloud environments don't properly put the Hadoop jars into
> {{HADOOP_CLASSPATH}} (or don't set {{HADOOP_CLASSPATH}} at all). We should
> check in {{config.sh}} if the {{hadoop}} binary is on the path and augment
> our {{INTERNAL_HADOOP_CLASSPATHS}} with the result of {{hadoop classpath}} in
> our scripts.
> This will improve the out-of-box experience of users that otherwise have to
> manually set {{HADOOP_CLASSPATH}}.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488213#comment-16488213 ]

Keda Ruan commented on FLINK-7477:
--

Hey, just curious whether this commit will be merged into the 1.5.x release? Thanks.
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377679#comment-16377679 ]

Ken Krugler commented on FLINK-7477:

The odd stuff (some of which might be bogus)...
# I had to explicitly add {{kryo.serializers}} as a dependency.
# Ditto for {{org.jdom:jdom}}, which our {{Tika}} dependency should have pulled in transitively, but it was missing.
# A bunch of stuff with getting integration tests working (including {{maven-failsafe-plugin}} and {{build-helper-maven-plugin}} among others), but that just happened to be at the same time as the AWS client class issue, so unrelated.

Not sure how different our (non-flink) shaded exclusion list wound up being from "regular" Flink, here's what it is now:
{code:java}
log4j:log4j
org.scala-lang:scala-library
org.scala-lang:scala-compiler
org.scala-lang:scala-reflect
com.data-artisans:flakka-actor_*
com.data-artisans:flakka-remote_*
com.data-artisans:flakka-slf4j_*
io.netty:netty-all
io.netty:netty
commons-fileupload:commons-fileupload
org.apache.avro:avro
commons-collections:commons-collections
org.codehaus.jackson:jackson-core-asl
org.codehaus.jackson:jackson-mapper-asl
com.thoughtworks.paranamer:paranamer
org.xerial.snappy:snappy-java
org.apache.commons:commons-compress
org.tukaani:xz
com.esotericsoftware.kryo:kryo
com.esotericsoftware.minlog:minlog
org.objenesis:objenesis
com.twitter:chill_*
com.twitter:chill-java
commons-lang:commons-lang
junit:junit
org.apache.commons:commons-lang3
org.slf4j:slf4j-api
org.slf4j:slf4j-log4j12
log4j:log4j
org.apache.commons:commons-math
org.apache.sling:org.apache.sling.commons.json
commons-logging:commons-logging
commons-codec:commons-codec
stax:stax-api
com.typesafe:config
org.uncommons.maths:uncommons-maths
com.github.scopt:scopt_*
commons-io:commons-io
commons-cli:commons-cli
{code}
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368926#comment-16368926 ]

Aljoscha Krettek commented on FLINK-7477:
-

Could you maybe comment on that magic?
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368677#comment-16368677 ]

Ken Krugler commented on FLINK-7477:

It works (at least when running with YARN via EMR). I believe that's because the version of Hadoop on the EMR master matches what we're running against; on my machine, I have to switch between multiple versions of Hadoop for various (consulting) clients who are on different versions of Hadoop, and my {{hadoop}} symlink wound up pointing to a different version of Hadoop than what Flink was using.

Related note - the 1.4 release fixed some shading issues we were running into with AWS client classes (mostly around {{HttpCore}} stuff), but to get everything working properly I felt like I did some voodoo with class exclusions in the {{maven-shade-plugin}} section of my {{pom.xml}}, which still feels fragile.
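The version mismatch described above can be spotted before a job is submitted. The following is a minimal sketch, assuming that the first line of `hadoop version` has the form `Hadoop <version>`. The function name `hadoop_version_on_path` and the variable `EXPECTED_HADOOP_VERSION` are hypothetical and are not part of Flink or Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: report the version of whatever `hadoop` resolves to on the PATH.
# Assumes the first line of `hadoop version` looks like "Hadoop 2.8.3".
# hadoop_version_on_path and EXPECTED_HADOOP_VERSION are hypothetical names.
hadoop_version_on_path() {
    # Bail out early if no `hadoop` command can be resolved.
    command -v hadoop >/dev/null 2>&1 || return 1
    hadoop version 2>/dev/null | awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Warn when the resolved version differs from the one the job was built for.
if [ -n "${EXPECTED_HADOOP_VERSION:-}" ]; then
    actual="$(hadoop_version_on_path)"
    if [ "$actual" != "$EXPECTED_HADOOP_VERSION" ]; then
        echo "hadoop on PATH reports '${actual:-none}', expected '$EXPECTED_HADOOP_VERSION'" >&2
    fi
fi
```

Setting `EXPECTED_HADOOP_VERSION` in the environment before sourcing the snippet turns the check on; leaving it unset makes the snippet a no-op apart from defining the function.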
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368202#comment-16368202 ]

Aljoscha Krettek commented on FLINK-7477:
-

[~kkrugler] So on YARN, does the current setup work for you, or do you also have to remove the {{hadoop classpath}} parts from the scripts to make it work?
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367750#comment-16367750 ]

Ken Krugler commented on FLINK-7477:

Hi [~aljoscha] - I encountered this issue when running locally (using {{bin/start-local.sh}}).

And yes, on YARN I would expect that the Hadoop jars are added to the classpath on the nodes.

The challenge comes from code that executes as part of creating/submitting the job, where it also needs Hadoop (or AWS) support, but you don't want to include those jars in the uber jar for obvious reasons. In that case ensuring the Hadoop/etc jars are on the classpath when main() is executing, _and_ they match the version being used by YARN, is critical and is a common source of problems (for Flink and regular Hadoop jobs).
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366735#comment-16366735 ]

Aljoscha Krettek commented on FLINK-7477:
-

I created FLINK-8668
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366733#comment-16366733 ]

Aljoscha Krettek commented on FLINK-7477:
-

[~kkrugler] I added this code but now I would be in favour of just removing it, along with the code that uses {{hbase classpath}}.

Out of curiosity, are you running on YARN? Shouldn't this also include the Hadoop dependencies in your classpath anyways when executing the TaskManagers on YARN?
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347685#comment-16347685 ]

Ken Krugler commented on FLINK-7477:

I posted to the mailing list about an issue that this change seemed to create for me, but didn't hear back.

{quote}With Flink 1.4 and FLINK-7477, I ran into a problem with jar versions for HttpCore, when using the AWS SDK to read from S3.

I believe the issue is that even when setting classloader.resolve-order to child-first in flink-conf.yaml, the change to put all jars returned by “hadoop classpath” on the classpath means that classes in these jars are found before the classes in my shaded Flink uber jar.

If I ensure that I don’t have the “hadoop” command set up on my Bash path, then I don’t run into this issue.

Does this make sense, or is there something else going on that I can fix to avoid this situation?{quote}

Any input?

Thanks...Ken
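The workaround mentioned in the quote (making sure `hadoop` is not on the path) can be applied for a single invocation rather than for the whole shell session. This is a rough sketch under that assumption; the helper name `path_without_hadoop` is hypothetical, and it only hides `hadoop` from the startup scripts without addressing the class loading order itself.

```shell
#!/usr/bin/env bash
# Sketch: print a copy of a PATH-like string with every directory that
# contains an executable file named `hadoop` removed.
# path_without_hadoop is a hypothetical helper, not part of Flink.
path_without_hadoop() {
    (
        # Run in a subshell so the IFS and globbing changes do not leak out.
        set -f
        IFS=':'
        result=""
        for dir in $1; do
            if [ -f "$dir/hadoop" ] && [ -x "$dir/hadoop" ]; then
                continue
            fi
            result="${result:+$result:}$dir"
        done
        printf '%s\n' "$result"
    )
}

# Usage (not executed here): start a local cluster with `hadoop` hidden.
# PATH="$(path_without_hadoop "$PATH")" bin/start-local.sh
```

Because the modified `PATH` is passed only to the one command, other tools in the same session still see the regular `hadoop` binary.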
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135189#comment-16135189 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha closed the pull request at:

    https://github.com/apache/flink/pull/4566
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134547#comment-16134547 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/4566

    It's a bash command, AFAIK.
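For context on the portability question: `command -v` is specified by POSIX and is implemented as a builtin in bash, so it does not depend on an external binary being installed. A minimal sketch of the check follows; the helper name `has_command` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: test whether a command name resolves to something runnable.
# has_command is a hypothetical helper name, not part of config.sh.
has_command() {
    # `command -v` prints the resolution and returns non-zero if there is none.
    command -v "$1" >/dev/null 2>&1
}

if has_command hadoop; then
    echo "hadoop resolves to: $(command -v hadoop)"
else
    echo "hadoop not found on PATH"
fi
```

Note that `command -v` also reports shell functions and aliases, not only files on the `PATH`.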
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134351#comment-16134351 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/4566

    +1 to merge (assuming `command` is available on all operating systems / or is a bash command?)
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133174#comment-16133174 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/4566

    @rmetzger I addressed your comments.
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133120#comment-16133120 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4566#discussion_r133985865

    --- Diff: flink-dist/src/main/flink-bin/bin/config.sh ---
    @@ -351,8 +351,20 @@ if [ -z "$HADOOP_CONF_DIR" ]; then
         fi
     fi

    +# try and set HADOOP_CONF_DIR to some common default if it's not set
    +if [ -z "$HADOOP_CONF_DIR" ]; then
    +    if [ -d "/etc/hadoop/conf" ]; then
    +        HADOOP_CONF_DIR="/etc/hadoop/conf"
    +    fi
    +fi
    +
     INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"

    +# check if the "hadoop" binary is available, if yes, use that to augment the CLASSPATH
    +if command -v hadoop >/dev/null 2>&1; then
    +    INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:`hadoop classpath`"
    --- End diff --

    I would actually append to `INTERNAL_HADOOP_CLASSPATHS` instead of overwriting it.
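The review suggestion above (append rather than overwrite) could look roughly like the following. This is a sketch only and not necessarily the code that was merged; the function name `augment_hadoop_classpath` is hypothetical and exists so that the logic can be exercised in isolation, whereas `config.sh` does this inline.

```shell
#!/usr/bin/env bash
# Sketch: append the output of `hadoop classpath` to an existing value
# instead of rebuilding the whole variable from its parts.
# augment_hadoop_classpath is a hypothetical helper name.
augment_hadoop_classpath() {
    (
        current="$1"
        # Only call `hadoop` when the shell can actually resolve it.
        if command -v hadoop >/dev/null 2>&1; then
            extra="$(hadoop classpath 2>/dev/null)"
            if [ -n "$extra" ]; then
                current="${current}:${extra}"
            fi
        fi
        printf '%s\n' "$current"
    )
}

INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH:-}:${HADOOP_CONF_DIR:-}:${YARN_CONF_DIR:-}"
INTERNAL_HADOOP_CLASSPATHS="$(augment_hadoop_classpath "$INTERNAL_HADOOP_CLASSPATHS")"
```

Appending keeps any entries that earlier parts of the script may already have placed in `INTERNAL_HADOOP_CLASSPATHS`, and it leaves the value untouched when `hadoop` is missing or prints nothing.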
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133119#comment-16133119 ]

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4566#discussion_r133985612

    --- Diff: flink-dist/src/main/flink-bin/bin/config.sh ---
    @@ -351,8 +351,20 @@ if [ -z "$HADOOP_CONF_DIR" ]; then
         fi
     fi

    +# try and set HADOOP_CONF_DIR to some common default if it's not set
    +if [ -z "$HADOOP_CONF_DIR" ]; then
    +    if [ -d "/etc/hadoop/conf" ]; then
    +        HADOOP_CONF_DIR="/etc/hadoop/conf"
    --- End diff --

    I would suggest printing a message to tell the user that we are using this HADOOP_CONF_DIR.
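One way to act on this suggestion is sketched below. It assumes the notice should go to stderr; the function name `default_hadoop_conf_dir`, its optional argument, and the message wording are all hypothetical and are used here only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: fall back to a common default for HADOOP_CONF_DIR and tell the
# user about it. The function name, its optional argument, and the message
# text are hypothetical.
default_hadoop_conf_dir() {
    # $1 is the candidate directory; defaults to /etc/hadoop/conf.
    if [ -z "${HADOOP_CONF_DIR:-}" ] && [ -d "${1:-/etc/hadoop/conf}" ]; then
        HADOOP_CONF_DIR="${1:-/etc/hadoop/conf}"
        # Write to stderr so the notice does not pollute captured stdout.
        echo "Setting HADOOP_CONF_DIR=$HADOOP_CONF_DIR because no HADOOP_CONF_DIR was set." >&2
    fi
}

default_hadoop_conf_dir
```

A value that the user has already exported is never overridden, and nothing is printed in that case.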
[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available
[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133044#comment-16133044 ]

ASF GitHub Bot commented on FLINK-7477:
---

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/4566

    [FLINK-7477] [FLINK-7480] Various improvements to Flink scripts

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink hadoop-env-improvements

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4566.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4566

commit 6b4d7e5e09dcd913fbb9c84c59fc8a10e6c662cc
Author: Aljoscha Krettek
Date: 2017-08-18T14:39:41Z

    [FLINK-7477] Use "hadoop classpath" to augment classpath when available

    This improves the out-of-box experience on GCE and AWS, both of which don't set a HADOOP_CLASSPATH but have "hadoop" available on the $PATH.

commit f63e2d03d739014f0cd94634d731e552a02c76d9
Author: Aljoscha Krettek
Date: 2017-08-18T14:40:55Z

    [FLINK-7480] Set HADOOP_CONF_DIR to sane default if not set

    This improves the out-of-box experience on GCE and AWS, both of which don't set HADOOP_CONF_DIR by default but use /etc/hadoop/conf