[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-08-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592297#comment-16592297
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

ruankd commented on issue #4566: [FLINK-7477] [FLINK-7480] Various improvements 
to Flink scripts
URL: https://github.com/apache/flink/pull/4566#issuecomment-415904348
 
 
   Hey, I noticed that this 
[commit](https://github.com/apache/flink/commit/0a0f6ed6c3d6cff702e4322293340274bea5e7d9)
 is part of this PR, but it has not been merged into the 1.5 or 1.6 branches, 
nor into master. Will it be merged?




> Use "hadoop classpath" to augment classpath when available
> --
>
> Key: FLINK-7477
> URL: https://issues.apache.org/jira/browse/FLINK-7477
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Major
> Fix For: 1.4.0
>
>
> Currently, some cloud environments don't properly put the Hadoop jars into 
> {{HADOOP_CLASSPATH}} (or don't set {{HADOOP_CLASSPATH}}) at all. We should 
> check in {{config.sh}} if the {{hadoop}} binary is on the path and augment 
> our {{INTERNAL_HADOOP_CLASSPATHS}} with the result of {{hadoop classpath}} in 
> our scripts.
> This will improve the out-of-box experience of users that otherwise have to 
> manually set {{HADOOP_CLASSPATH}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-05-23 Thread Keda Ruan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488213#comment-16488213
 ] 

Keda Ruan commented on FLINK-7477:
--

Hey, just curious whether this commit will be merged into the 1.5.x release?

Thanks.



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-26 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377679#comment-16377679
 ] 

Ken Krugler commented on FLINK-7477:


The odd stuff (some of which might be bogus)...
 # I had to explicitly add {{kryo.serializers}} as a dependency.
 # Ditto for {{org.jdom:jdom}}, which our {{Tika}} dependency should have 
pulled in transitively, but it was missing.
 # A bunch of stuff with getting integration tests working (including 
{{maven-failsafe-plugin}} and {{build-helper-maven-plugin}} among others), but 
that just happened to be at the same time as the AWS client class issue, so 
unrelated.

I'm not sure how different our (non-Flink) shaded exclusion list wound up being 
from "regular" Flink, but here's what it is now:

 
{code:java}
log4j:log4j
org.scala-lang:scala-library
org.scala-lang:scala-compiler
org.scala-lang:scala-reflect
com.data-artisans:flakka-actor_*
com.data-artisans:flakka-remote_*
com.data-artisans:flakka-slf4j_*
io.netty:netty-all
io.netty:netty
commons-fileupload:commons-fileupload
org.apache.avro:avro
commons-collections:commons-collections
org.codehaus.jackson:jackson-core-asl
org.codehaus.jackson:jackson-mapper-asl
com.thoughtworks.paranamer:paranamer
org.xerial.snappy:snappy-java
org.apache.commons:commons-compress
org.tukaani:xz
com.esotericsoftware.kryo:kryo
com.esotericsoftware.minlog:minlog
org.objenesis:objenesis
com.twitter:chill_*
com.twitter:chill-java
commons-lang:commons-lang
junit:junit
org.apache.commons:commons-lang3
org.slf4j:slf4j-api
org.slf4j:slf4j-log4j12
log4j:log4j
org.apache.commons:commons-math
org.apache.sling:org.apache.sling.commons.json
commons-logging:commons-logging
commons-codec:commons-codec
stax:stax-api
com.typesafe:config
org.uncommons.maths:uncommons-maths
com.github.scopt:scopt_*
commons-io:commons-io
commons-cli:commons-cli
{code}



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-19 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368926#comment-16368926
 ] 

Aljoscha Krettek commented on FLINK-7477:
-

Could you maybe comment on that magic?



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-18 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368677#comment-16368677
 ] 

Ken Krugler commented on FLINK-7477:


It works (at least when running with YARN via EMR). I believe that's because 
the version of Hadoop on the EMR master matches what we're running against; on 
my machine, I have to switch between multiple versions of Hadoop for various 
(consulting) clients who are on different versions of Hadoop, and my {{hadoop}} 
symlink wound up pointing to a different version of Hadoop than what Flink was 
using.

Related note - the 1.4 release fixed some shading issues we were running into 
with AWS client classes (mostly around {{HttpCore}} stuff), but to get 
everything working properly I felt like I did some voodoo with class exclusions 
in the {{maven-shade-plugin}} section of my {{pom.xml}}, which still feels 
fragile.



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-17 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368202#comment-16368202
 ] 

Aljoscha Krettek commented on FLINK-7477:
-

[~kkrugler] So on YARN, does the current setup work for you, or do you also 
have to remove the {{hadoop classpath}} parts from the scripts to make it work?



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-16 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367750#comment-16367750
 ] 

Ken Krugler commented on FLINK-7477:


Hi [~aljoscha] - I encountered this issue when running locally (using 
{{bin/start-local.sh}}). And yes, on YARN I would expect that the Hadoop jars 
are added to the classpath on the nodes. The challenge comes from code that 
executes as part of creating/submitting the job, where it also needs Hadoop (or 
AWS) support, but you don't want to include those jars in the uber jar for 
obvious reasons. In that case ensuring the Hadoop/etc jars are on the classpath 
when main() is executing, _and_ they match the version being used by YARN, is 
critical and is a common source of problems (for Flink and regular Hadoop jobs).



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-16 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366735#comment-16366735
 ] 

Aljoscha Krettek commented on FLINK-7477:
-

I created FLINK-8668



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-02-16 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366733#comment-16366733
 ] 

Aljoscha Krettek commented on FLINK-7477:
-

[~kkrugler] I added this code but now I would be in favour of just removing it, 
along with the code that uses {{hbase classpath}}. 

Out of curiosity, are you running on YARN? Shouldn't this also include the 
Hadoop dependencies in your classpath anyways when executing the TaskManagers 
on YARN?



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2018-01-31 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347685#comment-16347685
 ] 

Ken Krugler commented on FLINK-7477:


I posted to the mailing list about an issue that this change seemed to create 
for me, but didn't hear back.
{quote}With Flink 1.4 and FLINK-7477, I ran into a problem with jar versions 
for HttpCore, when using the AWS SDK to read from S3.
I believe the issue is that even when setting classloader.resolve-order to 
child-first in flink-conf.yaml, the change to put all jars returned by “hadoop 
classpath” on the classpath means that classes in these jars are found before 
the classes in my shaded Flink uber jar.
If I ensure that I don’t have the “hadoop” command set up on my Bash path, then 
I don’t run into this issue.
Does this make sense, or is there something else going on that I can fix to 
avoid this situation?{quote}
 
Any input? Thanks...Ken
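
The workaround in the quote, keeping {{hadoop}} off the shell's path, can be tried without uninstalling anything. The following is a minimal sketch, assuming a POSIX shell; `path_without_hadoop` is a hypothetical helper and not part of Flink. It prints a copy of a PATH-style string with every directory that holds an executable file named `hadoop` dropped.

```shell
# Hypothetical helper (not part of Flink): print the PATH-style string in $1
# with every directory that contains an executable file named "hadoop" removed.
# The body runs in a subshell, so the IFS and globbing changes do not leak out.
path_without_hadoop() (
    IFS=:
    set -f   # do not expand glob characters while splitting on ":"
    result=""
    for dir in $1; do
        if [ -f "$dir/hadoop" ] && [ -x "$dir/hadoop" ]; then
            continue
        fi
        result="${result:+$result:}$dir"
    done
    printf '%s\n' "$result"
)
```

Under these assumptions, `PATH="$(path_without_hadoop "$PATH")" bin/start-local.sh` would hide `hadoop` for that one command only, which makes it easy to compare behaviour with and without the augmented classpath.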



[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135189#comment-16135189
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/4566




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134547#comment-16134547
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/4566
  
It's a bash command, AFAIK.




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134351#comment-16134351
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/4566
  
+1 to merge (assuming `command` is available on all operating systems / or 
is a bash command?)




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133174#comment-16133174
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/4566
  
@rmetzger I addressed your comments.




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133120#comment-16133120
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/4566#discussion_r133985865
  
--- Diff: flink-dist/src/main/flink-bin/bin/config.sh ---
@@ -351,8 +351,20 @@ if [ -z "$HADOOP_CONF_DIR" ]; then
     fi
 fi
 
+# try and set HADOOP_CONF_DIR to some common default if it's not set
+if [ -z "$HADOOP_CONF_DIR" ]; then
+    if [ -d "/etc/hadoop/conf" ]; then
+        HADOOP_CONF_DIR="/etc/hadoop/conf"
+    fi
+fi
+
 INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
 
+# check if the "hadoop" binary is available, if yes, use that to augment the CLASSPATH
+if command -v hadoop >/dev/null 2>&1; then
+    INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:`hadoop classpath`"
--- End diff --

I would actually append to `INTERNAL_HADOOP_CLASSPATHS` instead of 
overwriting it.
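
The suggestion amounts to building the base value once and then extending it. The following is a minimal sketch of that idea, not the code that was eventually merged; `append_hadoop_classpath` is a hypothetical helper name, and the variable names are taken from the diff above.

```shell
# Hypothetical helper: print $1 unchanged, or $1 followed by the output of
# `hadoop classpath` when a `hadoop` command can be found.
append_hadoop_classpath() {
    if command -v hadoop >/dev/null 2>&1; then
        printf '%s:%s\n' "$1" "$(hadoop classpath)"
    else
        printf '%s\n' "$1"
    fi
}

# Build the base value once, as in the existing script ...
INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
# ... then extend it rather than repeating the three variables.
INTERNAL_HADOOP_CLASSPATHS="$(append_hadoop_classpath "$INTERNAL_HADOOP_CLASSPATHS")"
```

Appending keeps the three base entries defined in one place, so a later change to them cannot get out of sync with the `hadoop classpath` branch.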




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133119#comment-16133119
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/4566#discussion_r133985612
  
--- Diff: flink-dist/src/main/flink-bin/bin/config.sh ---
@@ -351,8 +351,20 @@ if [ -z "$HADOOP_CONF_DIR" ]; then
     fi
 fi
 
+# try and set HADOOP_CONF_DIR to some common default if it's not set
+if [ -z "$HADOOP_CONF_DIR" ]; then
+    if [ -d "/etc/hadoop/conf" ]; then
+        HADOOP_CONF_DIR="/etc/hadoop/conf"
--- End diff --

I would suggest printing a message to tell the user that we are using this 
HADOOP_CONF_DIR.
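
One way the fallback could announce itself is sketched below. This is an illustration only: the function name `set_default_hadoop_conf_dir`, its optional argument, and the message wording are assumptions, not what the PR ended up containing.

```shell
# Hypothetical helper: if HADOOP_CONF_DIR is empty and the candidate directory
# exists, use the candidate and tell the user about it.
# $1 is the candidate directory; it defaults to /etc/hadoop/conf as in the diff.
set_default_hadoop_conf_dir() {
    candidate=${1:-/etc/hadoop/conf}
    if [ -z "$HADOOP_CONF_DIR" ] && [ -d "$candidate" ]; then
        HADOOP_CONF_DIR="$candidate"
        # Written to stderr so that callers capturing stdout are not affected.
        echo "Setting HADOOP_CONF_DIR=$candidate because no HADOOP_CONF_DIR was set." >&2
    fi
}

set_default_hadoop_conf_dir
```

Writing the notice to stderr is a design choice made for this sketch; a script whose stdout is never parsed could just as well print it there.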




[jira] [Commented] (FLINK-7477) Use "hadoop classpath" to augment classpath when available

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133044#comment-16133044
 ] 

ASF GitHub Bot commented on FLINK-7477:
---

GitHub user aljoscha opened a pull request:

https://github.com/apache/flink/pull/4566

[FLINK-7477] [FLINK-7480] Various improvements to Flink scripts



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink hadoop-env-improvements

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4566.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4566


commit 6b4d7e5e09dcd913fbb9c84c59fc8a10e6c662cc
Author: Aljoscha Krettek 
Date:   2017-08-18T14:39:41Z

[FLINK-7477] Use "hadoop classpath" to augment classpath when available

This improves the out-of-box experience on GCE and AWS, both of which
don't set a HADOOP_CLASSPATH but have "hadoop" available on the $PATH.

commit f63e2d03d739014f0cd94634d731e552a02c76d9
Author: Aljoscha Krettek 
Date:   2017-08-18T14:40:55Z

[FLINK-7480] Set HADOOP_CONF_DIR to sane default if not set

This improves the out-of-box experience on GCE and AWS, both of which
don't set HADOOP_CONF_DIR by default but use /etc/hadoop/conf



