[ https://issues.apache.org/jira/browse/FLINK-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367750#comment-16367750 ]

Ken Krugler commented on FLINK-7477:
------------------------------------

Hi [~aljoscha] - I encountered this issue when running locally (using 
{{bin/start-local.sh}}). And yes, on YARN I would expect the Hadoop jars to be 
added to the classpath on the nodes. The challenge comes from code that 
executes as part of creating/submitting the job, where it also needs Hadoop (or 
AWS) support, but you don't want to include those jars in the uber jar for 
obvious reasons. In that case, ensuring that the Hadoop (etc.) jars are on the 
classpath while main() is executing, _and_ that they match the version used by 
YARN, is critical and a common source of problems (for Flink and regular 
Hadoop jobs).

> Use "hadoop classpath" to augment classpath when available
> ----------------------------------------------------------
>
>                 Key: FLINK-7477
>                 URL: https://issues.apache.org/jira/browse/FLINK-7477
>             Project: Flink
>          Issue Type: Bug
>          Components: Startup Shell Scripts
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Major
>             Fix For: 1.4.0
>
>
> Currently, some cloud environments don't properly put the Hadoop jars into 
> {{HADOOP_CLASSPATH}} (or don't set {{HADOOP_CLASSPATH}} at all). We should 
> check in {{config.sh}} whether the {{hadoop}} binary is on the path and 
> augment our {{INTERNAL_HADOOP_CLASSPATHS}} with the result of 
> {{hadoop classpath}} in our scripts.
> This will improve the out-of-box experience for users who otherwise have to 
> manually set {{HADOOP_CLASSPATH}}.
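The change described in the issue could be sketched roughly as follows. This is a hedged illustration, not the actual {{config.sh}} code: the function name {{augment_hadoop_classpath}} is invented for the example, and only the general idea (fall back to {{hadoop classpath}} when the {{hadoop}} binary is available and {{HADOOP_CLASSPATH}} is unset) comes from the issue description.

```shell
# Hypothetical sketch of the fallback logic described in FLINK-7477.
# If HADOOP_CLASSPATH is unset but a `hadoop` binary is on the PATH,
# fill it in from `hadoop classpath`, then append it to the internal
# classpath variable the startup scripts use.
augment_hadoop_classpath() {
    # $1: current value of INTERNAL_HADOOP_CLASSPATHS
    cp="$1"
    if [ -z "${HADOOP_CLASSPATH:-}" ] && command -v hadoop >/dev/null 2>&1; then
        HADOOP_CLASSPATH="$(hadoop classpath)"
    fi
    if [ -n "${HADOOP_CLASSPATH:-}" ]; then
        cp="${cp}:${HADOOP_CLASSPATH}"
    fi
    echo "$cp"
}
```

On a machine without Hadoop installed (and with {{HADOOP_CLASSPATH}} unset), the function simply echoes its input unchanged, which is the desired out-of-box behavior.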



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
