It seems that in the current implementation we do not export
HADOOP_CONF_DIR as an environment variable, even though the env.xxx
Flink config options have been set. The configured directory is only
used to construct the classpath for the JM/TM process[1]. However,
"HadoopUtils"[2] does not support getting the Hadoop configuration from
the classpath.
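
To make this concrete, the resolution behaves roughly like the sketch
below. This is a simplified, hand-written illustration (not the actual
HadoopUtils code): the conf dir is resolved from a Flink config option
or from real environment variables, and a directory that is only on the
classpath is never consulted.

    import java.io.File;

    public class HadoopConfLookupSketch {

        // Returns the first existing Hadoop conf dir, or null if none
        // is found. Names and ordering here are approximations.
        static String resolveHadoopConfDir(String pathFromFlinkConf) {
            String hadoopHome = System.getenv("HADOOP_HOME");
            String[] candidates = {
                pathFromFlinkConf,                // a path set in flink-conf.yaml
                System.getenv("HADOOP_CONF_DIR"), // a real env var; env.hadoop.conf.dir is not exported
                hadoopHome == null ? null : hadoopHome + "/etc/hadoop"
            };
            for (String dir : candidates) {
                if (dir != null && new File(dir, "core-site.xml").exists()) {
                    return dir;
                }
            }
            // nothing found -> the "Could not find Hadoop configuration ..." warning
            return null;
        }
    }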


[1].
https://github.com/apache/flink/blob/release-1.11/flink-dist/src/main/flink-bin/bin/config.sh#L256
[2].
https://github.com/apache/flink/blob/release-1.11/flink-connectors/flink-hadoop-compatibility/src/main/java/org/apache/flink/api/java/hadoop/mapred/utils/HadoopUtils.java#L64


Best,
Yang


Flavio Pompermaier <pomperma...@okkam.it> wrote on Fri, Apr 16, 2021 at 3:55 AM:

> Hi Robert,
> indeed my docker-compose setup works only if I also add the Hadoop and
> YARN conf dirs, while I was expecting those two variables to be
> generated automatically just by setting the env.xxx options in the
> FLINK_PROPERTIES variable.
>
> I just want to understand what to expect: do I really need to specify
> the Hadoop and YARN conf dirs as env variables or not?
>
> On Thu, Apr 15, 2021 at 8:39 PM Robert Metzger <rmetz...@apache.org> wrote:
>
>> Hi,
>>
>> I'm not aware of any known issues with Hadoop and Flink on Docker.
>>
>> I also tried what you are doing locally, and it seems to work:
>>
>> flink-jobmanager    | 2021-04-15 18:37:48,300 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Starting StandaloneSessionClusterEntrypoint.
>> flink-jobmanager    | 2021-04-15 18:37:48,338 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Install default filesystem.
>> flink-jobmanager    | 2021-04-15 18:37:48,375 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Install security context.
>> flink-jobmanager    | 2021-04-15 18:37:48,404 INFO  org.apache.flink.runtime.security.modules.HadoopModule       [] - Hadoop user set to flink (auth:SIMPLE)
>> flink-jobmanager    | 2021-04-15 18:37:48,408 INFO  org.apache.flink.runtime.security.modules.JaasModule         [] - Jaas file will be created as /tmp/jaas-811306162058602256.conf.
>> flink-jobmanager    | 2021-04-15 18:37:48,415 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Initializing cluster services.
>>
>> Here's my code:
>>
>> https://gist.github.com/rmetzger/0cf4ba081d685d26478525bf69c7bd39
>>
>> Hope this helps!
>>
>> On Wed, Apr 14, 2021 at 5:37 PM Flavio Pompermaier <pomperma...@okkam.it>
>> wrote:
>>
>>> Hi everybody,
>>> I'm trying to set up reading from HDFS using docker-compose and Flink
>>> 1.11.3.
>>> If I pass 'env.hadoop.conf.dir' and 'env.yarn.conf.dir'
>>> using FLINK_PROPERTIES (under the environment section of the
>>> docker-compose service) I see the following line in the logs:
>>>
>>> "Could not find Hadoop configuration via any of the supported methods"
>>>
>>> If I'm not wrong, this means that HADOOP_CONF_DIR is actually not
>>> generated by the run scripts.
>>> Indeed, if I add HADOOP_CONF_DIR and YARN_CONF_DIR (again under the
>>> environment section of the docker-compose service) I don't see that line.
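>>>
>>> As a quick sanity check (a hypothetical snippet of mine, not something
>>> shipped with Flink), one can run a tiny class inside the jobmanager
>>> container to print the variables the Hadoop lookup relies on:
>>>
>>> public class EnvCheck {
>>>     public static void main(String[] args) {
>>>         // prints "null" when the variable is not exported in the container
>>>         System.out.println("HADOOP_CONF_DIR=" + System.getenv("HADOOP_CONF_DIR"));
>>>         System.out.println("YARN_CONF_DIR=" + System.getenv("YARN_CONF_DIR"));
>>>     }
>>> }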
>>>
>>> Is this the expected behavior?
>>>
>>> Below is the relevant docker-compose service I use (I've removed the
>>> content of HADOOP_CLASSPATH because it is too long, and I left out the
>>> taskmanager service, which is similar):
>>>
>>> flink-jobmanager:
>>>     container_name: flink-jobmanager
>>>     build:
>>>       context: .
>>>       dockerfile: Dockerfile.flink
>>>       args:
>>>         FLINK_VERSION: 1.11.3-scala_2.12-java11
>>>     image: 'flink-test:1.11.3-scala_2.12-java11'
>>>     ports:
>>>       - "8091:8081"
>>>       - "8092:8082"
>>>     command: jobmanager
>>>     environment:
>>>       - |
>>>         FLINK_PROPERTIES=
>>>         jobmanager.rpc.address: flink-jobmanager
>>>         rest.port: 8081
>>>         historyserver.web.port: 8082
>>>         web.upload.dir: /opt/flink
>>>         env.hadoop.conf.dir: /opt/hadoop/conf
>>>         env.yarn.conf.dir: /opt/hadoop/conf
>>>       - |
>>>         HADOOP_CLASSPATH=...
>>>       - HADOOP_CONF_DIR=/opt/hadoop/conf
>>>       - YARN_CONF_DIR=/opt/hadoop/conf
>>>     volumes:
>>>       - 'flink_shared_folder:/tmp/test'
>>>       - 'flink_uploads:/opt/flink/flink-web-upload'
>>>       - 'flink_hadoop_conf:/opt/hadoop/conf'
>>>       - 'flink_hadoop_libs:/opt/hadoop-3.2.1/share'
>>>
>>>
>>> Thanks in advance for any support,
>>> Flavio
>>>
>>
