[
https://issues.apache.org/jira/browse/FLINK-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238105#comment-17238105
]
Matthias edited comment on FLINK-20267 at 11/24/20, 1:06 PM:
-------------------------------------------------------------
The fix in PR #14171 was tested on an AWS EMR cluster:
* Pre-fix version:
{code:java}
[hadoop@ip-172-31-36-74 ~]$ ./flink-1.12-pre/bin/flink run -m yarn-cluster -p 4 -yjm 1024m -ytm 4096m flink-1.12-pre/examples/streaming/StateMachineExample.jar
Setting HBASE_CONF_DIR=/etc/hbase/conf because no HBASE_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/flink-1.12-pre/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
java.lang.RuntimeException: unable to generate a JAAS configuration file
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:170)
	at org.apache.flink.runtime.security.modules.JaasModule.install(JaasModule.java:94)
	at org.apache.flink.runtime.security.SecurityUtils.installModules(SecurityUtils.java:78)
	at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:59)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1045)
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
	at java.nio.file.Files.createDirectories(Files.java:727)
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:162)
	... 4 more{code}
* Post-fix version:
{code:java}
[hadoop@ip-172-31-36-74 ~]$ ./flink-1.12-SNAPSHOT/bin/flink run -m yarn-cluster -p 4 -yjm 1024m -ytm 4096m flink-1.12-SNAPSHOT/examples/streaming/StateMachineExample.jar
Setting HBASE_CONF_DIR=/etc/hbase/conf because no HBASE_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/flink-1.12-SNAPSHOT/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Usage with built-in data generator: StateMachineExample [--error-rate <probability-of-invalid-transition>] [--sleep <sleep-per-record-in-ms>]
Usage with Kafka: StateMachineExample --kafka-topic <topic> [--brokers <brokers>]
Options for both the above setups:
	[--backend <file|rocks>]
	[--checkpoint-dir <filepath>]
	[--async-checkpoints <true|false>]
	[--incremental-checkpoints <true|false>]
	[--output <filepath> OR null for stdout]
Using standalone source with error rate 0.000000 and sleep delay 1 millis
2020-11-24 12:53:38,616 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/home/hadoop/flink-1.12-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-11-24 12:53:38,767 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at ip-172-31-36-74.eu-central-1.compute.internal/172.31.36.74:8032
2020-11-24 12:53:38,892 INFO org.apache.hadoop.yarn.client.AHSProxy [] - Connecting to Application History server at ip-172-31-36-74.eu-central-1.compute.internal/172.31.36.74:10200
2020-11-24 12:53:38,902 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-11-24 12:53:39,044 INFO org.apache.hadoop.conf.Configuration [] - resource-types.xml not found
2020-11-24 12:53:39,045 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Unable to find 'resource-types.xml'.
2020-11-24 12:53:39,050 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE
2020-11-24 12:53:39,050 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Adding resource type - name = vcores, units = , type = COUNTABLE
2020-11-24 12:53:39,052 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-11-24 12:53:39,107 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=1}
2020-11-24 12:53:40,475 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1606220091660_0001
2020-11-24 12:53:40,806 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1606220091660_0001
2020-11-24 12:53:40,806 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2020-11-24 12:53:40,809 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2020-11-24 12:53:46,104 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2020-11-24 12:53:46,105 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface ip-172-31-38-181.eu-central-1.compute.internal:43067 of application 'application_1606220091660_0001'.
Job has been submitted with JobID 41d1df99e30414d581183c58aed615d9{code}
> JaasModule prevents Flink from starting if working directory is a symbolic
> link
> -------------------------------------------------------------------------------
>
> Key: FLINK-20267
> URL: https://issues.apache.org/jira/browse/FLINK-20267
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Till Rohrmann
> Assignee: Matthias
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
> [~AHeise] reported that starting Flink on EMR fails with
> {code}
> java.lang.RuntimeException: unable to generate a JAAS configuration file
> 	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:170)
> 	at org.apache.flink.runtime.security.modules.JaasModule.install(JaasModule.java:94)
> 	at org.apache.flink.runtime.security.SecurityUtils.installModules(SecurityUtils.java:78)
> 	at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:59)
> 	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1045)
> Caused by: java.nio.file.FileAlreadyExistsException: /tmp
> 	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
> 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> 	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
> 	at java.nio.file.Files.createDirectory(Files.java:674)
> 	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
> 	at java.nio.file.Files.createDirectories(Files.java:727)
> 	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:162)
> 	... 4 more
> {code}
> The problem is that on EMR {{/tmp}} is a symbolic link. FLINK-19252
> introduced the [creation of the working
> directory|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/JaasModule.java#L162]
> in order to create the default JAAS config file. As a result, the startup
> process fails if the path of the working directory is a symbolic link rather
> than an actual directory (apparently {{Files.createDirectories}} cannot deal
> with symbolic links).
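The failure mode described in the issue can be sidestepped in several ways. The following is a minimal Java sketch of one of them, not the change made in PR #14171 (its contents are not shown in this message). It assumes a hypothetical helper named {{ensureDirectory}}: if something already exists at the path, the path is resolved through any symbolic links and checked, and {{Files.createDirectories}} is only called when nothing exists there yet. The class name and the temporary paths used in {{main}} are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkSafeDirectories {

    /**
     * Hypothetical helper (an assumption for illustration, not Flink code):
     * makes sure that {@code dir} is usable as a directory, even if it is a
     * symbolic link that points to a directory.
     */
    static Path ensureDirectory(Path dir) throws IOException {
        // Files.exists follows symbolic links by default.
        if (Files.exists(dir)) {
            // toRealPath() without options resolves symbolic links.
            Path real = dir.toRealPath();
            if (!Files.isDirectory(real)) {
                throw new IOException(dir + " exists but does not resolve to a directory");
            }
            return real;
        }
        // Nothing exists at this location yet, so create it with its parents.
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        // Simulate the EMR setup: a symbolic link that points to a real directory.
        Path target = Files.createTempDirectory("jaas-target");
        Path link = target.resolveSibling(target.getFileName() + "-link");
        Files.createSymbolicLink(link, target);

        Path resolved = ensureDirectory(link);
        System.out.println("Resolved " + link + " to " + resolved);

        // Clean up the illustrative paths.
        Files.delete(link);
        Files.delete(target);
    }
}
```

Creating the symbolic link in {{main}} requires a file system and permissions that allow links, which is typically the case on Linux hosts such as the EMR node above.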