[
https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958984#comment-14958984
]
ASF GitHub Bot commented on FLINK-1984:
---------------------------------------
Github user rmetzger commented on the pull request:
https://github.com/apache/flink/pull/948#issuecomment-148403603
I tried running the code from this pull request again, this time using the
`mesos-playa` Vagrant image, and it does not work for me.
I was following your instructions.
When did you last test the changes?
My motivation to test this pull request goes down every time I test
it. I've spun up a Mesos cluster on GCE twice, plus the VM now.
Maybe I'm doing it wrong; please let me know what I can do to get it to run.
CLI output:
```
vagrant@mesos:~/flink/build-target$ java \
  -Dlog4j.configuration=file://`pwd`/conf/log4j.properties -Dlog.file=logs.log \
  -cp lib/flink-dist-0.10-SNAPSHOT.jar \
  org.apache.flink.mesos.scheduler.FlinkScheduler --confDir conf/
I1015 14:05:01.591161 9992 sched.cpp:157] Version: 0.22.1
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@712: Client
environment:zookeeper.version=zookeeper C client 3.4.5
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@716: Client
environment:host.name=mesos
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@723: Client
environment:os.name=Linux
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@724: Client
environment:os.arch=3.16.0-30-generic
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@725: Client
environment:os.version=#40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@733: Client
environment:user.name=vagrant
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@741: Client
environment:user.home=/home/vagrant
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@753: Client
environment:user.dir=/home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@zookeeper_init@786:
Initiating client connection, host=127.0.0.1:2181 sessionTimeout=10000
watcher=0x7f67dac33a60 sessionId=0 sessionPasswd=<null> context=0x7f67f0004470
flags=0
2015-10-15 14:05:01,592:9991(0x7f67c6ffd700):ZOO_INFO@check_events@1703:
initiated connection to server [127.0.0.1:2181]
Embedded server listening at
http://127.0.0.1:40815
Press any key to stop.
2015-10-15 14:05:04,959:9991(0x7f67c6ffd700):ZOO_INFO@check_events@1750:
session establishment complete on server [127.0.0.1:2181],
sessionId=0x1506b6312fa000b, negotiated timeout=10000
I1015 14:05:04.959841 10024 group.cpp:313] Group process
(group(1)@127.0.1.1:57437) connected to ZooKeeper
I1015 14:05:04.959899 10024 group.cpp:790] Syncing group operations: queue
size (joins, cancels, datas) = (0, 0, 0)
I1015 14:05:04.959928 10024 group.cpp:385] Trying to create path '/mesos'
in ZooKeeper
I1015 14:05:05.204282 10024 detector.cpp:138] Detected a new leader:
(id='2')
I1015 14:05:05.204489 10024 group.cpp:659] Trying to get
'/mesos/info_0000000002' in ZooKeeper
I1015 14:05:05.303072 10024 detector.cpp:452] A new leading master
([email protected]:5050) is detected
I1015 14:05:05.303467 10024 sched.cpp:254] New master detected at
[email protected]:5050
I1015 14:05:05.303890 10024 sched.cpp:264] No credentials provided.
Attempting to register without authentication
I1015 14:05:05.851562 10024 sched.cpp:448] Framework registered with
20151015-120419-16842879-5050-1244-0000
```
Log file content:
```
14:04:54,564 WARN org.apache.hadoop.util.NativeCodeLoader
- Unable to load native-hadoop library for your platform... using
builtin-java classes where applicable
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
-
--------------------------------------------------------------------------------
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Starting JobManager (Version: 0.10-SNAPSHOT, Rev:d905af0,
Date:06.10.2015 @ 19:37:22 UTC)
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Current user: vagrant
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.79-b02
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Maximum heap size: 592 MiBytes
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- JAVA_HOME: (not set)
14:04:55,823 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Hadoop version: 2.3.0
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- JVM Options:
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
-
-Dlog4j.configuration=file:///home/vagrant/flink/build-target/conf/log4j.properties
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- -Dlog.file=logs.log
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Program Arguments:
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- --confDir
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- conf/
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
-
--------------------------------------------------------------------------------
14:04:55,875 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Maximum number of open file descriptors is 4096
14:04:55,875 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Loading configuration from
/home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT/conf
14:04:58,375 INFO org.apache.flink.runtime.jobmanager.JobManager
- Starting JobManager
14:04:58,377 INFO org.apache.flink.runtime.jobmanager.JobManager
- Starting JobManager actor system at localhost:6123.
14:04:59,700 INFO org.eclipse.jetty.util.log
- jetty-0.10-SNAPSHOT
14:05:01,985 INFO org.eclipse.jetty.util.log
- Started [email protected]:40815
14:05:07,698 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
14:05:07,750 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Accepting
14:05:07,960 INFO Remoting
- Starting remoting
14:05:09,241 INFO Remoting
- Remoting started; listening on addresses
:[akka.tcp://[email protected]:6123]
14:05:09,248 INFO org.apache.flink.runtime.jobmanager.JobManager
- Starting JobManager actor
14:05:09,597 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-9b7614f7-7d0d-4c5e-b4c6-911f0ab845ef
14:05:09,597 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:40000 - max concurrent requests: 50 - max
backlog: 1000
14:05:10,470 INFO org.apache.flink.runtime.jobmanager.JobManager
- Starting JobManager at akka.tcp://[email protected]:6123/user/jobmanager.
14:05:10,471 INFO org.apache.flink.runtime.jobmanager.MemoryArchivist
- Started memory archivist akka://flink/user/archive
14:05:10,563 INFO org.apache.flink.runtime.jobmanager.JobManager
- JobManager akka.tcp://[email protected]:6123/user/jobmanager was granted
leadership with leader session ID None.
14:05:10,593 INFO org.apache.flink.runtime.jobmanager.JobManager
- Starting JobManger web frontend
14:05:10,735 INFO org.apache.flink.runtime.jobmanager.web.WebInfoServer
- Setting up web info server, using web-root directory
jar:file:/home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT/lib/flink-dist-0.10-SNAPSHOT.jar!/web-docs-infoserver.
14:05:11,162 INFO org.eclipse.jetty.util.log
- jetty-0.10-SNAPSHOT
14:05:11,165 INFO org.eclipse.jetty.util.log
- Started [email protected]:8081
14:05:11,166 INFO org.apache.flink.runtime.jobmanager.web.WebInfoServer
- Started web info server for JobManager on 0.0.0.0:8081
14:05:14,936 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Declining offer(s) from slave 20151015-120419-16842879-5050-1244-S0
offered [cpus: 1.5 | mem : 488.0 | disk: 33044.0] required [cpus: 0.5 | mem:
512.0 | disk: 1024.0]
14:05:15,948 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- statusUpdate received from taskId: TaskManager_1 slaveId:
20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:15,948 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Lost taskManager with TaskId: TaskManager_1 on slave:
20151015-120419-16842879-5050-1244-S0
14:05:16,939 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Accepting
14:05:17,092 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- statusUpdate received from taskId: TaskManager_2 slaveId:
20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:17,092 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Lost taskManager with TaskId: TaskManager_2 on slave:
20151015-120419-16842879-5050-1244-S0
14:05:17,939 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Accepting
14:05:18,096 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- statusUpdate received from taskId: TaskManager_3 slaveId:
20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:18,096 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Lost taskManager with TaskId: TaskManager_3 on slave:
20151015-120419-16842879-5050-1244-S0
14:05:18,940 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Accepting
14:05:19,112 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- statusUpdate received from taskId: TaskManager_4 slaveId:
20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:19,113 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$
- Lost taskManager with TaskId: TaskManager_4 on slave:
20151015-120419-16842879-5050-1244-S0
.... this goes on forever? ...
```
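For reference, the one declined offer in the log above falls short only on memory (488.0 offered against 512.0 required). A minimal sketch of that comparison follows; the helper names `parse_resources` and `shortfalls` are made up for illustration and are not part of the pull request.

```python
import re

# Matches the resource entries in the scheduler's "Declining offer(s)" log line.
# The pattern tolerates the inconsistent spacing around the colon ("mem :" vs "mem:").
RESOURCE_RE = re.compile(r"(cpus|mem|disk)\s*:\s*([0-9.]+)")


def parse_resources(text):
    """Return a dict such as {'cpus': 1.5, 'mem': 488.0, 'disk': 33044.0}."""
    return {name: float(value) for name, value in RESOURCE_RE.findall(text)}


def shortfalls(line):
    """Return {resource: (offered, required)} for each resource that is too small."""
    offered_part, required_part = line.split("required", 1)
    offered = parse_resources(offered_part)
    required = parse_resources(required_part)
    return {
        name: (offered.get(name, 0.0), needed)
        for name, needed in required.items()
        if offered.get(name, 0.0) < needed
    }


# The log line from above, joined back into a single line.
line = (
    "Declining offer(s) from slave 20151015-120419-16842879-5050-1244-S0 "
    "offered [cpus: 1.5 | mem : 488.0 | disk: 33044.0] "
    "required [cpus: 0.5 | mem: 512.0 | disk: 1024.0]"
)
print(shortfalls(line))  # {'mem': (488.0, 512.0)}
```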
Mesos file `mesos-slave.WARNING`:
```
Log file created at: 2015/10/15 12:04:40
Running on machine: mesos
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W1015 12:04:40.464870 1310 slave.cpp:1934] Ignoring updating pid for
framework 20151007-005549-16842879-5050-1191-0001 because it does not exist
W1015 12:05:08.030145 1313 slave.cpp:1934] Ignoring updating pid for
framework 20151007-005549-16842879-5050-1191-0000 because it does not exist
E1015 14:05:14.378486 1312 slave.cpp:3112] Container
'74dc3694-16ec-470f-88c6-b06b7f295682' for executor 'executor_1' of framework
'20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs
for container '74dc3694-16ec-470f-88c6-b06b7f295682'with exit status: 256
E1015 14:05:15.768391 1315 slave.cpp:3461] Failed to unmonitor container
for executor executor_1 of framework 20151015-120419-16842879-5050-1244-0000:
Not monitored
W1015 14:05:15.851459 1312 containerizer.cpp:814] Ignoring update for
unknown container: 74dc3694-16ec-470f-88c6-b06b7f295682
E1015 14:05:16.989680 1307 slave.cpp:3112] Container
'2af2d3c0-e30c-4405-9ff1-7f4389bb62e9' for executor 'executor_2' of framework
'20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs
for container '2af2d3c0-e30c-4405-9ff1-7f4389bb62e9'with exit status: 256
E1015 14:05:17.090631 1312 slave.cpp:3461] Failed to unmonitor container
for executor executor_2 of framework 20151015-120419-16842879-5050-1244-0000:
Not monitored
W1015 14:05:17.091418 1305 containerizer.cpp:814] Ignoring update for
unknown container: 2af2d3c0-e30c-4405-9ff1-7f4389bb62e9
E1015 14:05:17.993669 1310 slave.cpp:3112] Container
'8cbc46f8-3200-4f9b-9134-099a0f6f3541' for executor 'executor_3' of framework
'20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs
for container '8cbc46f8-3200-4f9b-9134-099a0f6f3541'with exit status: 256
E1015 14:05:18.095177 1310 slave.cpp:3461] Failed to unmonitor container
for executor executor_3 of framework 20151015-120419-16842879-5050-1244-0000:
Not monitored
W1015 14:05:18.095211 1310 containerizer.cpp:814] Ignoring update for
unknown container: 8cbc46f8-3200-4f9b-9134-099a0f6f3541
E1015 14:05:19.006584 1305 slave.cpp:3112] Container
'aca9e80a-5a34-4c29-a123-f025dc4946fe' for executor 'executor_4' of framework
'20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs
for container 'aca9e80a-5a34-4c29-a123-f025dc4946fe'with exit status: 256
```
I cannot find any log files for the TaskManager.
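One place such logs might turn up is the executor sandbox on the slave. This is a rough sketch, assuming the slave follows the documented Mesos sandbox layout (`<work_dir>/slaves/<slave id>/frameworks/<framework id>/executors/<executor id>/runs/<container id>/`) and that the work directory is `/tmp/mesos`. That path is an assumption and was not verified on this VM, and the function name `find_sandbox_logs` is made up for illustration. If the fetcher fails before the sandbox is populated, the search may come back empty.

```python
import glob
import os


def find_sandbox_logs(work_dir, framework_id):
    """Return the stdout/stderr files of every executor run of one framework."""
    pattern = os.path.join(
        work_dir, "slaves", "*", "frameworks", framework_id,
        "executors", "*", "runs", "*", "std*",
    )
    # "latest" symlinks point at the same directories, so de-duplicate by real path.
    return sorted({os.path.realpath(path) for path in glob.glob(pattern)})


if __name__ == "__main__":
    # ASSUMPTION: /tmp/mesos is the slave's --work_dir; adjust to the actual setting.
    # The framework ID is the one from the registration line in the CLI output.
    for path in find_sandbox_logs("/tmp/mesos", "20151015-120419-16842879-5050-1244-0000"):
        print(path)
```

If anything is found, the `stderr` file of a failed run would be the first thing to read, since the slave log only reports that fetching the URIs failed, not why.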
> Integrate Flink with Apache Mesos
> ---------------------------------
>
> Key: FLINK-1984
> URL: https://issues.apache.org/jira/browse/FLINK-1984
> Project: Flink
> Issue Type: New Feature
> Components: New Components
> Reporter: Robert Metzger
> Priority: Minor
> Attachments: 251.patch
>
>
> There are some users asking for an integration of Flink into Mesos.
> There also is a pending pull request for adding Mesos support for Flink:
> https://github.com/apache/flink/pull/251
> But the PR is insufficiently tested. I'll add the code of the pull request to
> this JIRA in case somebody wants to pick it up in the future.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)