Hi Felipe,

I am glad that you were able to fix the problem yourself.

> But I suppose that Mesos will allocate Slots and Task Managers
dynamically.
> Is that right?

Yes, that is the case since Flink 1.5 [1].

> Then I put the number of "mesos.resourcemanager.tasks.cpus" to be equal or
> less the available cores on a single node of the cluster. I am not sure
about
> this parameter, but only after this configuration it worked.

I would need to see JobManager and Mesos logs to understand why this
resolved
your issue. If you do not set mesos.resourcemanager.tasks.cpus explicitly,
Flink will request CPU resources equal to the number of TaskManager slots
(taskmanager.numberOfTaskSlots) [2]. Maybe this value was too high in your
configuration?

Best,
Gary


[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
[2]
https://github.com/apache/flink/blob/0a405251b297109fde1f9a155eff14be4d943887/flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosTaskManagerParameters.java#L344

On Tue, Sep 10, 2019 at 10:41 AM Felipe Gutierrez <
felipe.o.gutier...@gmail.com> wrote:

> I managed to find what was going wrong. I will write here just for the
> record.
>
> First, the master machine was not login automatically at itself. So I had
> to give permission for it.
>
> chmod og-wx ~/.ssh/authorized_keys
> chmod 750 $HOME
>
> Then I put the number of "mesos.resourcemanager.tasks.cpus" to be equal or
> less the available cores on a single node of the cluster. I am not sure
> about this parameter, but only after this configuration it worked.
>
> Felipe
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipeogutierrez.blogspot.com>*
>
>
> On Fri, Sep 6, 2019 at 10:36 AM Felipe Gutierrez <
> felipe.o.gutier...@gmail.com> wrote:
>
>> Hi,
>>
>> I am running Mesos without DC/OS [1] and Flink on it. Whe I start my
>> cluster I receive some messages that I suppose everything was started.
>> However, I see 0 slats available on the Flink web dashboard. But I suppose
>> that Mesos will allocate Slots and Task Managers dynamically. Is that right?
>>
>> $ ./bin/mesos-appmaster.sh &
>> [1] 16723
>> flink@r03:~/flink-1.9.0$ I0906 10:22:45.080328 16943 sched.cpp:239]
>> Version: 1.9.0
>> I0906 10:22:45.082672 16996 sched.cpp:343] New master detected at
>> mas...@xxx.xxx.xxx.xxx:5050
>> I0906 10:22:45.083276 16996 sched.cpp:363] No credentials provided.
>> Attempting to register without authentication
>> I0906 10:22:45.086840 16997 sched.cpp:751] Framework registered with
>> 22f6a553-e8ac-42d4-9a90-96a8d5f002f0-0003
>>
>> Then I deploy my Flink application. When I use the first command to
>> deploy the application starts. However, the tasks remain CREATED until
>> Flink throws a timeout exception. In other words, it never turns to RUNNING.
>> When I use the second comman to deploy the application it does not start
>> and I receive the exception of "Could not allocate all requires slots
>> within timeout of 300000 ms. Slots required: 2". The full stacktrace is
>> below.
>>
>> $ /home/flink/flink-1.9.0/bin/flink run
>> /home/flink/hello-flink-mesos/target/hello-flink-mesos.jar &
>> $ ./bin/mesos-appmaster-job.sh run
>> /home/flink/hello-flink-mesos/target/hello-flink-mesos.jar &
>>
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/mesos.html#mesos-without-dcos
>> ps.: my application runs normally on a standalone Flink cluster.
>>
>> ------------------------------------------------------------
>>  The program finished with the following exception:
>>
>> org.apache.flink.client.program.ProgramInvocationException: Job failed.
>> (JobID: 7ad8d71faaceb1ac469353452c43dc2a)
>> at
>> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:262)
>> at
>> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:338)
>> at
>> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:60)
>> at
>> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1507)
>> at org.hello_flink_mesos.App.<init>(App.java:35)
>> at org.hello_flink_mesos.App.main(App.java:285)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576)
>> at
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438)
>> at
>> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
>> at
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
>> at
>> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
>> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
>> at
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
>> at
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
>> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job
>> execution failed.
>> at
>> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>> at
>> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:259)
>> ... 22 more
>> Caused by:
>> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
>> Could not allocate all requires slots within timeout of 300000 ms. Slots
>> required: 2, slots allocated: 0, previous allocation IDs: [], execution
>> status: completed exceptionally: java.util.concurrent.CompletionException:
>> java.util.concurrent.CompletionException:
>> java.util.concurrent.TimeoutException/java.util.concurrent.CompletableFuture@b520de3[Completed
>> exceptionally], incomplete: 
>> java.util.concurrent.CompletableFuture@36f3d30c[Not
>> completed, 1 dependents]
>> at
>> org.apache.flink.runtime.executiongraph.SchedulingUtils.lambda$scheduleEager$1(SchedulingUtils.java:194)
>> at
>> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>> at
>> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> at
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> at
>> org.apache.flink.runtime.concurrent.FutureUtils$ResultConjunctFuture.handleCompletedFuture(FutureUtils.java:633)
>> at
>> org.apache.flink.runtime.concurrent.FutureUtils$ResultConjunctFuture.lambda$new$0(FutureUtils.java:656)
>> at
>> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>> at
>> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> at
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> at
>> org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl.lambda$internalAllocateSlot$0(SchedulerImpl.java:190)
>> at
>> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>> at
>> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> at
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> at
>> org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$SingleTaskSlot.release(SlotSharingManager.java:700)
>> at
>> org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.release(SlotSharingManager.java:484)
>> at
>> org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.lambda$new$0(SlotSharingManager.java:380)
>> at
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>> at
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> at
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> at
>> org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:998)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
>> at
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> Thanks,
>> Felipe
>> *--*
>> *-- Felipe Gutierrez*
>>
>> *-- skype: felipe.o.gutierrez*
>> *--* *https://felipeogutierrez.blogspot.com
>> <https://felipeogutierrez.blogspot.com>*
>>
>

Reply via email to