[
https://issues.apache.org/jira/browse/MESOS-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964889#comment-14964889
]
Serhey Novachenko commented on MESOS-3764:
------------------------------------------
OK, I figured this out: it is not a Mesos issue. I checked the Mesos master
logs, and the resources were indeed released:
{noformat}
I1019 20:07:58.113832 10500 master.cpp:5178] Updating the latest state of task
task-23-74a4f9a9-8952-4068-909f-22c3a220b62a of framework
20151017-130649-2466976940-5050-10486-0012 to TASK_FAILED
I1019 20:07:58.114332 10500 hierarchical.hpp:761] Recovered cpus(*):0.5;
mem(*):1024; ports(*):[31150-31150] (total: ports(*):[4000-7000, 31000-32000];
cpus(*):4; mem(*):13599; disk(*):199666, allocated: cpus(*):3.3; mem(*):8576;
ports(*):[31250-31250, 31350-31350, 4685-4685, 5940-5940]) on slave
20151017-130649-2466976940-5050-10486-S0 from framework
20151017-130649-2466976940-5050-10486-0012
I1019 20:07:58.165314 10502 master.cpp:5246] Removing task
task-23-74a4f9a9-8952-4068-909f-22c3a220b62a with resources cpus(*):0.5;
mem(*):1024; ports(*):[31150-31150] of framework
20151017-130649-2466976940-5050-10486-0012 on slave
20151017-130649-2466976940-5050-10486-S0 at slave(1)@1****8:5051 (ip****)
{noformat}
Then I looked through the offers my framework was receiving and noticed that
this slave was offering exactly one port, while the others appeared to be
occupied:
cpus:1.00 mem:4096.00 ports:[6810..6810]
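For anyone debugging something similar, here is a minimal sketch of the check I was effectively doing by eye: does an offer line like the one above contain the port the framework needs? The helper names (parse_port_ranges, offers_port) are made up for illustration and are not part of any Mesos API; the only assumption is that the ports appear as `ports:[a..b, c..d]` in the text.

```python
import re


def parse_port_ranges(text):
    """Extract (begin, end) port ranges from text like 'ports:[4000..7000, 31000..32000]'.

    Hypothetical helper for illustration; returns an empty list if no
    'ports:[...]' section is present.
    """
    match = re.search(r"ports:\[([^\]]*)\]", text)
    if match is None:
        return []
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", match.group(1))]


def offers_port(text, port):
    """Return True if any offered range in the text includes the given port."""
    return any(begin <= port <= end for begin, end in parse_port_ranges(text))


# The offer observed on the slave: only port 6810 is available.
observed = "cpus:1.00 mem:4096.00 ports:[6810..6810]"
print(offers_port(observed, 31150))  # False

# What the slave is configured to offer when nothing else holds the ports.
configured = "ports:[4000..7000, 31000..32000]"
print(offers_port(configured, 31150))  # True
```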
It turned out that another framework was not declining offers, so the
resources stayed with it and my framework never received the ones it needed.
After I killed that other framework, everything worked fine.
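As a rough sketch of what a well-behaved framework should do here: split incoming offers into the ones it can use and the ones it should hand back. The offer dictionaries and the split_offers function are hypothetical stand-ins, not the real Mesos protobuf types; in an actual scheduler the unusable offers would be returned with the driver's declineOffer call so other frameworks can get those resources.

```python
def split_offers(offers, needed_port):
    """Split offers into ids that contain needed_port and ids to decline.

    Each offer is a plain dict with an 'id' and a list of (begin, end)
    port ranges; this is an illustrative stand-in for a real Mesos offer.
    """
    usable, to_decline = [], []
    for offer in offers:
        has_port = any(begin <= needed_port <= end for begin, end in offer["ports"])
        (usable if has_port else to_decline).append(offer["id"])
    return usable, to_decline


offers = [
    {"id": "offer-1", "ports": [(6810, 6810)]},
    {"id": "offer-2", "ports": [(4000, 7000), (31000, 32000)]},
]

usable, to_decline = split_offers(offers, 31150)
print(usable)      # ['offer-2']
print(to_decline)  # ['offer-1']
```

Everything in to_decline should be declined promptly; a framework that just sits on offers it will not use is exactly what starved mine.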
Sorry, my bad.
> Port not re-offered after task dies
> -----------------------------------
>
> Key: MESOS-3764
> URL: https://issues.apache.org/jira/browse/MESOS-3764
> Project: Mesos
> Issue Type: Bug
> Components: scheduler driver
> Affects Versions: 0.23.0
> Reporter: Serhey Novachenko
>
> I have a Mesos framework configured to accept a specific port for tasks
> (31150 in my case), and the number of tasks equals the number of slaves, so
> basically I have one task running on each slave on port 31150.
> I have Mesos slaves configured to offer ports 4000..7000,31000..32000, and
> all tasks were running successfully until one of them threw an exception and
> died. The framework got the TASK_FAILED status update, and I expected the
> task to be relaunched on the same machine and port, but instead my framework
> reports that no offer contains port 31150. Is there a case where Mesos does
> not re-offer the port of a dead task?