Hi Benjamin,


I can't quite tell from the log snippet you provided. Assuming this is the only scheduler registered, it should receive offers for all the agents for the scheduler's roles (in this case, should just be the '*' role).


The framework I was talking about is the only framework in place. I can confirm that the role of the framework is ‘*’ (as shown on the framework’s page in the web UI).

Some reasons offers might not be sent:

-Framework doesn't have capability required to be offered the agent (e.g. scheduler doesn't have GPU_RESOURCES when the agent has GPUs).

The ‘capabilities’ field in the framework information (http://mster:5050/framework/<id>) is empty. I have attached the framework information JSON.

-Framework suppressed its role(s) (doesn't seem to be the case from the log snippet)
-The role has insufficient quota (e.g. if you have set a quota limit for that role, or if other roles have quota guarantees overcommitting the cluster)

I have not (knowingly) set any quotas.

-The agent's resources are reserved to a role.

I have not (knowingly) made any resource reservations. 
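For what it's worth, some of the checks above can be run mechanically against a saved copy of the master's /state output. The following is only a minimal sketch: the field names ("frameworks", "slaves", "role"/"roles", "capabilities", "active", "resources", "reserved_resources") are my assumption of how the state JSON is laid out and should be verified against the real file, the function name offer_blockers is made up, and the sample data is invented for illustration and not taken from the attachments. It does not look at quota or suppression, and it ignores hierarchical roles.

```python
def offer_blockers(state, framework_id):
    """List possible reasons a framework is not offered an agent's resources.

    `state` is assumed to be the parsed JSON of the master /state endpoint.
    """
    frameworks = state.get("frameworks", [])
    fw = next((f for f in frameworks if f.get("id") == framework_id), None)
    if fw is None:
        return ["framework is not listed in the master state"]

    findings = []
    if not fw.get("active", True):
        findings.append("framework is not active")

    # Assumption: single-role frameworks expose "role", multi-role ones "roles".
    roles = set(fw.get("roles") or [fw.get("role", "*")])
    capabilities = set(fw.get("capabilities", []))

    for agent in state.get("slaves", []):
        host = agent.get("hostname", agent.get("id", "?"))
        has_gpus = agent.get("resources", {}).get("gpus", 0)
        if has_gpus and "GPU_RESOURCES" not in capabilities:
            findings.append(f"{host}: has GPUs but framework lacks GPU_RESOURCES")
        # Assumption: "reserved_resources" is keyed by the reserving role.
        for role in agent.get("reserved_resources", {}):
            if role not in roles:
                findings.append(f"{host}: resources reserved to role '{role}'")
    return findings


# Invented sample data, only to show the shape the sketch expects.
sample_state = {
    "frameworks": [
        {"id": "fw-0000", "name": "example", "active": True,
         "role": "*", "capabilities": []},
    ],
    "slaves": [
        {"hostname": "agent-a",
         "resources": {"cpus": 8, "gpus": 2},
         "reserved_resources": {"batch": {"cpus": 4}}},
    ],
}

for finding in offer_blockers(sample_state, "fw-0000"):
    print(finding)
```

With a real file, one would load it with json.load() and pass the framework ID from the web UI. An empty result only means none of these particular checks fired, not that the setup is correct.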

Generally: What I have done is install Zookeeper and Mesos, do a basic configuration of master and agent, and start, in that order: Zookeeper, Mesos-Master, Mesos-Slave, the framework. There was nothing more going on.


Can you show us the scheduler code? Can you give us complete logs, along with results of the agent and master /state endpoints?

I have attached the two state info JSON files and the framework info JSON, as well as the logs from master and agent.

Thanks and best regards,
Ben 

Attachment: framework.json
Description: application/json

Attachment: mesos-master.mp-weizenbaum.iais.fraunhofer.de.root.log.INFO.20200607-131745.54624
Description: Binary data

Attachment: mesos-master.mp-weizenbaum.iais.fraunhofer.de.root.log.WARNING.20200607-131745.54624
Description: Binary data

Attachment: mesos-master.state.json
Description: application/json

Attachment: mesos-slave.node-01.root.log.INFO.20200607-131429.226876
Description: Binary data

Attachment: mesos-slave.state.json
Description: application/json



On Sat, Jun 6, 2020 at 8:00 AM Benjamin Wulff <benjamin.wulff...@ieee.org> wrote:
So with logging_level set to INFO (and master and slave restarted) I noticed in /var/log/mesos.INFO on the agent the following line:

I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for framework 2777de92-bc91-4e48-9960-bbab05694665-0000 because it does not exist

That is indeed the ID of the framework I’d like to run my task. In the web UI the framework is listed. So why is the agent saying that it doesn’t exist? What does this message mean?

On the master in /var/log/mesos-master.INFO the last relevant log lines (after that comes HTTP requests) are:

I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for /master/api/v1/scheduler from 172.30.0.8:41378 with User-Agent='python-requests/2.23.0'
I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request for HTTP framework 'Go-Docker'
I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework 'Go-Docker' with checkpointing disabled and capabilities [  ]
I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework 2777de92-bc91-4e48-9960-bbab05694665-0000 (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework 2777de92-bc91-4e48-9960-bbab05694665-0000 (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 2777de92-bc91-4e48-9960-bbab05694665-0000
I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 2777de92-bc91-4e48-9960-bbab05694665-0000
I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-0000
I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and cleared filters for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-0000
I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 2777de92-bc91-4e48-9960-bbab05694665-0000

From my novice perspective it seems the framework is registered.
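
To make the allocator sequence in the excerpt above easier to follow, here is a minimal sketch that groups the hierarchical.cpp lines by framework ID and lists the events in order. It assumes the log keeps the line format shown in the excerpt; the regular expression and the function name allocator_events are my own, not part of Mesos.

```python
import re
from collections import defaultdict

# Matches allocator lines of the form seen in the master log excerpt, e.g.
# "I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework <id>"
LINE = re.compile(
    r"^[IWEF]\d{4} [\d:.]+\s+\d+ hierarchical\.cpp:\d+\] "
    r"(Added|Activated|Deactivated|Suppressed|Unsuppressed)\b.*framework (\S+)"
)


def allocator_events(lines):
    """Return {framework_id: [event, ...]} in the order the lines appear."""
    events = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            events[match.group(2)].append(match.group(1))
    return dict(events)


# The five allocator lines from the master log quoted above.
log_lines = [
    "I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 2777de92-bc91-4e48-9960-bbab05694665-0000",
    "I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 2777de92-bc91-4e48-9960-bbab05694665-0000",
    "I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-0000",
    "I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and cleared filters for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-0000",
    "I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 2777de92-bc91-4e48-9960-bbab05694665-0000",
]

for framework_id, sequence in allocator_events(log_lines).items():
    print(framework_id, "->", ", ".join(sequence))
```

For these lines the last recorded event is "Activated", which matches my reading that the framework ends up registered and active. The sketch only orders the events and says nothing about why no offers follow.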

Thanks,
Ben


> On 6. Jun 2020, at 13:40, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
>
>
>
>
> You already put these on debug?
>
> [@ ]# cat /etc/mesos-master/logging_level
> WARNING
> [@ ]# cat /etc/mesos-slave/logging_level
> WARNING
>
>
>
>
> -----Original Message-----
> From: Benjamin Wulff [mailto:benjamin.wulff...@ieee.org]
> Sent: zaterdag 6 juni 2020 13:36
> To: user@mesos.apache.org
> Subject: No offers are being made -- how to debug Mesos?
>
> Hi all,
>
> I’m in the process of setting up my first Mesos cluster with 1x master
> and 3x slaves on CentOS 8.
>
> So far I set up Zookeeper and Mesos-master on the master and Mesos-slave on
> one of the compute nodes. Mesos-master communicates with ZK and becomes
> leader. Then I started mesos-slave on the compute node and can see in
> the log that it registers at the master with the correct resources
> reported. The agent and its resources are also displayed in the web UI
> of the master. So is the framework that I want to use.
>
> The crux is that no tasks I schedule in the framework are executed. And
> I suppose this is because the framework never receives an offer. I can
> see in the web UI that no offers are made and that all resources remain
> idle.
>
> Now, I’m new to Mesos and I don’t really have an idea how to debug my
> setup at this point.
>
> There is a page called ‘Debugging with the new CLI’ in the
> documentation but it only explains how to configure the CLI command.
>
> Any directions how to debug in my situation in general or on how to use
> the CLI for debugging would be highly welcome! :)
>
> Thanks and best regards,
> Ben
>
>
>

