Re: No offers are being made -- how to debug Mesos?

2020-06-07 Thread Benjamin Wulff
Turns out I had to configure the framework I want to use to do exactly what 
the mesos-execute command did: add GPU_RESOURCES to its capability list. Now 
resources are offered to the framework and tasks are run. :)
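
For anyone hitting the same problem, here is a minimal sketch of what "adding GPU_RESOURCES to the capability list" can look like for a framework that speaks the v1 scheduler HTTP API. It only builds the JSON body of a SUBSCRIBE call and does not contact a master. The helper name `build_subscribe_call` and the user `root` are hypothetical; only the framework name and the GPU_RESOURCES capability come from this thread.

```python
import json


def build_subscribe_call(name, user, capabilities):
    """Build the JSON body of a v1 scheduler API SUBSCRIBE call.

    Each capability is sent as an object of the form {"type": "<NAME>"}.
    """
    return {
        "type": "SUBSCRIBE",
        "subscribe": {
            "framework_info": {
                "name": name,
                "user": user,
                "capabilities": [{"type": cap} for cap in capabilities],
            }
        },
    }


# Hypothetical values for illustration; the user "root" is an assumption.
call = build_subscribe_call("Go-Docker", "root", ["GPU_RESOURCES"])
body = json.dumps(call, indent=2)
print(body)
```

The resulting body would be sent as application/json to the scheduler endpoint that shows up in the master log (/master/api/v1/scheduler). How a given framework exposes this setting in its own configuration depends on that framework.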

Thanks,
Ben


> On 7. Jun 2020, at 15:01, Benjamin Wulff  wrote:
> 
> Hi all,
> 
> I found the gpu-support site in the docs (1) and tried the following command:
> 
> # mesos-execute --master=129.26.78.161:5050 --name=gpu-test 
> --command="nvidia-smi" --framework_capabilities="GPU_RESOURCES" 
> --resources="gpus:1"
> 
> ..and that gave the following output:
> 
> I0607 14:57:41.897706 56361 scheduler.cpp:189] Version: 1.9.0
> I0607 14:57:41.913520 56361 scheduler.cpp:342] Using default 'basic' HTTP 
> authenticatee
> I0607 14:57:41.913813 56367 scheduler.cpp:525] New master detected at 
> master@129.26.78.161:5050
> Subscribed with ID f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
> Submitted task 'gpu-test' to agent 'f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0'
> Received status update TASK_STARTING for task 'gpu-test'
>   source: SOURCE_EXECUTOR
> Received status update TASK_RUNNING for task 'gpu-test'
>   source: SOURCE_EXECUTOR
> Received status update TASK_FINISHED for task 'gpu-test'
>   message: 'Command exited with status 0'
>   source: SOURCE_EXECUTOR
> 
> I did not see the output of nvidia-smi as I should have according to the 
> documentation.
> 
> I have attached the logs of master and agent.
> 
> Thanks,
> Ben
> 
> 
> 
> 
>> On 7. Jun 2020, at 02:43, Benjamin Mahler wrote:
>> 
>> Don't worry about that "Ignoring" message on the agent. When the framework 
>> information is updated, the master broadcasts it to the agents, and in this 
>> case the agent doesn't know about the framework since it has no tasks for 
>> it, and so it ignores the updated information.
>> 
>> I can't quite tell from the log snippet you provided. Assuming this is the 
>> only scheduler registered, it should receive offers for all the agents for 
>> the scheduler's roles (in this case, should just be the '*' role).
>> 
>> Some reasons offers might not be sent:
>> 
>> -Framework doesn't have capability required to be offered the agent (e.g. 
>> scheduler doesn't have GPU_RESOURCES when the agent has GPUs).
>> -Framework suppressed its role(s) (doesn't seem to be the case from the log 
>> snippet)
>> -The role has insufficient quota (e.g. if you have set a quota limit for 
>> that role, or if other roles have quota guarantees overcommitting the 
>> cluster)
>> -The agent's resources are reserved to a role.
>> 
>> Can you show us the scheduler code? Can you give us complete logs, along 
>> with results of the agent and master /state endpoints?
>> 
>> 
>> On Sat, Jun 6, 2020 at 8:00 AM Benjamin Wulff wrote:
>> So with logging_level set to INFO (and master and slave restarted) I noticed 
>> in /var/log/mesos.INFO on the agent the following line:
>> 
>> I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for 
>> framework 2777de92-bc91-4e48-9960-bbab05694665- because it does not exist
>> 
>> That is indeed the ID of the framework I’d like to run my task with. In the 
>> web UI the framework is listed. So why is the agent saying that it doesn’t 
>> exist? What does this message mean?
>> 
>> On the master in /var/log/mesos-master.INFO the last relevant log lines 
>> (after that come HTTP requests) are:
>> 
>> I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for 
>> /master/api/v1/scheduler from 172.30.0.8:41378  
>> with User-Agent='python-requests/2.23.0'
>> I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request 
>> for HTTP framework 'Go-Docker'
>> I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework 
>> 'Go-Docker' with checkpointing disabled and capabilities [  ]
>> I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework 
>> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } 
>> suppressed
>> I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework 
>> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } 
>> suppressed
>> I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for 
>> roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and 
>> cleared filters for roles {  } of framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> 
>> From my novice perspective it seems the framework is registered..
>> 
>> Thanks,
>> Ben
>> 
>> 

Re: No offers are being made -- how to debug Mesos?

2020-06-07 Thread Benjamin Wulff
Hi all,

a correction:

I saw the correct output of nvidia-smi in the stdout file in the task's work dir 
on the agent (that was the piece I didn’t get, reading helps!).

So I have to see why the framework doesn’t receive any offers.

Thanks,
Ben
 

> On 7. Jun 2020, at 15:01, Benjamin Wulff  wrote:
> 
> Hi all,
> 
> I found the gpu-support site in the docs (1) and tried the following command:
> 
> # mesos-execute --master=129.26.78.161:5050 --name=gpu-test 
> --command="nvidia-smi" --framework_capabilities="GPU_RESOURCES" 
> --resources="gpus:1"
> 
> ..and that gave the following output:
> 
> I0607 14:57:41.897706 56361 scheduler.cpp:189] Version: 1.9.0
> I0607 14:57:41.913520 56361 scheduler.cpp:342] Using default 'basic' HTTP 
> authenticatee
> I0607 14:57:41.913813 56367 scheduler.cpp:525] New master detected at 
> master@129.26.78.161:5050
> Subscribed with ID f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
> Submitted task 'gpu-test' to agent 'f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0'
> Received status update TASK_STARTING for task 'gpu-test'
>   source: SOURCE_EXECUTOR
> Received status update TASK_RUNNING for task 'gpu-test'
>   source: SOURCE_EXECUTOR
> Received status update TASK_FINISHED for task 'gpu-test'
>   message: 'Command exited with status 0'
>   source: SOURCE_EXECUTOR
> 
> I did not see the output of nvidia-smi as I should have according to the 
> documentation.
> 
> I have attached the logs of master and agent.
> 
> Thanks,
> Ben
> 
> 
> 
> 
>> On 7. Jun 2020, at 02:43, Benjamin Mahler wrote:
>> 
>> Don't worry about that "Ignoring" message on the agent. When the framework 
>> information is updated, the master broadcasts it to the agents, and in this 
>> case the agent doesn't know about the framework since it has no tasks for 
>> it, and so it ignores the updated information.
>> 
>> I can't quite tell from the log snippet you provided. Assuming this is the 
>> only scheduler registered, it should receive offers for all the agents for 
>> the scheduler's roles (in this case, should just be the '*' role).
>> 
>> Some reasons offers might not be sent:
>> 
>> -Framework doesn't have capability required to be offered the agent (e.g. 
>> scheduler doesn't have GPU_RESOURCES when the agent has GPUs).
>> -Framework suppressed its role(s) (doesn't seem to be the case from the log 
>> snippet)
>> -The role has insufficient quota (e.g. if you have set a quota limit for 
>> that role, or if other roles have quota guarantees overcommitting the 
>> cluster)
>> -The agent's resources are reserved to a role.
>> 
>> Can you show us the scheduler code? Can you give us complete logs, along 
>> with results of the agent and master /state endpoints?
>> 
>> 
>> On Sat, Jun 6, 2020 at 8:00 AM Benjamin Wulff wrote:
>> So with logging_level set to INFO (and master and slave restarted) I noticed 
>> in /var/log/mesos.INFO on the agent the following line:
>> 
>> I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for 
>> framework 2777de92-bc91-4e48-9960-bbab05694665- because it does not exist
>> 
>> That is indeed the ID of the framework I’d like to run my task with. In the 
>> web UI the framework is listed. So why is the agent saying that it doesn’t 
>> exist? What does this message mean?
>> 
>> On the master in /var/log/mesos-master.INFO the last relevant log lines 
>> (after that come HTTP requests) are:
>> 
>> I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for 
>> /master/api/v1/scheduler from 172.30.0.8:41378  
>> with User-Agent='python-requests/2.23.0'
>> I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request 
>> for HTTP framework 'Go-Docker'
>> I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework 
>> 'Go-Docker' with checkpointing disabled and capabilities [  ]
>> I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework 
>> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } 
>> suppressed
>> I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework 
>> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } 
>> suppressed
>> I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for 
>> roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and 
>> cleared filters for roles {  } of framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 
>> 2777de92-bc91-4e48-9960-bbab05694665-
>> 
>> From my novice perspective it seems the framework is registered..
>> 
>> Thanks,
>> Ben

Re: No offers are being made -- how to debug Mesos?

2020-06-07 Thread Benjamin Wulff
Hi all,

I found the gpu-support site in the docs (1) and tried the following command:

# mesos-execute --master=129.26.78.161:5050 --name=gpu-test --command="nvidia-smi" --framework_capabilities="GPU_RESOURCES" --resources="gpus:1"

..and that gave the following output:

I0607 14:57:41.897706 56361 scheduler.cpp:189] Version: 1.9.0
I0607 14:57:41.913520 56361 scheduler.cpp:342] Using default 'basic' HTTP authenticatee
I0607 14:57:41.913813 56367 scheduler.cpp:525] New master detected at master@129.26.78.161:5050
Subscribed with ID f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
Submitted task 'gpu-test' to agent 'f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0'
Received status update TASK_STARTING for task 'gpu-test'
  source: SOURCE_EXECUTOR
Received status update TASK_RUNNING for task 'gpu-test'
  source: SOURCE_EXECUTOR
Received status update TASK_FINISHED for task 'gpu-test'
  message: 'Command exited with status 0'
  source: SOURCE_EXECUTOR

I did not see the output of nvidia-smi as I should have according to the documentation.

I have attached the logs of master and agent.

Thanks,
Ben

I0607 14:57:41.926017 54652 http.cpp:1115] HTTP POST for 
/master/api/v1/scheduler from 129.26.78.161:45512
I0607 14:57:41.927448 54652 master.cpp:2670] Received subscription request for 
HTTP framework 'mesos-execute instance'
I0607 14:57:41.927587 54652 master.cpp:2742] Subscribing framework 
'mesos-execute instance' with checkpointing disabled and capabilities [ 
RESERVATION_REFINEMENT, TASK_KILLING_STATE, REVOCABLE_RESOURCES, 
PARTITION_AWARE, GPU_RESOURCES ]
I0607 14:57:41.928453 54652 master.cpp:10847] Adding framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance) with roles { 
 } suppressed
I0607 14:57:41.928616 54651 hierarchical.cpp:605] Added framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
I0607 14:57:41.929093 54655 master.cpp:10432] Sending offers [ 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-O4 ] to framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance)
I0607 14:57:41.930851 54657 http.cpp:1115] HTTP POST for 
/master/api/v1/scheduler from 129.26.78.161:45510
I0607 14:57:41.931084 54657 master.cpp:12724] Removing offer 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-O4
I0607 14:57:41.931234 54657 master.cpp:4741] Processing ACCEPT call for offers: 
[ f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-O4 ] on agent 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0 at slave(1)@10.116.24.18:5051 (node-01) 
for framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance)
I0607 14:57:41.931506 54657 master.cpp:4302] Adding task gpu-test with 
resources gpus(allocated: *):1 of framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance) on agent 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0 at slave(1)@10.116.24.18:5051 (node-01)
I0607 14:57:41.931610 54657 master.cpp:5720] Launching task gpu-test of 
framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance) 
with resources 
[{"allocation_info":{"role":"*"},"name":"gpus","scalar":{"value":1.0},"type":"SCALAR"}]
 on agent f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0 at slave(1)@10.116.24.18:5051 
(node-01) on  new executor
I0607 14:57:42.169665 54640 master.cpp:8985] Status update TASK_STARTING 
(Status UUID: e483ba5a-afe1-4306-b994-accbfedaae52) for task gpu-test of 
framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 from agent 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0 at slave(1)@10.116.24.18:5051 (node-01)
I0607 14:57:42.169793 54640 master.cpp:9042] Forwarding status update 
TASK_STARTING (Status UUID: e483ba5a-afe1-4306-b994-accbfedaae52) for task 
gpu-test of framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
I0607 14:57:42.169997 54640 master.cpp:12073] Updating the state of task 
gpu-test of framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (latest state: 
TASK_STARTING, status update state: TASK_STARTING)
I0607 14:57:42.211457 54642 http.cpp:1115] HTTP POST for 
/master/api/v1/scheduler from 129.26.78.161:45510
I0607 14:57:42.211583 54642 master.cpp:6695] Processing ACKNOWLEDGE call for 
status e483ba5a-afe1-4306-b994-accbfedaae52 for task gpu-test of framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (mesos-execute instance) on agent 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0
I0607 14:57:42.212878 54659 master.cpp:8985] Status update TASK_RUNNING (Status 
UUID: a870507f-82a6-4389-ac3c-064b386abfcf) for task gpu-test of framework 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 from agent 
f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-S0 at slave(1)@10.116.24.18:5051 (node-01)
I0607 14:57:42.212954 54659 master.cpp:9042] Forwarding status update 
TASK_RUNNING (Status UUID: a870507f-82a6-4389-ac3c-064b386abfcf) for task 
gpu-test of framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005
I0607 14:57:42.213131 54659 master.cpp:12073] Updating the state of task 
gpu-test of framework f2e21b9a-3bb6-4f40-bfe3-3ec4f8eda64a-0005 (latest state: 
TASK_RUNNING, status update state: TASK_RUNNING)
I0607 14:57:42.254524 54651 http.cpp:1115] HTTP POST for 

Re: No offers are being made -- how to debug Mesos?

2020-06-07 Thread Benjamin Wulff
Hi Benjamin,

> I can't quite tell from the log snippet you provided. Assuming this is the only scheduler registered, it should receive offers for all the agents for the scheduler's roles (in this case, should just be the '*' role).

The framework I was talking about is the only framework in place. I can confirm that the role of the framework is ‘*’ (framework’s page in the web UI).

> Some reasons offers might not be sent:
>
> -Framework doesn't have capability required to be offered the agent (e.g. scheduler doesn't have GPU_RESOURCES when the agent has GPUs).

The ‘capabilities’ field in the framework information (http://mster:5050/framework/) is empty. I have attached the framework information JSON.

> -Framework suppressed its role(s) (doesn't seem to be the case from the log snippet)
> -The role has insufficient quota (e.g. if you have set a quota limit for that role, or if other roles have quota guarantees overcommitting the cluster)

I have not (knowingly) set any quotas.

> -The agent's resources are reserved to a role.

I have not (knowingly) made any resource reservations.

Generally: what I have done is install Zookeeper and Mesos, do a basic configuration of master and agent, and start, in that order: Zookeeper, Mesos-Master, Mesos-Slave, the framework. There was nothing more going on.

> Can you show us the scheduler code? Can you give us complete logs, along with results of the agent and master /state endpoints?

I have attached the two state info JSON files and the framework info JSON as well as the logs from master and agent.

Thanks and best regards,
Ben

framework.json
Description: application/json


mesos-master.mp-weizenbaum.iais.fraunhofer.de.root.log.INFO.20200607-131745.54624
Description: Binary data


mesos-master.mp-weizenbaum.iais.fraunhofer.de.root.log.WARNING.20200607-131745.54624
Description: Binary data


mesos-master.state.json
Description: application/json


mesos-slave.node-01.root.log.INFO.20200607-131429.226876
Description: Binary data


mesos-slave.state.json
Description: application/json

On Sat, Jun 6, 2020 at 8:00 AM Benjamin Wulff wrote:

So with logging_level set to INFO (and master and slave restarted) I noticed in /var/log/mesos.INFO on the agent the following line:

I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for framework 2777de92-bc91-4e48-9960-bbab05694665- because it does not exist

That is indeed the ID of the framework I’d like to run my task with. In the web UI the framework is listed. So why is the agent saying that it doesn’t exist? What does this message mean?

On the master in /var/log/mesos-master.INFO the last relevant log lines (after that come HTTP requests) are:

I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for /master/api/v1/scheduler from 172.30.0.8:41378 with User-Agent='python-requests/2.23.0'
I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request for HTTP framework 'Go-Docker'
I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework 'Go-Docker' with checkpointing disabled and capabilities [  ]
I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and cleared filters for roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 2777de92-bc91-4e48-9960-bbab05694665-

From my novice perspective it seems the framework is registered..

Thanks,
Ben


> On 6. Jun 2020, at 13:40, Marc Roos  wrote:
> 
> 
> 
> 
> You already put these on debug?
> 
> [@ ]# cat /etc/mesos-master/logging_level
> WARNING
> [@ ]# cat /etc/mesos-slave/logging_level
> WARNING
> 
> 
> 
> 
> -Original Message-
> From: Benjamin Wulff [mailto:benjamin.wulff...@ieee.org] 
> Sent: zaterdag 6 juni 2020 13:36
> To: user@mesos.apache.org
> Subject: No offers are being made -- how to debug Mesos?
> 
> Hi all,
> 
> I’m in the process of setting up my first Mesos cluster with 1x master 
> and 3x slaves on CentOS 8.
> 
> So far I have set up ZooKeeper and Mesos-master on the master and Mesos-slave 
> on one of the compute nodes. Mesos-master communicates with ZK and becomes 
> leader. Then I started mesos-slave on the compute node and can see in 
> the log that it registers at the master with the correct resources 
> reported. The agent and its resources are also 

Re: No offers are being made -- how to debug Mesos?

2020-06-06 Thread Benjamin Mahler
Don't worry about that "Ignoring" message on the agent. When the framework
information is updated, the master broadcasts it to the agents, and in this
case the agent doesn't know about the framework since it has no tasks for
it, and so it ignores the updated information.

I can't quite tell from the log snippet you provided. Assuming this is the
only scheduler registered, it should receive offers for all the agents for
the scheduler's roles (in this case, should just be the '*' role).

Some reasons offers might not be sent:

-Framework doesn't have capability required to be offered the agent (e.g.
scheduler doesn't have GPU_RESOURCES when the agent has GPUs).
-Framework suppressed its role(s) (doesn't seem to be the case from the log
snippet)
-The role has insufficient quota (e.g. if you have set a quota limit for
that role, or if other roles have quota guarantees overcommitting the
cluster)
-The agent's resources are reserved to a role.

Can you show us the scheduler code? Can you give us complete logs, along
with results of the agent and master /state endpoints?
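
The first reason in the list above can be checked mechanically once the master's /state output has been saved and parsed. The following is a rough sketch, not an official tool: the function name is hypothetical, and the field names it reads ("slaves", "frameworks", "capabilities", "resources", "gpus") are assumptions about the layout of the /state JSON that should be verified against real output. The sample data is hand-written and is not taken from the attached files.

```python
def frameworks_missing_gpu_capability(state):
    """Return names of frameworks lacking GPU_RESOURCES while some agent has GPUs."""
    agents = state.get("slaves", [])
    has_gpu_agent = any(
        agent.get("resources", {}).get("gpus", 0) > 0 for agent in agents
    )
    if not has_gpu_agent:
        # Without GPU agents the capability cannot be the reason for missing offers.
        return []
    return [
        fw.get("name", fw.get("id", "<unknown>"))
        for fw in state.get("frameworks", [])
        if "GPU_RESOURCES" not in fw.get("capabilities", [])
    ]


# Hand-written sample in the assumed shape, not real cluster output.
sample_state = {
    "slaves": [{"hostname": "node-01", "resources": {"cpus": 8, "gpus": 1}}],
    "frameworks": [
        {"name": "Go-Docker", "capabilities": []},
        {"name": "mesos-execute instance", "capabilities": ["GPU_RESOURCES"]},
    ],
}
print(frameworks_missing_gpu_capability(sample_state))  # prints ['Go-Docker']
```

An empty result does not rule out the other reasons in the list (suppressed roles, quota, reservations); those need separate checks.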


On Sat, Jun 6, 2020 at 8:00 AM Benjamin Wulff 
wrote:

> So with logging_level set to INFO (and master and slave restarted) I
> noticed in /var/log/mesos.INFO on the agent the following line:
>
> I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for
> framework 2777de92-bc91-4e48-9960-bbab05694665- because it does not
> exist
>
> That is indeed the ID of the framework I’d like to run my task with. In the
> web UI the framework is listed. So why is the agent saying that it doesn’t
> exist? What does this message mean?
>
> On the master in /var/log/mesos-master.INFO the last relevant log lines
> (after that come HTTP requests) are:
>
> I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for
> /master/api/v1/scheduler from 172.30.0.8:41378 with
> User-Agent='python-requests/2.23.0'
> I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request
> for HTTP framework 'Go-Docker'
> I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework
> 'Go-Docker' with checkpointing disabled and capabilities [  ]
> I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework
> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  }
> suppressed
> I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework
> 2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  }
> suppressed
> I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework
> 2777de92-bc91-4e48-9960-bbab05694665-
> I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework
> 2777de92-bc91-4e48-9960-bbab05694665-
> I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for
> roles {  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
> I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and
> cleared filters for roles {  } of framework
> 2777de92-bc91-4e48-9960-bbab05694665-
> I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework
> 2777de92-bc91-4e48-9960-bbab05694665-
>
> From my novice perspective it seems the framework is registered..
>
> Thanks,
> Ben
>
>
> > On 6. Jun 2020, at 13:40, Marc Roos  wrote:
> >
> >
> >
> >
> > You already put these on debug?
> >
> > [@ ]# cat /etc/mesos-master/logging_level
> > WARNING
> > [@ ]# cat /etc/mesos-slave/logging_level
> > WARNING
> >
> >
> >
> >
> > -Original Message-
> > From: Benjamin Wulff [mailto:benjamin.wulff...@ieee.org]
> > Sent: zaterdag 6 juni 2020 13:36
> > To: user@mesos.apache.org
> > Subject: No offers are being made -- how to debug Mesos?
> >
> > Hi all,
> >
> > I’m in the process of setting up my first Mesos cluster with 1x master
> > and 3x slaves on CentOS 8.
> >
> > So far I have set up ZooKeeper and Mesos-master on the master and Mesos-slave
> > on one of the compute nodes. Mesos-master communicates with ZK and becomes
> > leader. Then I started mesos-slave on the compute node and can see in
> > the log that it registers at the master with the correct resources
> > reported. The agent and its resources are also displayed in the web UI
> > of the master. So is the framework that I want to use.
> >
> > The crux is that no tasks I schedule in the framework are executed. And
> > I suppose this is because the framework never receives an offer. I can
> > see in the web UI that no offers are made and that all resources remain
> > idle.
> >
> > Now, I’m new to Mesos and I don’t really have an idea how to debug my
> > setup at this point.
> >
> > There is a page called ‘Debugging with the new CLI’ in the
> > documentation but it only explains how to configure  the CLI command.
> >
> > Any directions how to debug in my situation in general or on how to use
> > the CLI for debugging would be highly welcome! :)
> >
> > Thanks and best regards,
> > Ben
> >
> >
> >
>
>


Re: No offers are being made -- how to debug Mesos?

2020-06-06 Thread Benjamin Wulff
So with logging_level set to INFO (and master and slave restarted) I noticed in 
/var/log/mesos.INFO on the agent the following line:

I0606 13:46:41.393455 206117 slave.cpp:4222] Ignoring info update for framework 
2777de92-bc91-4e48-9960-bbab05694665- because it does not exist

That is indeed the ID of the framework I’d like to run my task with. In the web 
UI the framework is listed. So why is the agent saying that it doesn’t exist? 
What does this message mean?

On the master in /var/log/mesos-master.INFO the last relevant log lines (after 
that come HTTP requests) are:

I0606 13:50:11.710996 52025 http.cpp:1115] HTTP POST for 
/master/api/v1/scheduler from 172.30.0.8:41378 with 
User-Agent='python-requests/2.23.0'
I0606 13:50:11.720903 52025 master.cpp:2670] Received subscription request for 
HTTP framework 'Go-Docker'
I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework 'Go-Docker' 
with checkpointing disabled and capabilities [  ]
I0606 13:50:11.721993 52025 master.cpp:10847] Adding framework 
2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722084 52025 master.cpp:8300] Updating framework 
2777de92-bc91-4e48-9960-bbab05694665- (Go-Docker) with roles {  } suppressed
I0606 13:50:11.722514 52030 hierarchical.cpp:605] Added framework 
2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722573 52030 hierarchical.cpp:711] Deactivated framework 
2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722625 52030 hierarchical.cpp:1552] Suppressed offers for roles 
{  } of framework 2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722657 52030 hierarchical.cpp:1592] Unsuppressed offers and 
cleared filters for roles {  } of framework 
2777de92-bc91-4e48-9960-bbab05694665-
I0606 13:50:11.722703 52030 hierarchical.cpp:681] Activated framework 
2777de92-bc91-4e48-9960-bbab05694665-

From my novice perspective it seems the framework is registered..
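
The "Subscribing framework" line in the master log above already shows which capabilities a framework registered with (here an empty list). A small sketch for pulling that out of a log file follows. It assumes each log entry sits on a single line (the entries above are wrapped by the mail client), and the regular expression is derived only from the lines quoted in this thread, so it may not match other Mesos versions. The function name is hypothetical.

```python
import re

# Pattern based on the "Subscribing framework" lines quoted in this thread.
SUBSCRIBE_RE = re.compile(
    r"Subscribing framework '(?P<name>[^']*)' .*capabilities \[(?P<caps>[^\]]*)\]"
)


def parse_subscription(line):
    """Return (framework name, capability list), or None if the line does not match."""
    match = SUBSCRIBE_RE.search(line)
    if match is None:
        return None
    caps = [c.strip() for c in match.group("caps").split(",") if c.strip()]
    return match.group("name"), caps


line = (
    "I0606 13:50:11.721153 52025 master.cpp:2742] Subscribing framework "
    "'Go-Docker' with checkpointing disabled and capabilities [  ]"
)
print(parse_subscription(line))  # prints ('Go-Docker', [])
```

A framework that should receive offers from a GPU agent would be expected to show GPU_RESOURCES inside the brackets.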

Thanks,
Ben


> On 6. Jun 2020, at 13:40, Marc Roos  wrote:
> 
> 
> 
> 
> You already put these on debug?
> 
> [@ ]# cat /etc/mesos-master/logging_level
> WARNING
> [@ ]# cat /etc/mesos-slave/logging_level
> WARNING
> 
> 
> 
> 
> -Original Message-
> From: Benjamin Wulff [mailto:benjamin.wulff...@ieee.org] 
> Sent: zaterdag 6 juni 2020 13:36
> To: user@mesos.apache.org
> Subject: No offers are being made -- how to debug Mesos?
> 
> Hi all,
> 
> I’m in the process of setting up my first Mesos cluster with 1x master 
> and 3x slaves on CentOS 8.
> 
> So far I have set up ZooKeeper and Mesos-master on the master and Mesos-slave 
> on one of the compute nodes. Mesos-master communicates with ZK and becomes 
> leader. Then I started mesos-slave on the compute node and can see in 
> the log that it registers at the master with the correct resources 
> reported. The agent and its resources are also displayed in the web UI 
> of the master. So is the framework that I want to use.
> 
> The crux is that no tasks I schedule in the framework are executed. And 
> I suppose this is because the framework never receives an offer. I can 
> see in the web UI that no offers are made and that all resources remain 
> idle.
> 
> Now, I’m new to Mesos and I don’t really have an idea how to debug my 
> setup at this point. 
> 
> There is a page called ‘Debugging with the new CLI’ in the 
> documentation but it only explains how to configure  the CLI command. 
> 
> Any directions how to debug in my situation in general or on how to use 
> the CLI for debugging would be highly welcome! :)
> 
> Thanks and best regards,
> Ben
> 
> 
> 



RE: No offers are being made -- how to debug Mesos?

2020-06-06 Thread Marc Roos
 

 
Have you already set these to debug?

[@ ]# cat /etc/mesos-master/logging_level
WARNING
[@ ]# cat /etc/mesos-slave/logging_level
WARNING




-Original Message-
From: Benjamin Wulff [mailto:benjamin.wulff...@ieee.org] 
Sent: zaterdag 6 juni 2020 13:36
To: user@mesos.apache.org
Subject: No offers are being made -- how to debug Mesos?

Hi all,

I’m in the process of setting up my first Mesos cluster with 1x master 
and 3x slaves on CentOS 8.

So far I have set up ZooKeeper and Mesos-master on the master and Mesos-slave 
on one of the compute nodes. Mesos-master communicates with ZK and becomes 
leader. Then I started mesos-slave on the compute node and can see in 
the log that it registers at the master with the correct resources 
reported. The agent and its resources are also displayed in the web UI 
of the master. So is the framework that I want to use.

The crux is that no tasks I schedule in the framework are executed. And 
I suppose this is because the framework never receives an offer. I can 
see in the web UI that no offers are made and that all resources remain 
idle.

Now, I’m new to Mesos and I don’t really have an idea how to debug my 
setup at this point. 

There is a page called ‘Debugging with the new CLI’ in the 
documentation but it only explains how to configure the CLI command. 

Any directions on how to debug my situation in general or on how to use 
the CLI for debugging would be highly welcome! :)

Thanks and best regards,
Ben