> How do you typically monitor the messages between Master and Agents?
On my side, I don't monitor this; I only check the logs when
troubleshooting a problem.
I'm not sure whether other users or developers have tools that meet your
requirement here.
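
For the log-checking approach mentioned in the quoted thread below (grepping
the agent log for a task ID), here is a minimal Python sketch. It assumes the
agent writes glog-style lines like the two samples quoted further down, and
that the task ID appears in single quotes in the message, as it does in those
samples. The names `find_task_events`, `LogEvent` and `SAMPLE` are made up
for illustration and are not part of Mesos.

```python
import re
from typing import Iterable, List, NamedTuple

# glog line layout, as seen in the sample agent log lines:
# <severity><mmdd> <hh:mm:ss.uuuuuu> <thread> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


class LogEvent(NamedTuple):
    severity: str
    time: str
    source: str
    message: str


def find_task_events(lines: Iterable[str], task_id: str) -> List[LogEvent]:
    """Return the parsed log lines whose message mentions the quoted task ID."""
    # Assumption: the ID is logged in single quotes, e.g. task '1'.
    needle = f"'{task_id}'"
    events = []
    for line in lines:
        match = GLOG_LINE.match(line.rstrip("\n"))
        if match and needle in match["message"]:
            events.append(
                LogEvent(
                    match["severity"],
                    match["time"],
                    match["source"],
                    match["message"],
                )
            )
    return events


# The two agent log lines from this thread, unwrapped onto single lines.
SAMPLE = [
    "I1004 23:19:36.175673 45405 slave.cpp:1539] Got assigned task '1' "
    "for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000",
    "I1004 23:19:36.176206 45405 slave.cpp:1696] Launching task '1' "
    "for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000",
]

if __name__ == "__main__":
    for event in find_task_events(SAMPLE, "1"):
        print(event.time, event.source, event.message)
```

To run it against a real agent log, pass an open file object instead of
`SAMPLE`; where that file lives depends on how your agents are configured.
A plain `grep` for the task ID does the same job; the parsing only makes it
easier to line up timestamps and source locations.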

On Wed, Oct 5, 2016 at 8:16 PM, Frank Scholten <[email protected]>
wrote:

> Ok. How do you typically monitor the messages between Master and
> Agents? Do you have some tools for this on the cluster?
>
> On Tue, Oct 4, 2016 at 6:21 PM, haosdent <[email protected]> wrote:
> > Hi @Frank, thanks for the information.
> >
> >> I see messages 'Telling agent (...) to kill task (...)'. Why does this
> >> happen?
> > This is most likely because your framework sent a `KillTaskMessage` or
> > `scheduler::Call::KILL` request to the Mesos master, which then kills
> > your task.
> >
> >> Is this the exact text to search for or is this the name of the protobuf
> >> message? Are these logged on a higher log level?
> > It shows up in the agent logs. It looks like this:
> > ```
> > I1004 23:19:36.175673 45405 slave.cpp:1539] Got assigned task '1' for
> > framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000
> > I1004 23:19:36.176206 45405 slave.cpp:1696] Launching task '1' for
> > framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000
> > ```
> > Usually, you can grep for your task ID in the agent log to see how the
> > task failed.
> >
> >
> >
> > On Tue, Oct 4, 2016 at 8:50 PM, Frank Scholten <[email protected]>
> > wrote:
> >>
> >> Thanks Haosdent for your quick response.
> >>
> >> I added GLOG_v=1 to the master and agents.
> >>
> >> 1. The framework is registered. Marathon in this case.
> >> 2. I see messages 'Telling agent (...) to kill task (...)'. Why does
> >> this happen? I also see 'Sending explicit reconciliation state
> >> TASK_LOST for task fake-marathon-pacemaker-task-(...)'.
> >> 3. I searched for RunTaskMessage in the agent log but could not find
> >> it. Is this the exact text to search for or is this the name of the
> >> protobuf message? Are these logged on a higher log level?
> >>
> >> On Tue, Oct 4, 2016 at 11:22 AM, haosdent <[email protected]> wrote:
> >> > Staging is the initial state of a task. I think you can check your
> >> > logs with these steps:
> >> >
> >> > 1. Did your framework register successfully with the master?
> >> > 2. Did the master send resource offers to your framework, and did your
> >> > framework accept them?
> >> > 3. Did your agents receive the RunTaskMessage from the master to launch
> >> > your task?
> >> >
> >> > In addition, running `export GLOG_v=1` before starting the masters and
> >> > agents may help with your troubleshooting.
> >> >
> >> > On Tue, Oct 4, 2016 at 4:58 PM, Frank Scholten <[email protected]>
> >> > wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I am looking for some ways to troubleshoot or debug tasks that are
> >> >> stuck in the 'staging' state. Typically they have no logs in the
> >> >> sandbox.
> >> >>
> >> >> Are there any endpoints or things to look for in logs to identify
> >> >> a root cause?
> >> >>
> >> >> Is there a troubleshooting guide for Mesos to solve problems like
> >> >> this?
> >> >>
> >> >> Cheers,
> >> >>
> >> >> Frank
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Best Regards,
> >> > Haosdent Huang
> >
> >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
>



-- 
Best Regards,
Haosdent Huang
