Ok. How do you typically monitor the messages between Master and Agents? Do you have some tools for this on the cluster?
On Tue, Oct 4, 2016 at 6:21 PM, haosdent <[email protected]> wrote:
> Hi @Frank, thanks for your information.
>
>> I see messages 'Telling agent (...) to kill task (...)'. Why does this
>> happen?
>
> This should be because your framework sent a `KillTaskMessage` or
> `scheduler::Call::KILL` request to the Mesos master, so Mesos is going
> to kill your task.
>
>> Is this the exact text to search for or is this the name of the protobuf
>> message? Are these logged on a higher log level?
>
> It exists in the agent logs. It looks like:
> ```
> I1004 23:19:36.175673 45405 slave.cpp:1539] Got assigned task '1' for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000
> I1004 23:19:36.176206 45405 slave.cpp:1696] Launching task '1' for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000
> ```
> Usually, you can grep for your task id in the agent log to see how the
> task failed.
>
> On Tue, Oct 4, 2016 at 8:50 PM, Frank Scholten <[email protected]>
> wrote:
>>
>> Thanks Haosdent for your quick response.
>>
>> I added GLOG_v=1 to the master and agents.
>>
>> 1. The framework is registered. Marathon in this case.
>> 2. I see messages 'Telling agent (...) to kill task (...)'. Why does
>> this happen? I also see 'Sending explicit reconciliation state
>> TASK_LOST for task fake-marathon-pacemaker-task-(...)'.
>> 3. I searched for RunTaskMessage in the agent log but could not find
>> it. Is this the exact text to search for or is this the name of the
>> protobuf message? Are these logged on a higher log level?
>>
>> On Tue, Oct 4, 2016 at 11:22 AM, haosdent <[email protected]> wrote:
>> > Staging is the initial status of the task. I think you can check your
>> > logs via these steps:
>> >
>> > 1. Has your framework registered successfully with the master?
>> > 2. Does the master send resource offers to your framework, and does
>> > your framework accept them?
>> > 3. Do your agents receive the RunTaskMessage from the master to launch
>> > your task?
>> >
>> > In addition, running `export GLOG_v=1` before starting the masters and
>> > agents may be helpful for your troubleshooting.
>> >
>> > On Tue, Oct 4, 2016 at 4:58 PM, Frank Scholten <[email protected]>
>> > wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I am looking for some ways to troubleshoot or debug tasks that are
>> >> stuck in the 'staging' state. Typically they have no logs in the
>> >> sandbox.
>> >>
>> >> Are there any endpoints or things to look for in logs to identify
>> >> a root cause?
>> >>
>> >> Is there a troubleshooting guide for Mesos to solve problems like this?
>> >>
>> >> Cheers,
>> >>
>> >> Frank
>> >
>> > --
>> > Best Regards,
>> > Haosdent Huang
>
> --
> Best Regards,
> Haosdent Huang
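
For reference, the "grep your task id in the agent log" step from the quoted reply could be scripted roughly as below. This is only a sketch: the regular expression assumes the glog-style prefix seen in the two sample lines above, the `task '<id>'` pattern is inferred from those same two lines (other agent messages may format the task id differently), and `lines_for_task` is a hypothetical helper name, not part of Mesos.

```python
import re

# Assumed glog-style prefix, based on the sample lines in this thread:
# severity letter, MMDD, time with microseconds, thread id, file:line]
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)

# The two agent log lines quoted earlier in this thread.
SAMPLE = [
    "I1004 23:19:36.175673 45405 slave.cpp:1539] Got assigned task '1' "
    "for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000",
    "I1004 23:19:36.176206 45405 slave.cpp:1696] Launching task '1' "
    "for framework e7287433-36f9-48dd-8633-8a6ac7083a43-0000",
]


def lines_for_task(log_lines, task_id):
    """Return (severity, time, source, message) for lines mentioning the task.

    Hypothetical helper: matches the literal text "task '<id>'" in the
    message part of each line, which is an assumption drawn from the
    sample lines above.
    """
    needle = f"task '{task_id}'"
    matches = []
    for line in log_lines:
        parsed = GLOG_LINE.match(line.rstrip("\n"))
        if parsed and needle in parsed.group("message"):
            matches.append(
                (
                    parsed.group("severity"),
                    parsed.group("time"),
                    parsed.group("source"),
                    parsed.group("message"),
                )
            )
    return matches


if __name__ == "__main__":
    for severity, time, source, message in lines_for_task(SAMPLE, "1"):
        print(severity, time, source, message)
```

To run it against a real agent log, pass an open file object in place of `SAMPLE`; the log path depends on how the agent was started, so it is left out here.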

