Re: Agent won't start

2016-03-29 Thread Greg Mann
Check out this link for info on /tmp cleanup in Ubuntu: http://askubuntu.com/questions/20783/how-is-the-tmp-directory-cleaned-up And check out this link for information on some of the work_dir's contents on a Mesos agent: http://mesos.apache.org/documentation/latest/sandbox/ The work_dir

Re: Agent won't start

2016-03-29 Thread Paul Bell
Hi Pradeep, And thank you for your reply! That, too, is very interesting. I think I need to synthesize what you and Greg are telling me and come up with a clean solution. Agent nodes can crash. Moreover, I can stop the mesos-slave service, and start it later with a reboot in between. So I am

Re: Agent won't start

2016-03-29 Thread Paul Bell
Whoa...interesting! The node *may* have been rebooted. Uptime says 2 days. I'll need to check my notes. Can you point me to a reference re Ubuntu behavior? Based on what you've told me so far, it sounds as if the sequence: stop service, reboot agent node, start service could lead to trouble - or

Re: Agent won't start

2016-03-29 Thread Pradeep Chhetri
Hello Paul, From the logs, it looks like the mesos slave, on starting, is trying to do slave recovery (http://mesos.apache.org/documentation/latest/slave-recovery/), but since resources.info is unavailable, it is unable to perform the recovery and hence ends up killing itself. If you are

Re: Agent won't start

2016-03-29 Thread Greg Mann
Paul, This would be relevant for any system which is automatically deleting files in /tmp. It looks like in Ubuntu, the default behavior is for /tmp to be completely nuked at boot time. Was the agent node rebooted prior to this problem? On Tue, Mar 29, 2016 at 2:29 PM, Paul Bell

Re: Agent won't start

2016-03-29 Thread Paul Bell
Hi Greg, Thanks very much for your quick reply. I simply forgot to mention platform. It's Ubuntu 14.04 LTS and it's not systemd. I will look at the link you provide. Is there any chance that it might apply to non-systemd platforms? Cordially, Paul On Tue, Mar 29, 2016 at 5:18 PM, Greg Mann

Re: Agent won't start

2016-03-29 Thread Greg Mann
Hi Paul, Noticing the logging output, "Failed to find resources file '/tmp/mesos/meta/resources/resources.info'", I wonder if your trouble may be related to the location of your agent's work_dir. See this ticket: https://issues.apache.org/jira/browse/MESOS-4541 Some users have reported issues
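As a rough illustration of the concern above, here is a minimal Python sketch that checks whether a work_dir lies under /tmp, where it could be wiped by boot-time cleanup. The helper name `is_under` is hypothetical, the check is purely lexical (it does not resolve symlinks), and `/var/lib/mesos` is only an example of a location outside /tmp, not a path taken from this thread.

```python
from pathlib import PurePosixPath


def is_under(path: str, parent: str = "/tmp") -> bool:
    """Return True if `path` equals `parent` or sits somewhere below it.

    Hypothetical helper: compares path components only and never
    touches the filesystem, so symlinks are not followed.
    """
    p = PurePosixPath(path)
    root = PurePosixPath(parent)
    return p == root or root in p.parents


# '/tmp/mesos' is the work_dir implied by the log message in this thread.
print(is_under("/tmp/mesos"))      # work_dir inside /tmp
print(is_under("/var/lib/mesos"))  # example location outside /tmp (assumption)
```

A check like this could run before starting the agent, so that a work_dir on a path cleaned at boot is flagged early instead of surfacing later as a failed recovery.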

Agent won't start

2016-03-29 Thread Paul Bell
Hi, I am hoping someone can shed some light on this. An agent node failed to start, that is, when I did "service mesos-slave start" the service came up briefly & then stopped. Before stopping it produced the log shown below. The last thing it wrote is "Trying to create path '/mesos' in

Re: Port Resource Offers

2016-03-29 Thread Pradeep Chhetri
Hello Erik, Thank you for clarifying the doubt. That was the exact concern I was having. On Tue, Mar 29, 2016 at 9:05 PM, Erik Weathers wrote: > hi Pradeep, > > Yes, that would *definitely* be a problem. e.g., the Storm Framework > could easily assign Storm Workers to

Re: Port Resource Offers

2016-03-29 Thread Erik Weathers
hi Pradeep, Yes, that would *definitely* be a problem. e.g., the Storm Framework could easily assign Storm Workers to use those unavailable ports, and then they would fail to come up since they wouldn't be able to bind to their assigned port. I've answered a similar question before:

Re: Port Resource Offers

2016-03-29 Thread Pradeep Chhetri
Hi Klaus, Thank you for the quick reply. One quick question: I have some ports, such as 8400, 8500, and 8600, which are already in use by the consul agent running on each mesos slave. But they are also being announced by each mesos slave. Will this cause any problem to tasks which may be assigned those

RE: Port Resource Offers

2016-03-29 Thread Klaus Ma
Yes, all port resources must be ranges for now, e.g. 31000-35000. There’s already a JIRA (MESOS-4627: Improve Ranges parsing to handle single values) for that; patches are pending review :). Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer Platform OpenSource Technology, STG, IBM GCG
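To make the ranges-only constraint concrete, here is a small Python sketch that splits a port range around ports already in use (such as the consul ports 8400, 8500, and 8600 mentioned earlier in the thread) and formats the result as a ranges string. The function names `exclude_ports` and `format_ranges` are hypothetical, and the 8000-9000 span is only an example; whether and how the resulting string is passed to the agent (for instance via its resources setting) is an assumption that should be checked against the Mesos documentation.

```python
from typing import Iterable, List, Tuple


def exclude_ports(start: int, end: int, in_use: Iterable[int]) -> List[Tuple[int, int]]:
    """Split the inclusive range [start, end] around the ports in `in_use`."""
    ranges: List[Tuple[int, int]] = []
    lo = start
    for port in sorted(set(in_use)):
        if port < lo or port > end:
            continue  # port lies outside the span still being considered
        if port > lo:
            ranges.append((lo, port - 1))  # everything before the busy port
        lo = port + 1  # resume just after the busy port
    if lo <= end:
        ranges.append((lo, end))
    return ranges


def format_ranges(ranges: List[Tuple[int, int]]) -> str:
    """Render ranges as a bracketed, comma-separated list of begin-end pairs."""
    return "[" + ", ".join(f"{a}-{b}" for a, b in ranges) + "]"


# Example span (assumption) with the consul ports from the thread removed.
free = exclude_ports(8000, 9000, [8400, 8500, 8600])
print("ports:" + format_ranges(free))
```

Every element of the output is a begin-end pair, so no single-value entry is produced, which matches the restriction described above.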