Check out this link for info on /tmp cleanup in Ubuntu:
http://askubuntu.com/questions/20783/how-is-the-tmp-directory-cleaned-up
And check out this link for information on some of the work_dir's contents
on a Mesos agent: http://mesos.apache.org/documentation/latest/sandbox/
The work_dir
Hi Pradeep,
And thank you for your reply!
That, too, is very interesting. I think I need to synthesize what you and
Greg are telling me and come up with a clean solution. Agent nodes can
crash. Moreover, I can stop the mesos-slave service, and start it later
with a reboot in between.
So I am
Whoa... interesting!
The node *may* have been rebooted. Uptime says 2 days. I'll need to check
my notes.
Can you point me to a reference on Ubuntu's behavior?
Based on what you've told me so far, it sounds as if the sequence:
stop service
reboot agent node
start service
could lead to trouble - or
Hello Paul,
From the logs, it looks like, on starting the mesos slave, it is trying to
do slave recovery (
http://mesos.apache.org/documentation/latest/slave-recovery/) but since the
resources.info is unavailable, it is unable to perform the recovery & hence
ends up killing itself.
If you are
Paul,
This would be relevant for any system which is automatically deleting files
in /tmp. It looks like in Ubuntu, the default behavior is for /tmp to be
completely nuked at boot time. Was the agent node rebooted prior to this
problem?
On Tue, Mar 29, 2016 at 2:29 PM, Paul Bell
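To make the /tmp concern concrete, here is a minimal sketch (not part of Mesos) of a pre-start check that flags a work_dir living under /tmp and reports where the agent would look for its resources file. The helper names are hypothetical, and the meta/resources/resources.info layout is taken only from the log message quoted in this thread; it is assumed to be the same relative layout for other work_dir values.

```python
import os


def is_under_tmp(work_dir, tmp_root="/tmp"):
    """Return True if work_dir is tmp_root itself or lives beneath it."""
    work = os.path.realpath(work_dir)
    root = os.path.realpath(tmp_root)
    return os.path.commonpath([root, work]) == root


def resources_file(work_dir):
    """Path of the resources file, assuming the layout seen in the log."""
    return os.path.join(work_dir, "meta", "resources", "resources.info")


if __name__ == "__main__":
    work_dir = "/tmp/mesos"  # the work_dir implied by the log in this thread
    if is_under_tmp(work_dir):
        print("warning: %s may be wiped at boot" % work_dir)
    if not os.path.exists(resources_file(work_dir)):
        print("missing: %s" % resources_file(work_dir))
```

Running something like this before "service mesos-slave start" would at least show whether a reboot has removed the files that recovery expects.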
Hi Greg,
Thanks very much for your quick reply.
I simply forgot to mention platform. It's Ubuntu 14.04 LTS and it's not
systemd. I will look at the link you provide.
Is there any chance that it might apply to non-systemd platforms?
Cordially,
Paul
On Tue, Mar 29, 2016 at 5:18 PM, Greg Mann
Hi Paul,
Noticing the logging output, "Failed to find resources file
'/tmp/mesos/meta/resources/resources.info'", I wonder if your trouble may
be related to the location of your agent's work_dir. See this ticket:
https://issues.apache.org/jira/browse/MESOS-4541
Some users have reported issues
Hi,
I am hoping someone can shed some light on this.
An agent node failed to start, that is, when I did "service mesos-slave
start" the service came up briefly & then stopped. Before stopping it
produced the log shown below. The last thing it wrote is "Trying to create
path '/mesos' in
Hello Erik,
Thank you for clarifying the doubt. That was the exact concern I was having.
On Tue, Mar 29, 2016 at 9:05 PM, Erik Weathers
wrote:
> hi Pradeep,
>
> Yes, that would *definitely* be a problem. e.g., the Storm Framework
> could easily assign Storm Workers to
hi Pradeep,
Yes, that would *definitely* be a problem. e.g., the Storm Framework could
easily assign Storm Workers to use those unavailable ports, and then they
would fail to come up since they wouldn't be able to bind to their assigned
port. I've answered a similar question before:
Hi Klaus,
Thank you for the quick reply.
One quick question:
I have some ports, like 8400, 8500 and 8600, which are already in use by the
consul agent running on each mesos slave. But they are also being announced
by each mesos slave. Will this cause any problem to tasks which may be
assigned those
Yes, all port resources must be ranges for now, e.g. 31000-35000.
There’s already a JIRA (MESOS-4627: Improve Ranges parsing to handle single
values) for that; patches are pending review :).
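One way to avoid announcing ports that are already taken is to carve them out of the advertised range, keeping every piece in range form. The sketch below is only an illustration: the helper names and the 8000-9000 span are hypothetical, and the output string assumes the "ports:[a-b,c-d]" form of the agent's resources setting.

```python
def free_ranges(low, high, in_use):
    """Split [low, high] into inclusive ranges that skip every port in in_use."""
    ranges = []
    start = low
    for port in sorted(set(in_use)):
        if port < start or port > high:
            continue  # outside the part of the span still being considered
        if port > start:
            ranges.append((start, port - 1))
        start = port + 1
    if start <= high:
        ranges.append((start, high))
    return ranges


def format_ports(ranges):
    """Render ranges as a ports resource string; a lone port becomes a-a."""
    return "ports:[" + ",".join("%d-%d" % (a, b) for a, b in ranges) + "]"


if __name__ == "__main__":
    # 8400, 8500 and 8600 are the consul ports mentioned in this thread;
    # the 8000-9000 span is an assumed example, not taken from the thread.
    print(format_ports(free_ranges(8000, 9000, [8400, 8500, 8600])))
```

Because a single free port is written as a-a, the result stays within the "ranges only" rule described above.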
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer Platform OpenSource
Technology, STG, IBM GCG