[
https://issues.apache.org/jira/browse/MESOS-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Mahler updated MESOS-1648:
-----------------------------------
Description:
Right now we use a number of wrapper scripts to maintain
{{/var/run/mesos/mesos-slave.pid}} so that we can monitor the process.
This has proven to be somewhat fragile: nothing locks the file, so it is prone
to races and stale data.
By adding a {{--pidfile}} option, we can obtain a lock on the file to prevent
multiple binaries from starting, and to let the tooling validate that the lock
is held before doing any signaling. We can also do a best-effort unlink in the
signal handler upon termination:
{code}
// Get exclusive access to the file.
fd = open(O_CREAT ...)
// LOCK_NB so that a lock held elsewhere fails immediately instead of blocking.
flock(fd, LOCK_EX | LOCK_NB)
if not locked, abort
ftruncate(fd, 0)
// Write the pid.
write(fd, "<pid>")
// Inside signal handler..
unlink(pidfile)
{code}
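For anyone who wants to try the sequence above, here is a minimal Python sketch of the same steps using the {{os}} and {{fcntl}} wrappers around the system calls. It is not the proposed Mesos implementation. The names {{create_pid_file}} and {{install_cleanup}} are hypothetical, and the use of {{LOCK_NB}}, the {{0644}} mode, and the choice of SIGTERM/SIGINT are assumptions.

```python
import fcntl
import os
import signal
import sys


def create_pid_file(path):
    """Lock `path`, write our pid into it, and return the open descriptor.

    The caller must keep the descriptor open for the life of the process,
    because closing it releases the lock.
    """
    # O_CREAT without O_TRUNC: never clobber a file another process has locked.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Assumption: LOCK_NB, so we fail at once instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise RuntimeError("%s is locked by another instance" % path)
    # We hold the lock, so discarding stale contents is now safe.
    os.ftruncate(fd, 0)
    os.write(fd, (str(os.getpid()) + "\n").encode())
    return fd


def install_cleanup(path, signums=(signal.SIGTERM, signal.SIGINT)):
    """Best-effort unlink of the pid file when a terminating signal arrives."""
    def handler(signum, frame):
        try:
            os.unlink(path)
        except OSError:
            pass  # best effort only
        sys.exit(128 + signum)

    for signum in signums:
        signal.signal(signum, handler)
```

Truncating only after the lock is held is what keeps a second starter from wiping the pid written by the instance that is already running.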
Digging around, it looks like the open, ftruncate, write pattern is pretty common:
http://man7.org/tlpi/code/online/diff/filelock/create_pid_file.c.html
The tooling around it could verify that the file is locked by the pid inside
it, before taking any action (like signaling):
*Case 1*: If the file does not exist or is not locked, then assume nothing is
running. It's possible for something to be running and about to grab the lock,
but we'll eventually read it correctly and converge on a single instance
started correctly.
*Case 2*: If the file is locked, and the pid doesn't match, then assume it is
running but not (yet) as the pid in the file. Treat this the same as (1):
assume it's not running, and the next attempts to start will eventually
converge on a single instance running.
*Case 3*: If the file is locked, and the pid matches the locker process, then
assume it is running as that pid. Note that it's still possible that in between
matching the pid and taking an action (e.g. kill), the pid may become stale,
but the recycling pattern of pids makes it unlikely to be re-used unless there
is a large delay.
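The three cases can be sketched as a small check that tooling would run before signaling. This is an illustration only: {{is_locked}}, {{read_pid}} and {{running_pid}} are hypothetical names, and {{locker_pid_of}} is an assumed callback that reports which pid holds the lock. How to obtain that pid is platform-specific and is deliberately left out here.

```python
import fcntl
import os


def is_locked(path):
    """Return True if some other open file description holds an exclusive flock."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        return False
    try:
        # A shared, non-blocking request fails only if an exclusive lock is held.
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        return False
    finally:
        os.close(fd)  # closing also drops the probe lock if we acquired it


def read_pid(path):
    """Return the pid stored in the file, or None if it is missing or garbled."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def running_pid(path, locker_pid_of):
    """Return the pid that is safe to signal, or None if nothing should be signaled.

    `locker_pid_of` is a hypothetical callback: path -> pid holding the lock.
    """
    if not is_locked(path):
        return None  # Case 1: missing or unlocked, assume nothing is running
    file_pid = read_pid(path)
    if file_pid is None or file_pid != locker_pid_of(path):
        return None  # Case 2: locked, but not (yet) by the pid in the file
    return file_pid  # Case 3: locked by the pid in the file
```

Note that the probe briefly takes a shared lock when the file is unlocked, so a daemon starting at that exact moment could see its own non-blocking lock attempt fail. The race in Case 3 between this check and the eventual kill also remains.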
It seems like some tools already do this signal wrapping (note the comment
about fcntl and note the race from (3) in the BUGS section):
http://manpages.ubuntu.com/manpages/natty/man8/ovs-kill.8.html
was:
Add a {{--pidfile}} option to the common logging flags. Right now we use a
number of wrapper scripts to try and keep up a
{{/var/run/mesos/mesos-slave.pid}} in order to be able to monitor the process.
It would be nice if this extra (somewhat fragile) wrapper was not necessary.
Following implementation of this command line option, consider adding automatic
removal of the file, as well as locking the file as a secondary signal that
there is a live mesos-slave to new slaves attempting to start.
Summary: Add a --pidfile option to master and agent binaries. (was:
mesos-slave and mesos-master should have a --pidfile option)
Expanded the description to capture more of the suggested implementation and
how monit-like tooling and signaling tooling will leverage this.
> Add a --pidfile option to master and agent binaries.
> ----------------------------------------------------
>
> Key: MESOS-1648
> URL: https://issues.apache.org/jira/browse/MESOS-1648
> Project: Mesos
> Issue Type: Improvement
> Components: master, slave
> Reporter: Tobias Weingartner
> Assignee: Greg Mann
> Labels: newbie, twitter