[
https://issues.apache.org/jira/browse/COUCHDB-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240323#comment-13240323
]
Wendall Cada commented on COUCHDB-1449:
---------------------------------------
See: use-sname-rpc-not-kill.patch
Here is what I figured out while testing. The whole concept of using a PID file
and kill -1 $PID with Erlang is just not going to work consistently.
Here is a way to replicate what happens sometimes when issuing a restart
(stop/start), and beam hasn't stopped yet.
For example, try: couchdb -b && couchdb -d && couchdb -b
Apache CouchDB has started, time to relax.
Apache CouchDB is not running.
Apache CouchDB has started, time to relax.
$ echo `cat /var/run/couchdb/couchdb.pid`
10229
$ ps -A | grep beam.smp
10193 pts/2 00:00:00 beam.smp
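Note that the pid file records 10229, while the beam.smp process that is
actually running is 10193.
However, adding -sname couchdb to the command options results in the second
start failing silently, although couchdb does stop. A stale PID is left in the
pid file from the second start command.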
I then modified start_couchdb so that it actually checks whether the process ID
returned from the erl command is running, then waits 2 seconds so the pid file
can hit the disk. I also modified stop_couchdb to eliminate the use of kill -1
and to wait for the process to actually exit. Now everything works as intended,
no matter what bizarre scenario is encountered.
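
To illustrate the wait-for-exit idea, here is a minimal sketch. It is not the
code from the patch: the helper names is_running and wait_for_exit, and the
default timeout of 10 seconds, are hypothetical and chosen only for
illustration.

```shell
#!/bin/sh
# Hypothetical helpers; these names do not come from the patch.

# Return 0 if a process with the given PID exists and can be signalled.
# Signal 0 performs the existence/permission check without sending anything.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Poll once per second until the PID is gone or the timeout expires.
# Usage: wait_for_exit PID [TIMEOUT_SECONDS]
# Returns 0 if the process exited, 1 if it was still running at the timeout.
wait_for_exit() {
    _wfe_pid=$1
    _wfe_timeout=${2:-10}   # assumed default, not taken from the patch
    _wfe_elapsed=0
    while is_running "$_wfe_pid"; do
        if [ "$_wfe_elapsed" -ge "$_wfe_timeout" ]; then
            return 1
        fi
        sleep 1
        _wfe_elapsed=$((_wfe_elapsed + 1))
    done
    return 0
}
```

A stop routine could call something like wait_for_exit "$PID" 30 after asking
the node to shut down, and only report "not running" once it returns 0. A
start routine could use is_running on the PID returned from erl before writing
the success message.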
So for just pure stupid, I can do this:
for i in {1..5} ; do couchdb -d; couchdb -b ; done
The last command is a start and sure enough, couchdb is running and has
restarted completely five times.
Same stupid in reverse:
for i in {1..5} ; do couchdb -b; couchdb -d ; done
CouchDB is stopped.
Now clearly there is going to be an issue with the use of sname and multiple
couchdb instances up and running, but I think it will be worthwhile to fix.
Every single resource I read, as well as my own experience with Erlang, says
that using kill to shut down is just asking for problems.
I've temporarily appended the pid to start and stop messages for clarity on
what's happening.
> Couchdb returns stopped status before process exits
> ---------------------------------------------------
>
> Key: COUCHDB-1449
> URL: https://issues.apache.org/jira/browse/COUCHDB-1449
> Project: CouchDB
> Issue Type: Bug
> Affects Versions: 1.0.3, 1.1.1, 1.2, 1.3
> Environment: *NIX
> Reporter: Wendall Cada
> Labels: patch
> Fix For: 1.0.4, 1.2.1, 1.1.2
>
> Attachments: couchdb-0007-wait-for-couch-stop.patch,
> couchdb-0007-wait-for-couch-stop.patch
>
>
> When restarting couchdb via init script, couchdb returns success status
> before the process has exited. When a start is issued before the process ends,
> couchdb fails to start, but returns success.