[ 
https://issues.apache.org/jira/browse/COUCHDB-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240323#comment-13240323
 ] 

Wendall Cada commented on COUCHDB-1449:
---------------------------------------

See: use-sname-rpc-not-kill.patch

Here is what I figured out while testing. The whole concept of using a PID file 
and kill -1 $PID with Erlang is just not going to work consistently.

Here is a way to replicate what sometimes happens when a restart (stop/start) 
is issued and beam hasn't stopped yet.

For example, try:
$ couchdb -b && couchdb -d && couchdb -b
Apache CouchDB has started, time to relax.
Apache CouchDB is not running.
Apache CouchDB has started, time to relax.
$ echo `cat /var/run/couchdb/couchdb.pid`
10229
$ ps -A | grep beam.smp
10193 pts/2    00:00:00 beam.smp
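
The mismatch above (the PID file says 10229, but the running beam.smp is 10193) 
can be detected by checking whether the recorded PID refers to a live process. 
A minimal sketch, assuming a POSIX shell; the function name pid_file_is_stale 
is hypothetical and is not part of the couchdb script or the attached patch:

```shell
# Hypothetical helper, not part of the couchdb script.
# Returns 0 (true) if the PID file is missing, empty, or names a process
# that cannot be signalled; returns 1 if the recorded process is alive.
pid_file_is_stale() {
    pid_file=$1
    # -s is true only if the file exists and is non-empty.
    [ -s "$pid_file" ] || return 0
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only tests whether the process exists
    # and the caller may signal it. A process owned by another user would
    # also fail this test, so run the check as the same user as CouchDB.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Example call (path taken from the transcript above):
#   pid_file_is_stale /var/run/couchdb/couchdb.pid && echo "stale PID file"
```

This only says whether some process has that PID; it does not prove that the 
process is beam.smp.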

However, adding -sname couchdb to the command options results in the second 
start failing silently, but couchdb does stop. A stale PID is left in the PID 
file from the second start command.

So I modified start_couchdb so that it actually checks whether the process ID 
returned from the erl command is running, then waits 2 seconds so the PID file 
can hit the disk. I also modified stop_couchdb to eliminate the use of kill -1 
and to wait for the process to actually exit. Now everything works as intended, 
no matter what bizarre scenario is encountered.
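
The "wait for the process to actually exit" step could look roughly like the 
following. This is only a sketch of the idea, not the code in the attached 
patch; the function name wait_for_exit and the 10 second default timeout are 
assumptions made for illustration:

```shell
# Hypothetical helper: poll until the process exits or the timeout elapses.
# Returns 0 if the process is gone, 1 if it is still running after the
# timeout (in seconds).
wait_for_exit() {
    pid=$1
    timeout=${2:-10}   # default is an arbitrary choice for this sketch
    elapsed=0
    # kill -0 sends no signal; it succeeds only while the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A stop routine would report "stopped" only after wait_for_exit returns 0, so 
that a start issued immediately afterwards never races with a beam process 
that is still shutting down.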

So for just pure stupid, I can do this: 
for i in {1..5} ; do couchdb -d; couchdb -b ; done
The last command is a start and sure enough, couchdb is running and has 
restarted completely five times.
Same stupid in reverse:
for i in {1..5} ; do couchdb -b; couchdb -d ; done
CouchDB is stopped.

Now clearly there is going to be an issue with the use of sname when multiple 
couchdb instances are up and running, but I think it will be worthwhile to fix. 
Every single resource I have read, and my own experience with Erlang, says that 
using kill to shut down is just waiting for problems.

I've temporarily appended the PID to the start and stop messages to make it 
clear what's happening.
                
> Couchdb returns stopped status before process exits
> ---------------------------------------------------
>
>                 Key: COUCHDB-1449
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1449
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.0.3, 1.1.1, 1.2, 1.3
>         Environment: *NIX
>            Reporter: Wendall Cada
>              Labels: patch
>             Fix For: 1.0.4, 1.2.1, 1.1.2
>
>         Attachments: couchdb-0007-wait-for-couch-stop.patch, 
> couchdb-0007-wait-for-couch-stop.patch
>
>
> When restarting couchdb via init script, couchdb returns success status 
> before the process is exited. When a start is issued before the process ends, 
> couchdb fails to start, but returns success.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira