A quick question, if I may. I have 4 nodes, and if one of them is offline, it appears that the entire job fails. Is this by design? Should I put in timeout values so that it moves on? I thought the commands were executed in parallel.
Deploying the ABC application via Capistrano
ffi-yajl/json_gem is deprecated, these monkeypatches will be dropped shortly
  * executing `staging'
ffi-yajl/json_gem is deprecated, these monkeypatches will be dropped shortly
    triggering start callbacks for `deploy'
  * executing `multistage:ensure'
  * executing `deploy'
    triggering before callbacks for `deploy'
  * executing `deploy:stop_app'
  * executing multiple commands in parallel
    servers: ["node1", "node2", "node3", "node4"]
connection failed for: node2 (Errno::ETIMEDOUT: Connection timed out - connect(2) for "node2" port 22)
Build step 'Execute shell' marked build as failure
Finished: FAILURE

On Wednesday, May 20, 2015 at 4:43:25 PM UTC-7, niristotle okram wrote:
>
> Hey Lee,
> just to close out this thread:
> the cause was not cap2, per se. It was due to the process spawned by
> cap being killed when cap exits its task. We had to modify the init
> script to fix this issue. 'nohup' is used to keep the daemon in the
> background even after cap exits the SSH session. Below is the modified
> portion of the script.
>
> case "$1" in
> start)
>     # Check whether the process is already running; if it is, display
>     # a message and exit.
>     if [ -n "`ps aux | grep /opt/mount1/oss/current/src/main.py | grep -v grep`" ]; then
>         echo "Service is already running, try stopping it first with 'service oss stop'"
>         exit
>     fi
>
>     printf "%-50s" "Starting $DAEMON_NAME..."
>     cd $DIR
>     [ -d $LOGPATH ] || mkdir $LOGPATH
>     [ -f $LOGFILE ] || su $DAEMON_USER -c 'touch $LOGFILE'
>     nohup $PYTHON $DAEMON $DAEMON_OPTS > $LOGFILE 2>&1 &
>     echo $! > $PIDFILE
>     sleep 5
>     ;;
>
> Thanks,
> Okram
>
> On Tuesday, May 19, 2015 at 1:28:32 PM UTC-7, Lee Hambley wrote:
>>
>> It almost certainly has something to do with your process failing to
>> daemonize properly.
>> Ruby processes struggle too; check
>> http://stackoverflow.com/a/688448/119669. It has to do with parts of the
>> process that is forked (your Python daemon in this case) still inheriting
>> some resources which are attached to the Capistrano session.
>>
>> This unfortunately falls outside the stuff I can help you with reasonably
>> or remotely. You *might* have some success learning enough strace to watch
>> your process and see how it behaves when Cap disconnects.
>>
>> Lee Hambley
>> http://lee.hambley.name/
>> +49 (0) 170 298 5667
>>
>> On 19 May 2015 at 22:23, niristotle okram <nirish...@gmail.com> wrote:
>>
>>> I apologize, I made a mistake in reporting the issue. :(
>>>
>>> The original report that "xyz.pid doesn't exist, when it actually
>>> does" is not an error. It's working as designed: the service had been
>>> manually stopped before the stop was attempted again, and that is why
>>> it threw that error.
>>> The real issue is that when Capistrano starts the service as part of
>>> the task "deploy:restart_app" after the deploy, the service doesn't
>>> start up properly. Checking the status with "service xyz status" after
>>> the deploy returns "Process dead but pidfile exists".
>>>
>>> This service start failure is only seen after the cap deploy and
>>> cannot be reproduced manually.
>>>
>>> On Tuesday, May 19, 2015 at 12:10:50 AM UTC-7, Lee Hambley wrote:
>>>>
>>>> Sorry, I can't see anything wrong with it. :-\
>>>>
>>>> Lee Hambley
>>>> http://lee.hambley.name/
>>>> +49 (0) 170 298 5667
>>>>
>>>> On 19 May 2015 at 02:26, niristotle okram <nirish...@gmail.com> wrote:
>>>>
>>>>> Hi Lee,
>>>>>
>>>>> Here is the full /etc/init.d/ script: http://pastebin.com/02G5tpgH
>>>>>
>>>>> So, I placed a task to stop the app, then deploy, and then start the
>>>>> service/app.
>>>>> The task has the commands below as a check:
>>>>>
>>>>> whoami -----> ** [out :: server4] root
>>>>> ll /var/run/ ---> this shows the xyz.pid file
>>>>>
>>>>> So I see the service stops just fine.
>>>>>
>>>>> ** [out :: server1] Stopping
>>>>> ** [out :: server1] Ok
>>>>>
>>>>> And the service also starts just fine.
>>>>>
>>>>> ** [out :: server1] Starting xyz...
>>>>> ** [out :: server1] Ok
>>>>>
>>>>> But on checking manually with "service xyz status", I get "Process
>>>>> dead but pidfile exists". I can stop and start the service just fine
>>>>> manually.
>>>>>
>>>>> On Monday, May 18, 2015 at 12:24:50 PM UTC-7, Lee Hambley wrote:
>>>>>>
>>>>>> You wrote that /etc/init.d/xyz is run via "sudo", so the deploy
>>>>>> user apparently has access to passwordless sudo (at least for some
>>>>>> actions). It would then appear that the file is not visible to
>>>>>> `root`, which I don't believe or expect.
>>>>>>
>>>>>> You included a part of /etc/init.d/xyz, but didn't include the
>>>>>> full thing for some reason, so I can't see what the value of $PIDFILE
>>>>>> should be in this case (*please* adhere to the list guidelines and
>>>>>> paste long files in an external service, and link them), nor did you
>>>>>> state where you got the template.
>>>>>>
>>>>>> I also don't understand the logic behind setting shell: 'bash' on
>>>>>> the run() lines that interface with the init script.
>>>>>>
>>>>>> Your task:
>>>>>>
>>>>>> task :stop_app, :roles => :web do
>>>>>>   run "sudo /etc/init.d/xyz stop", :shell => :bash
>>>>>> end
>>>>>>
>>>>>> I might suggest you extend that (or make a similar one,
>>>>>> "debug_initd_stuff") so that it does something like:
>>>>>>
>>>>>> task :debug_initd_stuff, :roles => :web do
>>>>>>   run "sudo whoami"
>>>>>>   run "sudo ls -l /etc/init.d"
>>>>>>   run "sudo ls -l /var/run"
>>>>>> end
>>>>>>
>>>>>> You might also want to run the init.d script through shellcheck.net.
>>>>>> There are quite a few violations and bad practices already in sight
>>>>>> there, and shellcheck might help you iron some of them out. (That
>>>>>> said, honestly the problem is probably something much simpler.)
>>>>>>
>>>>>> Lee Hambley
>>>>>> http://lee.hambley.name/
>>>>>> +49 (0) 170 298 5667
>>>>>>
>>>>>> On 18 May 2015 at 21:14, niristotle okram <nirish...@gmail.com> wrote:
>>>>>>
>>>>>>> Versions:
>>>>>>>
>>>>>>> - Ruby 2.1.1
>>>>>>> - Capistrano 2
>>>>>>> - Rake / Rails / etc
>>>>>>>
>>>>>>> Platform:
>>>>>>>
>>>>>>> - Working on.... RHEL 6
>>>>>>> - Deploying to... RHEL 6
>>>>>>>
>>>>>>> A part of the Deploy.rb:
>>>>>>>
>>>>>>> #before "deploy", "deploy:stop_app"
>>>>>>> #after "deploy", "deploy:start_app"
>>>>>>> after "deploy", "deploy:restart_app"
>>>>>>>
>>>>>>> namespace :deploy do
>>>>>>>   task :update_code, :roles => :web, :except => { :no_release => true } do
>>>>>>>     on_rollback { puts "DO NOT WANT TO ROLL BACK?" }
>>>>>>>     strategy.deploy!
>>>>>>>     finalize_update
>>>>>>>   end
>>>>>>>
>>>>>>>   task :stop_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz stop", :shell => :bash
>>>>>>>   end
>>>>>>>
>>>>>>>   task :start_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz start", :shell => :bash
>>>>>>>   end
>>>>>>>
>>>>>>>   task :restart_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz restart", :shell => :bash
>>>>>>>   end
>>>>>>> end
>>>>>>>
>>>>>>> I have this parameter in 'deploy.rb':
>>>>>>>
>>>>>>> *set :user, 'a_user'*
>>>>>>>
>>>>>>> Q: Which user performs the task to restart the service (xyz) after
>>>>>>> the deployment of the app (xyz)? I am getting errors saying that
>>>>>>> xyz.pid doesn't exist, when it actually does. The error comes from
>>>>>>> the part of the shell script that stops the service.
>>>>>>>
>>>>>>> A part of the /etc/init.d/xyz
>>>>>>>
>>>>>>> case "$1" in
>>>>>>> start)
>>>>>>>     printf "%-50s" "Starting $DAEMON_NAME..."
>>>>>>>     cd $DIR
>>>>>>>     [ -d $LOGPATH ] || mkdir $LOGPATH
>>>>>>>     [ -f $LOGFILE ] || su $DAEMON_USER -c 'touch $LOGFILE'
>>>>>>>     PID=`$PYTHON $DAEMON $DAEMON_OPTS > $LOGFILE 2>&1 & echo $!`
>>>>>>>     #echo "Saving PID" $PID " to " $PIDFILE
>>>>>>>     if [ -z $PID ]; then
>>>>>>>         printf "%s\n" "Fail"
>>>>>>>     else
>>>>>>>         echo $PID > $PIDFILE
>>>>>>>         printf "%s\n" "Ok"
>>>>>>>     fi
>>>>>>> ;;
>>>>>>> status)
>>>>>>>     printf "%-50s" "Checking $DAEMON_NAME..."
>>>>>>>     if [ -f $PIDFILE ]; then
>>>>>>>         PID=`cat $PIDFILE`
>>>>>>>         if [ -z "`ps axf | grep ${PID} | grep -v grep`" ]; then
>>>>>>>             printf "%s\n" "Process dead but pidfile exists"
>>>>>>>         else
>>>>>>>             echo "Running"
>>>>>>>         fi
>>>>>>>     else
>>>>>>>         printf "%s\n" "Service not running"
>>>>>>>     fi
>>>>>>> ;;
>>>>>>> stop)
>>>>>>>     printf "%-50s" "Stopping $DAEMONNAME"
>>>>>>>     PID=`cat $PIDFILE`
>>>>>>>     cd $DIR
>>>>>>>     if [ -f $PIDFILE ]; then
>>>>>>>         kill -HUP $PID
>>>>>>>         printf "%s\n" "Ok"
>>>>>>>         rm -f $PIDFILE
>>>>>>>     else
>>>>>>>         printf "%s\n" "pidfile not found"
>>>>>>>     fi
>>>>>>> ;;
>>>>>>>
>>>>>>> restart)
>>>>>>>     $0 stop
>>>>>>>     $0 start
>>>>>>> ;;
>>>>>>>
>>>>>>> *)
>>>>>>>     echo "Usage: $0 {status|start|stop|restart}"
>>>>>>>     exit 1
>>>>>>> esac
>>>>>>>
>>>>>>> Capistrano log
>>>>>>>
>>>>>>>   * executing `deploy:restart_app'
>>>>>>>   * executing multiple commands in parallel
>>>>>>>     -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>>     -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>>     -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>>     -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>>     servers: ["server1", "server2", "server3", "server4"]
>>>>>>>     [server1] executing command
>>>>>>>     [server2] executing command
>>>>>>>     [server3] executing command
>>>>>>>     [server4] executing command
>>>>>>>  ** [out :: server1] Stopping
>>>>>>>  ** [out :: server1] cat: /var/run/xyz.pid: No such file or directory
>>>>>>>  ** [out :: server1] pidfile not found
>>>>>>>  ** [out :: server1] Starting xyz...
>>>>>>>  ** [out :: server2] Stopping
>>>>>>>  ** [out :: server2] cat: /var/run/xyz.pid: No such file or directory
>>>>>>>  ** [out :: server2] pidfile not found
>>>>>>>  ** [out :: server2] Starting xyz...
>>>>>>>  ** [out :: server2] Ok
>>>>>>>  ** [out :: server1] Ok
>>>>>>>  ** [out :: server3] Stopping
>>>>>>>  ** [out :: server3] cat: /var/run/xyz.pid: No such file or directory
>>>>>>>  ** [out :: server3] pidfile not found
>>>>>>>  ** [out :: server4] Stopping
>>>>>>>  ** [out :: server4] Ok
>>>>>>>  ** [out :: server3] Starting xyz...
>>>>>>>  ** [out :: server3] Ok
>>>>>>>  ** [out :: server4] Starting xyz...
>>>>>>>  ** [out :: server4] Ok
>>>>>>>     command finished in 659ms
>>>>>>> Finished: SUCCESS
>>>>>>>
>>>>>>> I can cat the file as the deploy user just fine.
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Capistrano" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to capistrano+...@googlegroups.com.
>>>>>>> To view this discussion on the web, visit
>>>>>>> https://groups.google.com/d/msgid/capistrano/56a2a2dd-fd26-4b14-a2da-0d7af37f8354%40googlegroups.com
>>>>>>> For more options, visit https://groups.google.com/d/optout.
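For anyone landing on this thread later: the fix reported in the May 20 message (start the daemon under nohup so it outlives the SSH session, then record its PID) can be sketched roughly as below. This is only a minimal sketch, not the poster's actual init script. The function names pid_alive, start_detached and stop_service are made up for illustration, and the liveness check uses `kill -0` instead of the `ps | grep` pipeline from the thread, which assumes the caller is root (as it is under sudo here) or owns the process.

```shell
#!/bin/sh
# Hypothetical helpers for an init script; names are illustrative only.

# pid_alive PIDFILE
# Succeeds only if PIDFILE exists, is non-empty, and names a process that
# the caller is allowed to signal. Signal 0 sends nothing; it only checks.
pid_alive() {
    [ -s "$1" ] || return 1
    kill -0 "$(cat "$1")" 2>/dev/null
}

# start_detached PIDFILE LOGFILE COMMAND [ARGS...]
# Starts COMMAND in the background so that it survives the end of the
# calling session, and writes its PID to PIDFILE.
start_detached() {
    pidfile=$1
    logfile=$2
    shift 2
    if pid_alive "$pidfile"; then
        echo "already running" >&2
        return 1
    fi
    # nohup makes the child ignore SIGHUP; redirecting stdin, stdout and
    # stderr keeps it from holding on to the session's file descriptors.
    nohup "$@" < /dev/null >> "$logfile" 2>&1 &
    echo $! > "$pidfile"
}

# stop_service PIDFILE
# Sends SIGTERM if the recorded process is still alive, then removes the
# pidfile either way so a stale file is not left behind.
stop_service() {
    if pid_alive "$1"; then
        kill "$(cat "$1")"
    fi
    rm -f "$1"
}
```

In a start) branch like the one quoted in the thread, this would be called along the lines of `start_detached "$PIDFILE" "$LOGFILE" $PYTHON $DAEMON $DAEMON_OPTS`, reusing the variable names from the poster's script. Whether nohup alone is enough depends on the daemon; it is what was reported to work in this thread.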