A quick question, if I may. I have 4 nodes, and if one of them is
offline, it appears that the entire job fails. Is this by design? Should I
put in timeout values so the run moves on?
I thought the commands were executed in parallel.

Deploying the ABC application via Capistrano
ffi-yajl/json_gem is deprecated, these monkeypatches will be dropped shortly
* executing `staging'
ffi-yajl/json_gem is deprecated, these monkeypatches will be dropped shortly
triggering start callbacks for `deploy'
* executing `multistage:ensure'
* executing `deploy'
triggering before callbacks for `deploy'
* executing `deploy:stop_app'
* executing multiple commands in parallel
servers: ["node1", "node2", "node3", "node4"]
connection failed for: node2 (Errno::ETIMEDOUT: Connection timed out -
connect(2) for "node2" port 22)
Build step 'Execute shell' marked build as failure
Finished: FAILURE
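
One workaround that does not depend on Capistrano's own error handling is to probe each node over SSH with a short timeout before the deploy, and hand only the reachable ones to cap. Below is a minimal sketch under that assumption. The names `probe_ssh`, `filter_reachable` and `PROBE_TIMEOUT` are made up for illustration; `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print only the hosts that answer over SSH
# within a few seconds. All names here are illustrative.
PROBE_TIMEOUT=${PROBE_TIMEOUT:-5}

# Default probe: a non-interactive SSH connection that runs `true` and gives
# up connecting after PROBE_TIMEOUT seconds.
probe_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout="$PROBE_TIMEOUT" "$1" true >/dev/null 2>&1
}

# filter_reachable PROBE HOST...
# Prints reachable hosts on stdout (one per line) and skipped hosts on stderr.
filter_reachable() {
    probe=$1
    shift
    for host in "$@"; do
        if "$probe" "$host"; then
            printf '%s\n' "$host"
        else
            printf 'skipping unreachable host: %s\n' "$host" >&2
        fi
    done
}

# Example: filter_reachable probe_ssh node1 node2 node3 node4
```

The printed list could then be passed to the deploy, for example through the HOSTS environment variable that Capistrano 2 reads, assuming the stage configuration does not override it.
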
On Wednesday, May 20, 2015 at 4:43:25 PM UTC-7, niristotle okram wrote:
>
> Hey Lee,
> just to close out this thread:
> the cause was not cap2, per se. The process spawned by cap was being
> killed when cap exited its task. We had to modify the init script to
> fix this. 'nohup' is used to keep the daemon in the background even
> after cap exits the ssh session. Below is the modified portion of the
> script.
>
>
> case "$1" in
>   start)
>     # Check whether the process is already running; if it is, display
>     # a message and exit.
>     if [ -n "`ps aux | grep /opt/mount1/oss/current/src/main.py | grep -v grep`" ]; then
>       echo "Service is already running, try stopping it first with 'service oss stop'"
>       exit
>     fi
>
>     printf "%-50s" "Starting $DAEMON_NAME..."
>     cd $DIR
>     [ -d $LOGPATH ] || mkdir $LOGPATH
>     # Double quotes so that $LOGFILE is expanded before su runs the command.
>     [ -f $LOGFILE ] || su $DAEMON_USER -c "touch $LOGFILE"
>     # nohup keeps the daemon alive after the ssh session ends.
>     nohup $PYTHON $DAEMON $DAEMON_OPTS > $LOGFILE 2>&1 &
>     echo $! > $PIDFILE
>     sleep 5
>     ;;
>
> thanks
> Okram
>
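
The start branch above writes the pidfile and sleeps, but it never confirms that the daemon survived. A minimal sketch of the same nohup approach with a liveness check added is shown here. `start_detached` and its arguments are illustrative names, not part of the posted init script, and the one-second wait is an arbitrary choice.

```shell
#!/bin/sh
# Sketch of a detached start with a liveness check. All names are illustrative.

# start_detached PIDFILE LOGFILE COMMAND [ARGS...]
start_detached() {
    pidfile=$1
    logfile=$2
    shift 2

    # nohup makes the child ignore SIGHUP; redirecting stdin, stdout and
    # stderr leaves it with no handle on the ssh session's terminal or pipes.
    nohup "$@" < /dev/null >> "$logfile" 2>&1 &
    echo $! > "$pidfile"

    # Give the process a moment, then confirm it is still alive before
    # reporting success (kill -0 sends no signal, it only checks the PID).
    sleep 1
    if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "Ok"
    else
        echo "Fail"
        rm -f "$pidfile"
        return 1
    fi
}
```

Called as `start_detached "$PIDFILE" "$LOGFILE" $PYTHON $DAEMON $DAEMON_OPTS`, this would print "Fail" and remove the pidfile when the daemon dies right after launch, rather than leaving a stale pidfile behind.
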
>
> On Tuesday, May 19, 2015 at 1:28:32 PM UTC-7, Lee Hambley wrote:
>>
>> It almost certainly has something to do with your process failing to
>> daemonize properly. Ruby processes struggle with this too; see
>> http://stackoverflow.com/a/688448/119669. The forked process (your
>> Python daemon in this case) is still inheriting some resources that
>> are attached to the Capistrano session.
>>
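
To make the point about inherited resources concrete, here is a small generic demonstration. It is not a diagnosis of the exact failure in this thread. It only shows that a background child which inherits the caller's stdout keeps that pipe open until it exits, while a child with its own stdin/stdout/stderr does not. The function names are made up for illustration.

```shell
#!/bin/sh
# Demonstration only: how long a caller waits on a pipe when a background
# child does or does not inherit it. Function names are illustrative.

# The child inherits stdout, so a caller capturing this function's output
# has to wait until the child exits and closes the pipe.
spawn_attached() {
    sleep 3 &
    echo "started"
}

# The child gets its own stdin/stdout/stderr, so the caller is released
# as soon as the function returns.
spawn_detached() {
    sleep 3 < /dev/null > /dev/null 2>&1 &
    echo "started"
}

# elapsed FUNCTION
# Prints the whole seconds a command substitution needs to capture
# FUNCTION's output.
elapsed() {
    begin=$(date +%s)
    captured=$("$1")
    end=$(date +%s)
    echo $((end - begin))
}
```

Running `elapsed spawn_attached` takes about as long as the background `sleep`, whereas `elapsed spawn_detached` returns almost immediately. A remote session that waits for its output channels to close can be held up in the same way.
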
>> This unfortunately falls outside stuff I can help you with reasonably or
>> remotely. You *might* have some success learning enough strace to see your
>> process, and how it behaves when Cap disconnects.
>>
>> Lee Hambley
>> http://lee.hambley.name/
>> +49 (0) 170 298 5667
>>
>> On 19 May 2015 at 22:23, niristotle okram <[email protected]> wrote:
>>
>>> I apologize, I made a mistake in reporting the issue. :(
>>>
>>> The original report about "xyz.pid doesn't exist, when it actually
>>> does" is not an error. It's working as designed: the service had been
>>> stopped manually before trying to stop it again, and that is why it
>>> threw that error. The real issue is that when Capistrano starts the
>>> service as part of the task "deploy:restart_app" after the deploy, the
>>> service doesn't start up properly. Checking the status with
>>> "service xyz status" after the deploy returns
>>> "Process dead but pidfile exists".
>>>
>>> This service start failure is only seen after the cap deploy and
>>> cannot be reproduced manually.
>>>
>>> On Tuesday, May 19, 2015 at 12:10:50 AM UTC-7, Lee Hambley wrote:
>>>>
>>>> Sorry, I can't see anything wrong with it. :-\
>>>>
>>>> Lee Hambley
>>>> http://lee.hambley.name/
>>>> +49 (0) 170 298 5667
>>>>
>>>> On 19 May 2015 at 02:26, niristotle okram <[email protected]> wrote:
>>>>
>>>>> hi Lee,
>>>>>
>>>>> here is the full /etc/init.d/ script http://pastebin.com/02G5tpgH
>>>>>
>>>>>
>>>>> So, I added a task to stop the app, then deploy, and then start the
>>>>> service/app. The task has the commands below as checks:
>>>>>
>>>>> whoami -----> ** [out :: server4] root
>>>>> ll /var/run/ ---> this shows the xyz.pid file
>>>>>
>>>>> So I see the service stops just fine:
>>>>>
>>>>> ** [out :: server1] Stopping
>>>>> ** [out :: server1] Ok
>>>>>
>>>>> And the service also starts just fine:
>>>>>
>>>>> ** [out :: server1] Starting xyz...
>>>>> ** [out :: server1] Ok
>>>>>
>>>>> But on checking manually with "service xyz status", I get "Process
>>>>> dead but pidfile exists". I can stop and start the service just fine
>>>>> manually.
>>>>>
>>>>> On Monday, May 18, 2015 at 12:24:50 PM UTC-7, Lee Hambley wrote:
>>>>>>
>>>>>> You wrote that /etc/init.d/xyz is run via "sudo", so the deploy
>>>>>> user apparently has password-less sudo (at least for some actions).
>>>>>> That would mean the file is not visible to `root`, which I don't
>>>>>> believe or expect.
>>>>>>
>>>>>> You included a part of /etc/init.d/xyz, but for some reason not the
>>>>>> full thing, so I can't see what the value of $PIDFILE should be in
>>>>>> this case (*please* adhere to the list guidelines: paste long files
>>>>>> in an external service and link them). You also didn't say where you
>>>>>> got the template.
>>>>>>
>>>>>> I also don't understand the logic behind setting :shell => :bash on
>>>>>> the run() lines that interface with the init script.
>>>>>>
>>>>>> Your task:
>>>>>>
>>>>>>   task :stop_app, :roles => :web do
>>>>>>     run "sudo /etc/init.d/xyz stop", :shell => :bash
>>>>>>   end
>>>>>>
>>>>>> I might suggest you extend that (or make a similar one,
>>>>>> "debug_initd_stuff") so that it does something like:
>>>>>>
>>>>>>   task :debug_initd_stuff, :roles => :web do
>>>>>>     run "sudo whoami"
>>>>>>     run "sudo ls -l /etc/init.d"
>>>>>>     run "sudo ls -l /var/run"
>>>>>>   end
>>>>>>
>>>>>>
>>>>>> You might also want to run the init.d script through shellcheck.net,
>>>>>> since there are quite a few violations and bad practices already in
>>>>>> sight there; shellcheck might help you iron some of them out. (That
>>>>>> said, honestly the problem is probably something much simpler.)
>>>>>>
>>>>>>
>>>>>> Lee Hambley
>>>>>> http://lee.hambley.name/
>>>>>> +49 (0) 170 298 5667
>>>>>>
>>>>>> On 18 May 2015 at 21:14, niristotle okram <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Versions:
>>>>>>>
>>>>>>> - Ruby 2.1.1
>>>>>>> - Capistrano 2
>>>>>>> - Rake / Rails / etc
>>>>>>>
>>>>>>> Platform:
>>>>>>>
>>>>>>> - Working on.... RHEL 6
>>>>>>> - Deploying to... RHEL 6
>>>>>>>
>>>>>>>
>>>>>>> A part of the deploy.rb:
>>>>>>>
>>>>>>> #before "deploy", "deploy:stop_app"
>>>>>>> #after "deploy", "deploy:start_app"
>>>>>>> after "deploy", "deploy:restart_app"
>>>>>>>
>>>>>>> namespace :deploy do
>>>>>>>   task :update_code, :roles => :web, :except => { :no_release => true } do
>>>>>>>     on_rollback { puts "DO NOT WANT TO ROLL BACK?" }
>>>>>>>     strategy.deploy!
>>>>>>>     finalize_update
>>>>>>>   end
>>>>>>>
>>>>>>>   task :stop_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz stop", :shell => :bash
>>>>>>>   end
>>>>>>>
>>>>>>>   task :start_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz start", :shell => :bash
>>>>>>>   end
>>>>>>>
>>>>>>>   task :restart_app, :roles => :web do
>>>>>>>     run "sudo /etc/init.d/xyz restart", :shell => :bash
>>>>>>>   end
>>>>>>> end
>>>>>>>
>>>>>>> I have this parameter in 'deploy.rb':
>>>>>>>
>>>>>>> set :user, 'a_user'
>>>>>>>
>>>>>>> Q: Which user performs the task to restart the service (xyz) after
>>>>>>> the deployment of the app (xyz)? I am getting errors saying that
>>>>>>> xyz.pid doesn't exist, when it actually does. The error comes from
>>>>>>> the part of the shell script that stops the service.
>>>>>>>
>>>>>>>
>>>>>>> A part of the /etc/init.d/xyz
>>>>>>>
>>>>>>> case "$1" in
>>>>>>>   start)
>>>>>>>     printf "%-50s" "Starting $DAEMON_NAME..."
>>>>>>>     cd $DIR
>>>>>>>     [ -d $LOGPATH ] || mkdir $LOGPATH
>>>>>>>     [ -f $LOGFILE ] || su $DAEMON_USER -c 'touch $LOGFILE'
>>>>>>>     PID=`$PYTHON $DAEMON $DAEMON_OPTS > $LOGFILE 2>&1 & echo $!`
>>>>>>>     #echo "Saving PID" $PID " to " $PIDFILE
>>>>>>>     if [ -z $PID ]; then
>>>>>>>       printf "%s\n" "Fail"
>>>>>>>     else
>>>>>>>       echo $PID > $PIDFILE
>>>>>>>       printf "%s\n" "Ok"
>>>>>>>     fi
>>>>>>>     ;;
>>>>>>>   status)
>>>>>>>     printf "%-50s" "Checking $DAEMON_NAME..."
>>>>>>>     if [ -f $PIDFILE ]; then
>>>>>>>       PID=`cat $PIDFILE`
>>>>>>>       if [ -z "`ps axf | grep ${PID} | grep -v grep`" ]; then
>>>>>>>         printf "%s\n" "Process dead but pidfile exists"
>>>>>>>       else
>>>>>>>         echo "Running"
>>>>>>>       fi
>>>>>>>     else
>>>>>>>       printf "%s\n" "Service not running"
>>>>>>>     fi
>>>>>>>     ;;
>>>>>>>   stop)
>>>>>>>     printf "%-50s" "Stopping $DAEMONNAME"
>>>>>>>     PID=`cat $PIDFILE`
>>>>>>>     cd $DIR
>>>>>>>     if [ -f $PIDFILE ]; then
>>>>>>>       kill -HUP $PID
>>>>>>>       printf "%s\n" "Ok"
>>>>>>>       rm -f $PIDFILE
>>>>>>>     else
>>>>>>>       printf "%s\n" "pidfile not found"
>>>>>>>     fi
>>>>>>>     ;;
>>>>>>>
>>>>>>>   restart)
>>>>>>>     $0 stop
>>>>>>>     $0 start
>>>>>>>     ;;
>>>>>>>
>>>>>>>   *)
>>>>>>>     echo "Usage: $0 {status|start|stop|restart}"
>>>>>>>     exit 1
>>>>>>> esac
>>>>>>>
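
Two details in the script above are worth a closer look. The stop branch runs `cat $PIDFILE` before testing that the file exists, which is where a "cat: ... No such file or directory" message comes from when the service is already stopped. The status branch greps `ps` output for the PID, which also matches any line that merely contains those digits. A minimal sketch of stricter handling follows. The function names are illustrative, and the PIDFILE default is taken from the path in the Capistrano log, since the full script is not shown here.

```shell
#!/bin/sh
# Sketch of stricter pidfile handling. Function names are illustrative.
PIDFILE=${PIDFILE:-/var/run/xyz.pid}

# True only if the pidfile exists, holds a PID, and that exact PID is alive.
# Note: kill -0 also fails if the caller lacks permission to signal the
# process, so this assumes the script runs as root or as the daemon's user.
pid_is_running() {
    [ -f "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

service_status() {
    if [ ! -f "$PIDFILE" ]; then
        echo "Service not running"
    elif pid_is_running; then
        echo "Running"
    else
        echo "Process dead but pidfile exists"
    fi
}

# Read the pidfile only after confirming it exists, so stopping an
# already-stopped service does not print a `cat` error.
service_stop() {
    if [ -f "$PIDFILE" ]; then
        kill -HUP "$(cat "$PIDFILE")" 2>/dev/null
        rm -f "$PIDFILE"
        echo "Ok"
    else
        echo "pidfile not found"
    fi
}
```
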
>>>>>>>
>>>>>>>
>>>>>>> Capistrano log
>>>>>>>
>>>>>>> * executing `deploy:restart_app'
>>>>>>> * executing multiple commands in parallel
>>>>>>> -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>> -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>> -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>> -> "else" :: "sudo /etc/init.d/xyz restart"
>>>>>>> servers: ["server1", "server2", "server3", "server4"]
>>>>>>> [server1] executing command
>>>>>>> [server2] executing command
>>>>>>> [server3] executing command
>>>>>>> [server4] executing command
>>>>>>> ** [out :: server1] Stopping
>>>>>>> ** [out :: server1] cat: /var/run/xyz.pid: No such file or directory
>>>>>>> ** [out :: server1] pidfile not found
>>>>>>> ** [out :: server1] Starting xyz...
>>>>>>> ** [out :: server2] Stopping
>>>>>>> ** [out :: server2] cat: /var/run/xyz.pid: No such file or directory
>>>>>>> ** [out :: server2] pidfile not found
>>>>>>> ** [out :: server2] Starting xyz...
>>>>>>> ** [out :: server2] Ok
>>>>>>> ** [out :: server1] Ok
>>>>>>> ** [out :: server3] Stopping
>>>>>>> ** [out :: server3] cat: /var/run/xyz.pid: No such file or directory
>>>>>>> ** [out :: server3] pidfile not found
>>>>>>> ** [out :: server4] Stopping
>>>>>>> ** [out :: server4] Ok
>>>>>>> ** [out :: server3] Starting xyz...
>>>>>>> ** [out :: server3] Ok
>>>>>>> ** [out :: server4] Starting xyz...
>>>>>>> ** [out :: server4] Ok
>>>>>>> command finished in 659ms
>>>>>>> Finished: SUCCESS
>>>>>>>
>>>>>>> I can cat the file as the deploy user just fine.
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Capistrano" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web, visit
>>>>>>> https://groups.google.com/d/msgid/capistrano/56a2a2dd-fd26-4b14-a2da-0d7af37f8354%40googlegroups.com
>>>>>>>
>>>>>>> <https://groups.google.com/d/msgid/capistrano/56a2a2dd-fd26-4b14-a2da-0d7af37f8354%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>
>>
>>