Here is what the curl request looks like:
curl -i -f -u "$DEPLOY_USER:$DEPLOY_PASS" -X POST https://deploy.company.com \
  -d '{"app_name":"api","app_env":"'"$ENV"'"}'
curl: (22) The requested URL returned error: 504
and the webhook logs:
[DEPRECATION WARNING]: Instead of sudo/sudo_user, use become/become_user and
make sure become_method is 'sudo' (default). This feature will be removed in a
future release. Deprecation warnings can be disabled by setting
deprecation_warnings=False in ansible.cfg.
PLAY [Configure instance(s)] ***********************************************
TASK [setup] ***************************************************************
ok: [52.4.115.46]
TASK [api : create api non-prod test container] ************************
changed: [52.4.115.46]
TASK [api : pause] *********************************************************
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [52.4.115.46]
TASK [api : run test] ******************************************************
ok: [52.4.115.46]
TASK [api : create api non-prod containers] ****************************
changed: [52.4.115.46]
RUNNING HANDLER [api : stop test container] ********************************
changed: [52.4.115.46]
RUNNING HANDLER [api : remove test container] ******************************
changed: [52.4.115.46]
RUNNING HANDLER [api : DEL redis keys] *************************************
changed: [52.4.115.46]
PLAY RECAP *****************************************************************
52.4.115.46 : ok=26 changed=5 unreachable=0 failed=0
INFO:waitress:2016-05-01 19:44:48.458123
I'm running Ansible 2.0.0.2.
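
If the 504 comes from a proxy or load balancer in front of the app giving up before the playbook finishes (an assumption; nothing in the logs above confirms it), one option is to have the webhook return right away and let the caller poll for the result. A minimal sketch of that pattern in plain Python 3 follows. The names `start_job` and `job_status` are hypothetical, and the job table is kept in memory only, so it would not survive an app restart:

```python
import threading
import uuid

# In-memory job table; a hypothetical stand-in for whatever store the app uses.
_jobs = {}
_jobs_lock = threading.Lock()


def start_job(func, *args):
    """Run func(*args) in a background thread and return (job_id, thread) at once."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"state": "running", "result": None}

    def worker():
        # Record the outcome so a later status request can report it.
        try:
            result = func(*args)
            state = "succeeded"
        except Exception as exc:
            result = repr(exc)
            state = "failed"
        with _jobs_lock:
            _jobs[job_id] = {"state": state, "result": result}

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return job_id, thread


def job_status(job_id):
    """Return a copy of the recorded state for job_id."""
    with _jobs_lock:
        return dict(_jobs[job_id])
```

With something like this, the POST handler could respond with 202 and the job id, and the CI step could poll a status endpoint until the state is no longer "running".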
On Sunday, May 1, 2016 at 2:18:55 PM UTC-5, Marcus Morris wrote:
>
> Hi.
>
> I have a small webhook app that is kicked off by curl and runs
> ansible-playbook. I have noticed some weirdness where my webhook would show
> the Ansible run as completed successfully, but curl would return 500 or
> 504. I have tried my best to debug, but the furthest I could get is
> isolating the check_call running ansible-playbook as the problem.
>
> My coworker noticed defunct ssh processes on the same machine and they
> seem to coincide with ansible-playbook runs. They only go away after
> restarting the webhook app. Since I can't seem to figure out why curl is
> failing with 500/504, I thought I'd try to solve the defunct ssh
> problem in the hopes it is related.
>
> Here is the ansible-playbook call:
>
> json_data = request.get_json(force=True)
>
> try:
>     app_name = json_data['app_name']
>     app_env = json_data['app_env']
> except KeyError:
>     return 'Please specify app_name and app_env', 400
>
> play = '%s_%s' % (app_name, app_env)
> inventory = 'inventory/%s' % play
> tag = 'deploy'
>
> try:
>     check_call(["ansible-playbook", "-i", inventory,
>                 "infra_{}.yml".format(play), "--tags", tag],
>                cwd=workspace)
> except Exception as e:
>     logger.exception(e)
>     logger.info(datetime.now())
>     return 'Failure. See logs for error.', 500
> else:
>     logger.info(datetime.now())
>     return 'Success!', 200
>
> It seems that some playbooks result in defunct ssh processes and some
> don't. I can't seem to figure out a difference between the playbooks that
> involve ssh as they are all just running docker containers. This is what I
> find immediately after a run that succeeds, but curl fails with 500/504:
>
> $ ps -ef | grep ssh
>
> root 13925 1 0 Mar07 ? 00:00:14 /usr/sbin/sshd -D
> root 17119 13925 0 18:58 ? 00:00:00 sshd: mmorris [priv]
> mmorris 17163 17119 0 18:58 ? 00:00:00 sshd: mmorris@pts/1
> root 17345 17243 0 19:07 ? 00:00:00 [ssh] <defunct>
> root 17346 17243 0 19:07 ? 00:00:00 ssh: /root/.ansible/cp/ansible-ssh-52.4.115.46-22-root [mux]
> root 17478 13925 0 19:09 ? 00:00:00 sshd: mmorris [priv]
> mmorris 17521 17478 0 19:09 ? 00:00:00 sshd: mmorris@pts/3
>
> And then, after less than 30 seconds, the Ansible-related process also
> turns defunct:
>
> $ ps -ef | grep ssh
> root 13925 1 0 Mar07 ? 00:00:14 /usr/sbin/sshd -D
> root 17119 13925 0 18:58 ? 00:00:00 sshd: mmorris [priv]
> mmorris 17163 17119 0 18:58 ? 00:00:00 sshd: mmorris@pts/1
> root 17345 17243 0 19:07 ? 00:00:00 [ssh] <defunct>
> root 17346 17243 0 19:07 ? 00:00:00 [ssh] <defunct>
> root 17478 13925 0 19:09 ? 00:00:00 sshd: mmorris [priv]
> mmorris 17521 17478 0 19:09 ? 00:00:00 sshd: mmorris@pts/3
>
> This has been causing me headaches for a while now as I have CI/CD runs
> failing even though the deploy itself with Ansible is successful. Any
> information or advice for figuring this out would be VERY much appreciated!
>
> Maybe there is something I can do instead of just check_call so that
> whatever is going on with the ssh processes won't affect the exit code
> passed to the app?
>
>
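
On the question of what could replace a bare check_call: one thing that might be worth trying is a wrapper that waits only for the child's exit status and sends its output to a file instead of inheriting the app's own stdout/stderr. This is a sketch, assuming Python 3.5+ for `subprocess.run`. The name `run_and_log` and the default timeout are hypothetical, and it is not claimed to remove the defunct ssh processes, only to keep them from influencing what the handler returns:

```python
import subprocess
import sys
import tempfile


def run_and_log(cmd, cwd=None, timeout=900):
    """Run cmd and return (returncode, output); returncode is None on timeout.

    Output goes to a temporary file rather than a pipe, so a lingering
    background process that inherited the descriptors cannot keep the
    caller waiting for end-of-file.
    """
    with tempfile.TemporaryFile() as log:
        try:
            proc = subprocess.run(
                cmd,
                cwd=cwd,
                stdin=subprocess.DEVNULL,   # the child never reads from the app's stdin
                stdout=log,
                stderr=subprocess.STDOUT,   # merge stderr into the same log
                timeout=timeout,
            )
            returncode = proc.returncode
        except subprocess.TimeoutExpired:
            returncode = None  # subprocess.run has already killed the child
        log.seek(0)
        return returncode, log.read().decode(errors="replace")


if __name__ == "__main__":
    # Stand-in command; the webhook would pass the ansible-playbook argument list.
    code, output = run_and_log([sys.executable, "-c", "print('done')"])
    print(code, output.strip())
```

In the handler, a return code of 0 would map to the 200 response and anything else (including None for a timeout) to the 500 response, with the captured output written to the log.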