The short version:

I'm running tests against Fedora 22 Alpha Atomic Host and Server, and am
seeing various connection failures depending on the ssh_connection settings.
They all look like something reported against openssh-5.3 with the
ControlPersist backport. I played around with various values and found a
workaround: remove the ControlPersist value. Setting

[ssh_connection]
ssh_args = -o ControlMaster=auto


in ~/.ansible.cfg allows everything to work.  The question is why?
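
For anyone who wants to confirm which ssh options their own config actually
passes, here is a minimal Python sketch. It assumes the file is plain INI as
in the snippet above; the function name ssh_options is made up for
illustration and is not part of Ansible.

```python
import configparser
import shlex


def ssh_options(cfg_text):
    """Return the -o options from [ssh_connection] ssh_args as a dict."""
    # interpolation=None so a ControlPath containing %h/%p/%r is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    args = parser.get("ssh_connection", "ssh_args", fallback="")
    tokens = shlex.split(args)
    opts = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-o" and i + 1 < len(tokens):
            # Form: -o Key=Value
            key, _, value = tokens[i + 1].partition("=")
            opts[key] = value
            i += 2
        elif tok.startswith("-o") and len(tok) > 2:
            # Form: -oKey=Value
            key, _, value = tok[2:].partition("=")
            opts[key] = value
            i += 1
        else:
            i += 1
    return opts


# The workaround config from this post.
WORKAROUND = """
[ssh_connection]
ssh_args = -o ControlMaster=auto
"""

opts = ssh_options(WORKAROUND)
print("ControlPersist set:", "ControlPersist" in opts)
```

Commented-out lines such as "#ssh_args =" are skipped by configparser, so
only the active ssh_args value is inspected.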

The long version:

While looking up the connection issues (SFTP failures, plain connection
resets, etc.) I came across this post from last year
https://groups.google.com/forum/#!msg/ansible-project/QUdxNK1zEH0/rQKnO827FUgJ
which had similar-looking failures and led to this RHT Bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1160487.

I went through the various options mentioned in the thread in a local
ansible.cfg: disabling pipelining, changing to scp, and clearing ssh_args
entirely. The last option worked, so I dug further.
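
To make that kind of option sweep repeatable, the variants can be generated
rather than hand-edited. This is only a sketch: the variant names and the
render helper are hypothetical, and it assumes the [ssh_connection] keys
pipelining, scp_if_ssh and ssh_args behave as documented for this Ansible
release.

```python
import configparser
import io

# Variant name -> overrides for the [ssh_connection] section.
# The names on the left are illustrative labels only.
VARIANTS = {
    "defaults": {},
    "no_pipelining": {"pipelining": "False"},
    "scp": {"scp_if_ssh": "True"},
    "empty_ssh_args": {"ssh_args": ""},
}


def render(overrides):
    """Render an ansible.cfg body with the given [ssh_connection] overrides."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["defaults"] = {"host_key_checking": "False"}
    parser["ssh_connection"] = dict(overrides)
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


for name, overrides in VARIANTS.items():
    print("# variant:", name)
    print(render(overrides))
```

Each rendered body can be written to a local ansible.cfg before a run, so
every variant is tested against the same hosts with only one setting changed.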

The ansible version is:

[fedora@atomic-master ~]$ rpm -q ansible
ansible-1.8.4-1.fc22.noarch

OpenSSH versions on each system:

192.168.122.10 | success | rc=0 >>
openssh-6.7p1-11.fc22.x86_64

192.168.122.11 | success | rc=0 >>
openssh-6.7p1-10.fc22.x86_64

No user-based ssh settings, all values default from /etc/ansible
(192.168.122.11 succeeds, 192.168.122.10 fails):

<192.168.122.10> PubkeyAuthentication=no ConnectTimeout=10 
GSSAPIAuthentication=no 
ControlPath=/home/fedora/.ansible/cp/ansible-ssh-%h-%p-%r 
StrictHostKeyChecking=no ControlMaster=auto ControlPersist=60s
<192.168.122.10> 
fatal: [192.168.122.10] => failed to transfer file to 
/home/fedora/.ansible/tmp/ansible-tmp-1427209312.33-30225738744475/setup:

Couldn't read packet: Connection reset by peer

<192.168.122.11> PubkeyAuthentication=no ConnectTimeout=10 
GSSAPIAuthentication=no 
ControlPath=/home/fedora/.ansible/cp/ansible-ssh-%h-%p-%r 
StrictHostKeyChecking=no ControlMaster=auto ControlPersist=60s

ok: [192.168.122.11]


User-based ~/.ansible.cfg, pipelining enabled (192.168.122.11 succeeds,
192.168.122.10 fails):

[defaults]
host_key_checking = False
[ssh_connection]
#ssh_args = -o ControlMaster=auto
#ssh_args = 
pipelining = True

<192.168.122.10> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=irpxmyfkxjqtjkyqbfmyvogzyeygovsm] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-irpxmyfkxjqtjkyqbfmyvogzyeygovsm; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python'"'"'' 
ConnectTimeout=10 GSSAPIAuthentication=no 
ControlPath=/home/fedora/.ansible/cp/ansible-ssh-%h-%p-%r 
StrictHostKeyChecking=no ControlMaster=auto ControlPersist=60s

fatal: [192.168.122.10] => ssh connection error waiting for sudo or su 
password prompt

<192.168.122.11> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=hdearqulmxcbkpjxjlgwpyiiapebebju] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-hdearqulmxcbkpjxjlgwpyiiapebebju; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python'"'"'' 
ConnectTimeout=10 GSSAPIAuthentication=no 
ControlPath=/home/fedora/.ansible/cp/ansible-ssh-%h-%p-%r 
StrictHostKeyChecking=no ControlMaster=auto ControlPersist=60s

ok: [192.168.122.11]



User-based ~/.ansible.cfg, pipelining with ControlPersist removed
(192.168.122.11 succeeds, 192.168.122.10 fails):

[defaults]
host_key_checking = False
[ssh_connection]
ssh_args = -o ControlMaster=auto
#ssh_args = 
pipelining = True

<192.168.122.10> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=gbcczsohsqeiuyzxezohqbikibenkeun] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-gbcczsohsqeiuyzxezohqbikibenkeun; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python'"'"'' 
ConnectTimeout=10 GSSAPIAuthentication=no StrictHostKeyChecking=no 
ControlMaster=auto

fatal: [192.168.122.10] => ssh connection error waiting for sudo or su 
password prompt

<192.168.122.11> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=xjcccpqinmchipiubhsmjwdjvgmytbww] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-xjcccpqinmchipiubhsmjwdjvgmytbww; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python'"'"'' 
ConnectTimeout=10 GSSAPIAuthentication=no StrictHostKeyChecking=no 
ControlMaster=auto

ok: [192.168.122.11]


User-based ~/.ansible.cfg, ControlPersist and pipelining both removed
(192.168.122.11 succeeds, 192.168.122.10 succeeds):

[defaults]
host_key_checking = False
[ssh_connection]
ssh_args = -o ControlMaster=auto
#ssh_args = 
#pipelining = True

<192.168.122.10> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=dyqtvojqcsjschscszupwxnithfwjhqs] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-dyqtvojqcsjschscszupwxnithfwjhqs; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python 
/home/fedora/.ansible/tmp/ansible-tmp-1427210773.45-273681836317736/setup; 
rm -rf /home/fedora/.ansible/tmp/ansible-tmp-1427210773.45-273681836317736/ 
>/dev/null 2>&1'"'"'' ConnectTimeout=10 GSSAPIAuthentication=no 
StrictHostKeyChecking=no ControlMaster=auto

ok: [192.168.122.10]


<192.168.122.11> PubkeyAuthentication=no 'sudo -k && sudo -H -S -p "[sudo 
via ansible, key=gixgtyagbeoctcldxprcqrjwuxdhscvr] password: " -u root 
/bin/sh -c '"'"'echo SUDO-SUCCESS-gixgtyagbeoctcldxprcqrjwuxdhscvr; 
LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python 
/home/fedora/.ansible/tmp/ansible-tmp-1427210773.47-126330968681284/setup; 
rm -rf /home/fedora/.ansible/tmp/ansible-tmp-1427210773.47-126330968681284/ 
>/dev/null 2>&1'"'"'' ConnectTimeout=10 GSSAPIAuthentication=no 
StrictHostKeyChecking=no ControlMaster=auto

ok: [192.168.122.11]
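
Putting the four runs above side by side makes the pattern easier to see. The
Python sketch below only encodes the outcomes shown in the verbose output;
the RESULTS layout and the passing_configs name are illustrative and not
anything from Ansible.

```python
# Key: (ControlPersist set, pipelining on) -> {host: run succeeded}
RESULTS = {
    (True, False): {"192.168.122.10": False, "192.168.122.11": True},
    (True, True): {"192.168.122.10": False, "192.168.122.11": True},
    (False, True): {"192.168.122.10": False, "192.168.122.11": True},
    (False, False): {"192.168.122.10": True, "192.168.122.11": True},
}


def passing_configs(host):
    """Return the (persist, pipelining) combinations that succeeded for host."""
    return sorted(cfg for cfg, outcome in RESULTS.items() if outcome[host])


for host in ("192.168.122.10", "192.168.122.11"):
    for persist, pipelining in passing_configs(host):
        persist_label = "set" if persist else "unset"
        pipe_label = "on" if pipelining else "off"
        print(host, "ok with ControlPersist", persist_label, "and pipelining", pipe_label)
```

On this data 192.168.122.10 only passes with ControlPersist removed and
pipelining off, while 192.168.122.11 passes in every combination.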



Manually testing the ControlPersist values on the command line works as
expected for both hosts; the master times out after 60s:

[fedora@atomic-master ansible-atomic]$ ssh -F /dev/null -o 
ControlMaster=auto -o ControlPersist=60s -S Test_Master_Socket 
[email protected] echo Hello
[email protected]'s password: 
Hello

[fedora@atomic-master ansible-atomic]$ ps -fu `whoami` | grep 
"[s]sh.*Test_Master_Socket"
fedora    1015     1  0 11:29 ?        00:00:00 ssh: Test_Master_Socket 
[mux]

[fedora@atomic-master ansible-atomic]$ ssh -F /dev/null -S 
Test_Master_Socket -O check 192.168.122.11
Master running (pid=1015)
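
The same check can be scripted if it needs to be repeated while the master
ages out. A minimal Python sketch, assuming that "ssh -O check" exits 0 when
the master is alive and non-zero otherwise; the helper names mux_check_argv
and master_running are hypothetical.

```python
import subprocess


def mux_check_argv(socket_path, host):
    """Build the argv for `ssh -O check` against a control socket."""
    return ["ssh", "-F", "/dev/null", "-S", socket_path, "-O", "check", host]


def master_running(socket_path, host, timeout=10):
    """Return True if ssh reports a live master for the control socket."""
    try:
        proc = subprocess.run(
            mux_check_argv(socket_path, host),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # No ssh binary on PATH, or the check hung: treat as not running.
        return False
    return proc.returncode == 0
```

Calling master_running("Test_Master_Socket", "192.168.122.11") right after
the first command and again after more than 60 seconds should show the master
going away, matching the manual test above.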


I looked at the Koji page for openssh and don't see anything specific to
ControlPersist in the changelog, but I'm not an OpenSSH ControlPersist
guru.  http://koji.fedoraproject.org/koji/buildinfo?buildID=619696


Bottom line, I'm out of troubleshooting steps, not sure what the impact of 
the workaround is, and I think someone who has more depth should take a 
look.  I'm cc'ing ansible-devel because I wasn't sure what the right forum 
for this sort of issue was.  Hopefully this was clear!

Cheers,
-Matt M

