Re: [openstack-dev] [QA][Neutron] About paramiko's "SSHException: Error reading SSH protocol banner"

2013-12-27 Thread Nachi Ueno
Hi Salvatore

Is the metadata server working well? (Or it may be a timing issue.)
I saw a similar issue when a VM failed to get the certificate from the
metadata server.

Best
Nachi

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA][Neutron] About paramiko's "SSHException: Error reading SSH protocol banner"

2013-12-27 Thread Salvatore Orlando
Yair,

The 'isolated' mode makes the tempest session more realistic by simulating
a cloud with multiple networks and ports. For Neutron tests this means
creating a new set of network resources for each test class. If creating
distinct network resources for each tenant is the root cause of this
problem, there is probably some underlying issue in neutron that needs to
be addressed.

I think there is a requirement to be able to run the test suite with 'full
tenant isolation' and 'parallel' execution.

On the other hand, Toshihiro noted that the SSH connection is established.
Before the "protocol banner" errors start, the log even shows an
authentication attempt, which fails. I don't know what to think here. I
only have two hints:
1 - The SSH connection attempt starts before the VM and the floating IPs
are wired by the agent. Paramiko has to wait more than a minute for a
response to come from the server. Could this cause the library to fail? Can
we ping first and SSH only once the ping succeeds, as
test_network_basic_ops does?
2 - Is there a chance that key management is flaky when running parallel
tests? I don't think so, because otherwise we would see failures with
nova-network too, and that does not happen.
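
As a rough illustration of hint 1, here is a minimal sketch of "wait until the VM is reachable, then SSH". It is not what test_network_basic_ops does: it swaps the ICMP ping for a plain TCP probe of port 22, since that needs no privileges. The helper names (port_is_open, wait_until, connect_when_reachable) and the ssh_connect callable are made up for this example.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def wait_until(check, deadline_s=120.0, interval_s=2.0):
    """Poll check() until it returns True or deadline_s seconds have elapsed."""
    end = time.monotonic() + deadline_s
    while True:
        if check():
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)


def connect_when_reachable(host, ssh_connect, port=22,
                           deadline_s=120.0, interval_s=2.0):
    """Call ssh_connect(host) only after the port accepts TCP connections.

    ssh_connect is a caller-supplied callable (hypothetical), for example a
    thin wrapper around paramiko.SSHClient.connect.
    """
    reachable = wait_until(lambda: port_is_open(host, port),
                           deadline_s, interval_s)
    if not reachable:
        raise TimeoutError("%s:%d not reachable after %.0fs"
                           % (host, port, deadline_s))
    return ssh_connect(host)
```

Note that a successful TCP probe only shows that the floating IP is wired. It does not prove that sshd in the guest is ready to answer, which seems to be the situation in [2].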

Salvatore




Re: [openstack-dev] [QA][Neutron] About paramiko's "SSHException: Error reading SSH protocol banner"

2013-12-26 Thread IWAMOTO Toshihiro
At Fri, 27 Dec 2013 01:53:59 +0100,
Salvatore Orlando wrote:
> 
> I put together all the patches we prepared for making parallel testing
> work, and ran 'check experimental' on the gate a few times to see
> whether it worked or not.
> 
> With parallel testing, the only really troubling issue is the scenario
> tests that need to access a VM through a floating IP, and the new patches
> we've squashed together in [1] should address this issue. However, the
> result is that timeouts are still observed, but with a different message [2].
> I'm not really familiar with it, and I've never observed it in local
> testing. I wonder if it is just the usual problem of the target host not
> being reachable, or if it is something specific to paramiko.
> 
> Any hint would be appreciated, since from the logs it appears everything is
> wired properly.

It seems that a TCP connection has been established but paramiko
failed to get data from the server in time.  Does increasing the paramiko
timeout help?
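
To make the distinction concrete, here is a minimal sketch that separates "TCP connect succeeded" from "the server sent its SSH identification line in time", followed by one way to raise the timeouts in paramiko. The function names and default values are made up for illustration. The banner_timeout argument to SSHClient.connect() exists in current paramiko releases; older releases may only expose it as an attribute on the Transport object, so check the version in use.

```python
import socket
import time


def read_ssh_banner(host, port=22, connect_timeout=10.0, banner_timeout=60.0):
    """Connect over TCP, then wait up to banner_timeout for the 'SSH-' line.

    Raises OSError if the TCP connection fails and socket.timeout if the
    connection is up but no banner arrives in time.
    """
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    try:
        deadline = time.monotonic() + banner_timeout
        buf = b""
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise socket.timeout("no SSH banner within %.1fs" % banner_timeout)
            sock.settimeout(remaining)
            chunk = sock.recv(256)
            if not chunk:
                raise ConnectionError("connection closed before any banner")
            buf += chunk
            # A server may send other lines before its identification string.
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                if line.startswith(b"SSH-"):
                    return line.rstrip(b"\r").decode("ascii", "replace")
    finally:
        sock.close()


def connect_with_longer_timeouts(host, username, key_filename,
                                 banner_timeout=60.0):
    """Open a paramiko session with an explicit, longer banner timeout."""
    import paramiko  # third-party; imported here so the probe works without it

    client = paramiko.SSHClient()
    # Accepting unknown host keys is only reasonable for throwaway test VMs.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, key_filename=key_filename,
                   timeout=10.0, banner_timeout=banner_timeout)
    return client
```

If read_ssh_banner() times out while the TCP connect succeeds, the problem is on the path to or inside the guest rather than in paramiko itself.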

> [1] https://review.openstack.org/#/c/57420/
> [2]
> http://logs.openstack.org/20/57420/40/experimental/check-tempest-dsvm-neutron-isolated-parallel/a74bdc8/console.html#_2013-12-26_22_51_31_817


Re: [openstack-dev] [QA][Neutron] About paramiko's "SSHException: Error reading SSH protocol banner"

2013-12-26 Thread Yair Fried
This might be completely off, but with "isolated creds" a private network is
created for each tenant, while the test already creates its own private tenant
network. That changes the behavior from what was intended, and from how it
works in "simple" mode. Could this be related?
I have a patch addressing this: https://review.openstack.org/#/c/63886/
You could try it and see whether it makes any difference.

Yair 



[openstack-dev] [QA][Neutron] About paramiko's "SSHException: Error reading SSH protocol banner"

2013-12-26 Thread Salvatore Orlando
I put together all the patches we prepared for making parallel testing
work, and ran 'check experimental' on the gate a few times to see
whether it worked or not.

With parallel testing, the only really troubling issue is the scenario
tests that need to access a VM through a floating IP, and the new patches
we've squashed together in [1] should address this issue. However, the
result is that timeouts are still observed, but with a different message [2].
I'm not really familiar with it, and I've never observed it in local
testing. I wonder if it is just the usual problem of the target host not
being reachable, or if it is something specific to paramiko.

Any hint would be appreciated, since from the logs it appears everything is
wired properly.

Salvatore

[1] https://review.openstack.org/#/c/57420/
[2]
http://logs.openstack.org/20/57420/40/experimental/check-tempest-dsvm-neutron-isolated-parallel/a74bdc8/console.html#_2013-12-26_22_51_31_817