Re: [openstack-dev] Does the openstack ci vms start each time clear up enough?

2018-05-06 Thread Vega Cai
To test whether it is our new patch that causes the problem, I submitted a
dummy patch [1] to trigger the CI, and it failed again. Checking the
nova-scheduler log, the strange thing is that scheduling starts with 0 hosts
from the very beginning.

May 06 09:40:34.358585 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_service.periodic_task [None req-008ee30a-47a1-40a2-bf64-cb0f1719806e None None] Running periodic task SchedulerManager._run_periodic_tasks {{(pid=23795) run_periodic_tasks /usr/local/lib/python2.7/dist-packages/oslo_service/periodic_task.py:215}}

May 06 09:41:23.968029 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG nova.scheduler.manager [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] *Starting to schedule for instances: [u'8b227e85-8959-4e07-be3d-1bc094c115c1']* {{(pid=23795) select_destinations /opt/stack/new/nova/nova/scheduler/manager.py:118}}

May 06 09:41:23.969293 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_concurrency.lockutils [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Lock "placement_client" acquired by "nova.scheduler.client.report._create_client" :: waited 0.000s {{(pid=23795) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:273}}

May 06 09:41:23.975304 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_concurrency.lockutils [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Lock "placement_client" released by "nova.scheduler.client.report._create_client" :: held 0.006s {{(pid=23795) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}

May 06 09:41:24.276470 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_concurrency.lockutils [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Lock "6e118c71-9008-4694-8aee-faa607944c5f" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s {{(pid=23795) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:273}}

May 06 09:41:24.279331 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_concurrency.lockutils [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Lock "6e118c71-9008-4694-8aee-faa607944c5f" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.003s {{(pid=23795) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}

May 06 09:41:24.302854 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG oslo_db.sqlalchemy.engines [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION {{(pid=23795) _check_effective_sql_mode /usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/engines.py:308}}

May 06 09:41:24.321713 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG nova.filters [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] *Starting with 0 host(s)* {{(pid=23795) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70}}

May 06 09:41:24.322136 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: INFO nova.filters [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Filter RetryFilter returned 0 hosts

May 06 09:41:24.322614 ubuntu-xenial-inap-mtl01-0003885152 nova-scheduler[21962]: DEBUG nova.filters [None req-c67986fa-2e3b-45b7-96dd-196704945b95 admin admin] Filtering removed all hosts for the request with instance ID '8b227e85-8959-4e07-be3d-1bc094c115c1'.
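When digging through many such runs, the filter chain's behavior can be
summarized mechanically. A rough sketch (only the "Starting with N host(s)"
and "Filter X returned N hosts" line formats from the excerpt above are
assumed; the helper itself is hypothetical):

```python
import re

# Patterns matching the nova-scheduler DEBUG/INFO lines seen in the CI log.
START_RE = re.compile(r"Starting with (\d+) host\(s\)")
FILTER_RE = re.compile(r"Filter (\w+) returned (\d+) hosts")

def summarize_filtering(log_lines):
    """Return (initial_host_count, [(filter_name, hosts_left), ...])."""
    initial = None
    steps = []
    for line in log_lines:
        m = START_RE.search(line)
        if m:
            initial = int(m.group(1))
            continue
        m = FILTER_RE.search(line)
        if m:
            steps.append((m.group(1), int(m.group(2))))
    return initial, steps

sample = [
    "DEBUG nova.filters [...] Starting with 0 host(s)",
    "INFO nova.filters [...] Filter RetryFilter returned 0 hosts",
]
initial, steps = summarize_filtering(sample)
print(initial, steps)  # 0 [('RetryFilter', 0)]
```

An initial count of 0 means the filters never had anything to remove: the
scheduler's host list was already empty before RetryFilter ran, which points
at the compute-node/cell records rather than at any individual filter.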

Re: [openstack-dev] Does the openstack ci vms start each time clear up enough?

2018-05-04 Thread Clark Boylan


On Fri, May 4, 2018, at 2:37 AM, Sean McGinnis wrote:
> On Fri, May 04, 2018 at 04:13:41PM +0800, linghucongsong wrote:
> > 
> > Hi all!
> > 
> > Recently we met a strange problem in our CI. See this link:
> > https://review.openstack.org/#/c/532097/
> > 
> > We can pass the CI the first time, but when the gate job starts, it
> > always fails the second time.
> > 
> > We have rebased several times; it always passes the CI the first
> > time and fails the second time.
> > 
> > This has not happened before, which makes me wonder: does the CI
> > really start from fresh VMs each time?
> 
> A new VM is spun up for each test run, so I don't believe this is an issue 
> with
> stale artifacts on the host. I would guess this is more likely some sort of
> race condition, and you just happen to be hitting it 50% of the time.

Additionally, you can check the job logs to see that, while these two jobs
ran against the same cloud provider, they did so in different regions, on
hosts with completely different IP addresses. The inventory files [0][1] are
where I would start if you suspect oddness of this sort. Reading them, I
don't see anything that indicates the nodes were reused.
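For what it's worth, a quick way to confirm that two runs used distinct
nodes is to compare the ansible_host entries in the two inventory files. A
minimal sketch (the inventory snippets below are hypothetical stand-ins for
the linked files, and the line-based parsing is a shortcut that avoids a
YAML library):

```python
def extract_hosts(inventory_text):
    """Collect every ansible_host value from an inventory.yaml dump."""
    hosts = set()
    for line in inventory_text.splitlines():
        line = line.strip()
        if line.startswith("ansible_host:"):
            hosts.add(line.split(":", 1)[1].strip())
    return hosts

# Hypothetical excerpts standing in for [0] (check) and [1] (gate).
check_inventory = """
all:
  hosts:
    primary:
      ansible_host: 198.51.100.67
"""
gate_inventory = """
all:
  hosts:
    primary:
      ansible_host: 203.0.113.5
"""

# Disjoint host sets mean the gate job did not reuse the check job's node.
assert extract_hosts(check_inventory).isdisjoint(extract_hosts(gate_inventory))
```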

[0] 
http://logs.openstack.org/97/532097/16/check/legacy-tricircle-dsvm-multiregion/c9b3d29/zuul-info/inventory.yaml
[1] 
http://logs.openstack.org/97/532097/16/gate/legacy-tricircle-dsvm-multiregion/ad547d5/zuul-info/inventory.yaml

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Does the openstack ci vms start each time clear up enough?

2018-05-04 Thread Sean McGinnis
On Fri, May 04, 2018 at 04:13:41PM +0800, linghucongsong wrote:
> 
> Hi all!
> 
> Recently we met a strange problem in our CI. See this link:
> https://review.openstack.org/#/c/532097/
> 
> We can pass the CI the first time, but when the gate job starts, it
> always fails the second time.
> 
> We have rebased several times; it always passes the CI the first time
> and fails the second time.
> 
> This has not happened before, which makes me wonder: does the CI
> really start from fresh VMs each time?

A new VM is spun up for each test run, so I don't believe this is an issue with
stale artifacts on the host. I would guess this is more likely some sort of
race condition, and you just happen to be hitting it 50% of the time.

You can probably keep rechecking the patch until you get lucky, but it would
be dangerous to do that without knowing the source of the failure; you will
most likely just keep running into it in future patches.

The failure looks very odd: the test expects a status to be returned, but
instead gets a ServerDetail object:

http://logs.openstack.org/97/532097/16/gate/legacy-tricircle-dsvm-multiregion/ad547d5/job-output.txt.gz#_2018-05-04_03_57_05_399493
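That symptom, a full server object arriving where a bare status string was
expected, is the kind of mismatch a small normalization shim makes visible.
A hypothetical sketch of the pattern (ServerDetail and normalize_status here
are illustrative names, not the actual tricircle test code):

```python
class ServerDetail:
    """Stand-in for a full server payload returned by an API client."""
    def __init__(self, status):
        self.status = status

def normalize_status(result):
    """Accept either a bare status string or a detail object.

    A test that compares `result == "ACTIVE"` directly starts failing
    the moment the client begins returning the full object instead of
    the string, which matches the gate failure described above.
    """
    if isinstance(result, str):
        return result
    return result.status

assert normalize_status("ACTIVE") == "ACTIVE"
assert normalize_status(ServerDetail("ACTIVE")) == "ACTIVE"
```

Of course, the real fix is to pin down which return shape the client is
supposed to produce and make the test and client agree, rather than paper
over the difference.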




[openstack-dev] Does the openstack ci vms start each time clear up enough?

2018-05-04 Thread linghucongsong

Hi all!

Recently we met a strange problem in our CI. See this link:
https://review.openstack.org/#/c/532097/

We can pass the CI the first time, but when the gate job starts, it always
fails the second time.

We have rebased several times; it always passes the CI the first time and
fails the second time.

This has not happened before, which makes me wonder: does the CI really
start from fresh VMs each time?