Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-21 Thread Jeremy Stanley
On 2015-08-21 01:10:22 +0000 (+0000), Steven Dake (stdake) wrote:
[...]
 How large is /opt?
[...]

It appears at the moment HP Cloud gives us a 30GiB root filesystem
(vda1) and a 0.5TiB ephemeral disk (vdb). Rackspace on the other
hand provides a 40GB root filesystem (xvda1) and 80GB ephemeral disk
(xvde). If your jobs are using devstack-gate, have a look at
fix_disk_layout() in functions.sh for details on how we repartition,
format and mount ephemeral disks. If your job is not based on
devstack-gate, then you should be able to implement some similar
routines to duplicate this.
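
For jobs not based on devstack-gate, a minimal sketch of such a routine
might look like this (simplified: it skips the repartitioning step, and
the device name is an assumption matching the HP Cloud layout above; see
fix_disk_layout() for the real logic):

    #!/bin/bash
    # Make the ephemeral disk's space available at /opt.
    # /dev/vdb is an assumption; Rackspace workers expose it as /dev/xvde.
    DEV=/dev/vdb
    if [ -b "$DEV" ] && ! mount | grep -q "^$DEV "; then
        sudo mkfs.ext4 -F "$DEV"    # format the whole disk as one filesystem
        sudo mount "$DEV" /opt      # mount it over /opt
    fi
    df -h /opt                      # show how much space /opt now has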
-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures

2015-08-20 Thread Steven Dake (stdake)


On 8/13/15, 4:58 PM, Clark Boylan cboy...@sapwetik.org wrote:

On Thu, Aug 13, 2015, at 03:13 AM, Tom Cammann wrote:
 Hi Team,
 
 Wanted to let you know why we are having consistent functional test
 failures in the gate.
 
 This is being caused by Nova returning "No valid host" to heat:
 
 2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE:
 Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack
 testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet
 [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most
 recent call last):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 625, in
 _action_recorder
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 self.action_handler_task(action, args=handler_args)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step =
 next(subtask)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 670, in
 action_handler_task
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not
 check(handler_data):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py,
 line 759, in check_create_complete
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return
 self.client_plugin()._check_active(server_id)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in
 _check_active
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code':
 fault.get('code', _('Unknown'))
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource
 ResourceInError: Went to status ERROR due to Message: No valid host was
 found. There are not enough hosts available., Code: 500
 
 And this in turn is being caused by the compute instance running out of
 disk space:
 
 2015-08-13 08:26:15.216 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1
 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 RetryFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 AvailabilityZoneFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter
 returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin]
 (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602)
 ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable
 disk, it only has 17408.0 MB usable disk. host_passes
 /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
 2015-08-13 08:26:15.218 INFO nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter DiskFilter
 returned 0 hosts
 
 For now a recheck seems to work about 1 in 2, so we can still land
 patches.
 
 The fix for this could be to clean up our Magnum devstack install more
 aggressively, which might be as simple as cleaning up the images we use,
 or get infra to provide our tests with a larger disk size. I will
 probably test out a patch today which cleans up the images we use in
 devstack to see if that helps.
 
It is not trivial to provide your tests with more disk as we are using
the flavors appropriate for our RAM and CPU needs and are constrained by
quotas in the clouds we use. Do you really need 20GB nested test
instances? The VMs these jobs run on have ~13GB images which is almost
half the size of the instances you are trying to boot there. I would
definitely look into trimming the disk requirements for the nested VMs
before anything else.

As for working ~50% of the time: hpcloud gives us more disk than
rackspace, which is likely why you see about half fail and half pass. The
runs that pass probably run on hpcloud VMs.

In the short term, is there a way to request HP VMs?

20GB won't do the job, unfortunately.

Regards,
-steve


Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 

Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-20 Thread Steven Dake (stdake)


On 8/13/15, 6:13 AM, Jeremy Stanley fu...@yuggoth.org wrote:

On 2015-08-13 19:38:07 +0800 (+0800), Kai Qiang Wu wrote:
 I did talk to infra; I think it is a resource issue, but
 they thought it is a Nova issue,
[...]

No, I said the error was being raised by Nova, so was not an error
coming _from_ the infrastructure we manage. If your jobs are more
resource-intensive than a typical devstack/tempest job, you'll want
to look at ways to scale them back.

 The flavor asks for 20GB of disk, so it failed for that.

Correct, we run jobs on resources donated by public service
providers. Some of them only provide a 20GB root disk. There's
generally an ephemeral disk mounted at /opt with additional space if
you can modify your job to leverage that for whatever is running out
of space.

How large is /opt?


 I think it is related to this: the Jenkins-allocated VM disk
 space is not large. I am curious why it failed so often recently.
 Did os-infra change something?

Nothing has been intentionally changed with our disk space on job
workers as far as I'm aware. Different workers have varying root
disk sizes depending on the provider where they were booted, but
they could be as small as 20GB so your job will need to take that
into account.

20GB isn't enough for Magnum's CI jobs.  We could link /var/lib/docker to
/opt if there is sufficient space there.
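
Something along these lines could do the relinking (a rough sketch,
assuming an Ubuntu trusty worker with enough free space under /opt):

    # Relocate Docker's storage onto the larger ephemeral disk at /opt.
    sudo service docker stop
    sudo mv /var/lib/docker /opt/docker
    sudo ln -s /opt/docker /var/lib/docker    # a bind mount would also work
    sudo service docker start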

Regards,
-steve

-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures

2015-08-20 Thread Steven Dake (stdake)
Kai,

This sounds like a good solution.  The actual VM doesn’t need to be super large 
given our present tests.

Regards
-steve


From: Kai Qiang Wu wk...@cn.ibm.com
Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: Friday, August 14, 2015 at 3:46 AM
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Magnum] Consistent functional test failures


I have checked with infra team members. For two instances, 10GB each should be 
OK.

So I added some steps to create a Magnum-specific flavor (8 GB disk) instead of
using the existing devstack flavors (m1.small needs 20GB; m1.tiny cannot be used).

Magnum creates the flavor for the Jenkins job and deletes it when the tests finish.


Thanks

Best Wishes,

Kai Qiang Wu (吴开强  Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193

Follow your heart. You are miracle!


From: Clark Boylan cboy...@sapwetik.org
To: openstack-dev@lists.openstack.org
Date: 08/14/2015 08:05 AM
Subject: Re: [openstack-dev] [Magnum] Consistent functional test failures





On Thu, Aug 13, 2015, at 03:13 AM, Tom Cammann wrote:
 Hi Team,

 Wanted to let you know why we are having consistent functional test
 failures in the gate.

 This is being caused by Nova returning "No valid host" to heat:

 2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE:
 Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack
 testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet
 [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most
 recent call last):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 625, in
 _action_recorder
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 self.action_handler_task(action, args=handler_args)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step =
 next(subtask)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 670, in
 action_handler_task
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not
 check(handler_data):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py,
 line 759, in check_create_complete
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return
 self.client_plugin()._check_active(server_id)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in
 _check_active
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code':
 fault.get('code', _('Unknown'))
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource
 ResourceInError: Went to status ERROR due to Message: No valid host was
 found. There are not enough hosts available., Code: 500

 And this in turn is being caused by the compute instance running out of
 disk space:

 2015-08-13 08:26:15.216 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1
 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 RetryFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 AvailabilityZoneFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter
 returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84

Re: [openstack-dev] [Magnum] Consistent functional test failures

2015-08-14 Thread Kai Qiang Wu
I have checked with infra team members. For two instances, 10GB each should
be OK.

So I added some steps to create a Magnum-specific flavor (8 GB disk) instead of
using the existing devstack flavors (m1.small needs 20GB; m1.tiny cannot be
used).

Magnum creates the flavor for the Jenkins job and deletes it when the tests finish.
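
For illustration, those steps might boil down to something like this with
the nova CLI (the flavor name, ID, RAM, and VCPU values here are
hypothetical; the 8 GB disk is the point):

    # nova flavor-create <name> <id> <ram-MB> <disk-GB> <vcpus>
    nova flavor-create m1.magnum 100 1024 8 1

    # ... run the functional tests with baymodels using m1.magnum ...

    # Delete the flavor once the Jenkins job finishes.
    nova flavor-delete m1.magnum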


Thanks

Best Wishes,

Kai Qiang Wu (吴开强  Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
 No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China
100193

Follow your heart. You are miracle!



From:   Clark Boylan cboy...@sapwetik.org
To: openstack-dev@lists.openstack.org
Date:   08/14/2015 08:05 AM
Subject: Re: [openstack-dev] [Magnum] Consistent functional test failures



On Thu, Aug 13, 2015, at 03:13 AM, Tom Cammann wrote:
 Hi Team,

 Wanted to let you know why we are having consistent functional test
 failures in the gate.

 This is being caused by Nova returning "No valid host" to heat:

 2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE:
 Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack
 testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet
 [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most
 recent call last):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 625, in
 _action_recorder
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 self.action_handler_task(action, args=handler_args)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step =
 next(subtask)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resource.py, line 670, in
 action_handler_task
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not
 check(handler_data):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py,
 line 759, in check_create_complete
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return
 self.client_plugin()._check_active(server_id)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
 /opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in
 _check_active
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code':
 fault.get('code', _('Unknown'))
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource
 ResourceInError: Went to status ERROR due to Message: No valid host was
 found. There are not enough hosts available., Code: 500

 And this in turn is being caused by the compute instance running out of
 disk space:

 2015-08-13 08:26:15.216 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1
 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 RetryFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter
 AvailabilityZoneFilter returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter
 returned 1 host(s) get_filtered_objects
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin]
 (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602)
 ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable
 disk, it only has 17408.0 MB usable disk. host_passes
 /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
 2015-08-13 08:26:15.218 INFO nova.filters
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter DiskFilter
 returned 0 hosts

 For now a recheck seems to work about 1 in 2, so we can still land
 patches.

 The fix for this could be to clean up our Magnum devstack install more
 aggressively, which might be as simple as cleaning up the images we use,
 or get infra to provide our tests with a larger disk size. I will
 probably test out a patch today which cleans up the images we use in
 devstack to see if that helps.

Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-13 Thread Kai Qiang Wu
Hi Tom,


I did talk to infra; I think it is a resource issue, but they thought
it is a Nova issue.


When we boot a k8s bay, we use a baymodel with flavor m1.small; you can find
the devstack flavors below:



+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| ID  | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| 1   | m1.tiny   | 512       | 1    | 0         |      | 1     | 1.0         | True      |
| 2   | m1.small  | 2048      | 20   | 0         |      | 1     | 1.0         | True      |
| 3   | m1.medium | 4096      | 40   | 0         |      | 2     | 1.0         | True      |
| 4   | m1.large  | 8192      | 80   | 0         |      | 4     | 1.0         | True      |
| 42  | m1.nano   | 64        | 0    | 0         |      | 1     | 1.0         | True      |
| 451 | m1.heat   | 512       | 0    | 0         |      | 1     | 1.0         | True      |
| 5   | m1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      |
| 84  | m1.micro  | 128       | 0    | 0         |      | 1     | 1.0         | True      |
+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+



From the logs below:

2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin]
(devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602)
ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable
disk, it only has 17408.0 MB usable disk. host_passes
/opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
2015-08-13 08:26:15.218 INFO nova.filters
[req-e

The flavor asks for 20GB (20480 MB) of disk, but the host only has 17408 MB
usable, so it failed for that.


I think it is related to this: the Jenkins-allocated VM disk space is not
large.
I am curious why it failed so often recently.  Did os-infra change
something?




Thanks




Best Wishes,

Kai Qiang Wu (吴开强  Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
 No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China
100193

Follow your heart. You are miracle!



From:   Tom Cammann tom.camm...@hp.com
To: OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Date:   08/13/2015 06:24 PM
Subject: [openstack-dev] [Magnum] Consistent functional test failures



Hi Team,

Wanted to let you know why we are having consistent functional test
failures in the gate.

This is being caused by Nova returning "No valid host" to heat:

2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE:
Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack
testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet
[b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most
recent call last):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/resource.py, line 625, in
_action_recorder
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
self.action_handler_task(action, args=handler_args)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step =
next(subtask)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/resource.py, line 670, in
action_handler_task
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not
check(handler_data):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py,
line 759, in check_create_complete
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return
self.client_plugin()._check_active(server_id)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File
/opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in
_check_active
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code':
fault.get('code', _('Unknown'))
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource
ResourceInError: Went to status ERROR due to Message: No valid host was
found. There are not enough hosts available., Code: 500

And this in turn is being caused by the compute instance running out of
disk space:

2015-08-13 08:26:15.216 DEBUG nova.filters
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1
host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
2015-08-13 08:26:15.217 DEBUG nova.filters
[req-e5bb52cb-387e

[openstack-dev] [Magnum] Consistent functional test failures

2015-08-13 Thread Tom Cammann

Hi Team,

Wanted to let you know why we are having consistent functional test 
failures in the gate.


This is being caused by Nova returning "No valid host" to heat:

2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE: 
Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack 
testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet 
[b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most 
recent call last):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/resource.py, line 625, in _action_recorder

2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield 
self.action_handler_task(action, args=handler_args)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step = 
next(subtask)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/resource.py, line 670, in 
action_handler_task
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not 
check(handler_data):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py, 
line 759, in check_create_complete
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return 
self.client_plugin()._check_active(server_id)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
/opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in 
_check_active
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code': 
fault.get('code', _('Unknown'))
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 
ResourceInError: Went to status ERROR due to Message: No valid host was 
found. There are not enough hosts available., Code: 500


And this in turn is being caused by the compute instance running out of 
disk space:


2015-08-13 08:26:15.216 DEBUG nova.filters 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1 
host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
2015-08-13 08:26:15.217 DEBUG nova.filters 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
RetryFilter returned 1 host(s) get_filtered_objects 
/opt/stack/new/nova/nova/filters.py:84
2015-08-13 08:26:15.217 DEBUG nova.filters 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
AvailabilityZoneFilter returned 1 host(s) get_filtered_objects 
/opt/stack/new/nova/nova/filters.py:84
2015-08-13 08:26:15.217 DEBUG nova.filters 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter 
returned 1 host(s) get_filtered_objects 
/opt/stack/new/nova/nova/filters.py:84
2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] 
(devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602) 
ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable 
disk, it only has 17408.0 MB usable disk. host_passes 
/opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
2015-08-13 08:26:15.218 INFO nova.filters 
[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter DiskFilter 
returned 0 hosts


For now a recheck seems to work about 1 in 2, so we can still land patches.

The fix for this could be to clean up our Magnum devstack install more 
aggressively, which might be as simple as cleaning up the images we use, 
or get infra to provide our tests with a larger disk size. I will 
probably test out a patch today which cleans up the images we use in 
devstack to see if that helps.
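
As an example of the kind of image cleanup meant here (a sketch; the image
name is hypothetical), something like this could reclaim space on the
single-node devstack host once an image is no longer needed:

    glance image-list                       # see what is taking up space
    glance image-delete fedora-21-atomic    # or pass the image ID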


If anyone can help progress this let me know.

Cheers,
Tom



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-13 Thread Jeremy Stanley
On 2015-08-13 19:38:07 +0800 (+0800), Kai Qiang Wu wrote:
 I did talk to infra; I think it is a resource issue, but
 they thought it is a Nova issue,
[...]

No, I said the error was being raised by Nova, so was not an error
coming _from_ the infrastructure we manage. If your jobs are more
resource-intensive than a typical devstack/tempest job, you'll want
to look at ways to scale them back.

 The flavor asks for 20GB of disk, so it failed for that.

Correct, we run jobs on resources donated by public service
providers. Some of them only provide a 20GB root disk. There's
generally an ephemeral disk mounted at /opt with additional space if
you can modify your job to leverage that for whatever is running out
of space.

 I think it is related to this: the Jenkins-allocated VM disk
 space is not large. I am curious why it failed so often recently.
 Did os-infra change something?

Nothing has been intentionally changed with our disk space on job
workers as far as I'm aware. Different workers have varying root
disk sizes depending on the provider where they were booted, but
they could be as small as 20GB so your job will need to take that
into account.
-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures

2015-08-13 Thread Clark Boylan
On Thu, Aug 13, 2015, at 03:13 AM, Tom Cammann wrote:
 Hi Team,
 
 Wanted to let you know why we are having consistent functional test 
 failures in the gate.
 
 This is being caused by Nova returning "No valid host" to heat:
 
 2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE: 
 Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack 
 testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet 
 [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most 
 recent call last):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/resource.py, line 625, in
 _action_recorder
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/resource.py, line 696, in _do_action
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource yield 
 self.action_handler_task(action, args=handler_args)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/scheduler.py, line 320, in wrapper
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource step = 
 next(subtask)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/resource.py, line 670, in 
 action_handler_task
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource while not 
 check(handler_data):
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py, 
 line 759, in check_create_complete
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource return 
 self.client_plugin()._check_active(server_id)
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
 /opt/stack/new/heat/heat/engine/clients/os/nova.py, line 232, in 
 _check_active
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 'code': 
 fault.get('code', _('Unknown'))
 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 
 ResourceInError: Went to status ERROR due to Message: No valid host was 
 found. There are not enough hosts available., Code: 500
 
 And this in turn is being caused by the compute instance running out of 
 disk space:
 
 2015-08-13 08:26:15.216 DEBUG nova.filters 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1 
 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
 2015-08-13 08:26:15.217 DEBUG nova.filters 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
 RetryFilter returned 1 host(s) get_filtered_objects 
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
 AvailabilityZoneFilter returned 1 host(s) get_filtered_objects 
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.217 DEBUG nova.filters 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter 
 returned 1 host(s) get_filtered_objects 
 /opt/stack/new/nova/nova/filters.py:84
 2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] 
 (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602) 
 ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable 
 disk, it only has 17408.0 MB usable disk. host_passes 
 /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
 2015-08-13 08:26:15.218 INFO nova.filters 
 [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter DiskFilter 
 returned 0 hosts
 
 For now a recheck seems to work about 1 in 2, so we can still land
 patches.
 
 The fix for this could be to clean up our Magnum devstack install more 
 aggressively, which might be as simple as cleaning up the images we use, 
 or get infra to provide our tests with a larger disk size. I will 
 probably test out a patch today which cleans up the images we use in 
 devstack to see if that helps.
 
It is not trivial to provide your tests with more disk as we are using
the flavors appropriate for our RAM and CPU needs and are constrained by
quotas in the clouds we use. Do you really need 20GB nested test
instances? The VMs these jobs run on have ~13GB images which is almost
half the size of the instances you are trying to boot there. I would
definitely look into trimming the disk requirements for the nested VMs
before anything else.

As for working ~50% of the time: hpcloud gives us more disk than
rackspace, which is likely why you see about half fail and half pass. The
runs that pass probably run on hpcloud VMs.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev