Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-21 Thread Jeremy Stanley
On 2015-08-21 01:10:22 + (+), Steven Dake (stdake) wrote:
[...]
 How large is /opt?
[...]

It appears at the moment HP Cloud gives us a 30GiB root filesystem
(vda1) and a 0.5TiB ephemeral disk (vdb). Rackspace on the other
hand provides a 40GB root filesystem (xvda1) and 80GB ephemeral disk
(xvde). If your jobs are using devstack-gate, have a look at
fix_disk_layout() in functions.sh for details on how we repartition,
format and mount ephemeral disks. If your job is not based on
devstack-gate, then you should be able to implement some similar
routines to duplicate this.
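The repartition/format/mount routine Jeremy points at could be sketched roughly as follows. This is a dry-run sketch only loosely modeled on what fix_disk_layout() does, not devstack-gate's actual code; the device name (/dev/vdb), the mountpoint (/opt), and the filesystem choice are all assumptions, and the function prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch of an ephemeral-disk setup routine in the spirit of
# devstack-gate's fix_disk_layout(). It only PRINTS the commands it
# would run; device, mountpoint and filesystem are assumptions.
ephemeral_setup() {
    dev=${1:-/dev/vdb}
    mountpoint=${2:-/opt}
    echo "umount ${dev} 2>/dev/null || true"
    echo "parted -s ${dev} mklabel msdos mkpart primary 0% 100%"
    echo "mkfs.ext4 ${dev}1"
    echo "mkdir -p ${mountpoint}"
    echo "mount ${dev}1 ${mountpoint}"
}

ephemeral_setup /dev/vdb /opt
```

Removing the echoes (and adding error handling) would turn this into something a non-devstack-gate job could run early in its setup phase.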
-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-20 Thread Steven Dake (stdake)


On 8/13/15, 6:13 AM, Jeremy Stanley fu...@yuggoth.org wrote:

On 2015-08-13 19:38:07 +0800 (+0800), Kai Qiang Wu wrote:
 I did talk to infra; I think it is a resource issue, but they
 thought it is a Nova issue,
[...]

No, I said the error was being raised by Nova, so was not an error
coming _from_ the infrastructure we manage. If your jobs are more
resource-intensive than a typical devstack/tempest job, you'll want
to look at ways to scale them back.

 The flavor requires 20GB of disk space, so it failed for that.

Correct, we run jobs on resources donated by public service
providers. Some of them only provide a 20GB root disk. There's
generally an ephemeral disk mounted at /opt with additional space if
you can modify your job to leverage that for whatever is running out
of space.

How large is /opt?


 I think it is related to this: the Jenkins-allocated VM disk
 space is not large. I am curious why it has failed so often recently.
 Has os-infra changed something?

Nothing has been intentionally changed with our disk space on job
workers as far as I'm aware. Different workers have varying root
disk sizes depending on the provider where they were booted, but
they could be as small as 20GB so your job will need to take that
into account.

20GB isn't enough for Magnum's CI jobs.  We could link /var/lib/docker to
/opt if there is sufficient space there.
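That relocation could look something like the sketch below. It assumes the service using the directory (docker) is stopped first; the paths are parameters rather than anything taken from Magnum's actual job configuration.

```shell
#!/bin/sh
# Sketch: move a data directory (e.g. /var/lib/docker) onto the larger
# ephemeral disk mounted at /opt and leave a symlink behind. Assumes the
# daemon using the directory has been stopped beforehand.
relocate_dir() {
    src=$1   # e.g. /var/lib/docker
    dst=$2   # e.g. /opt/docker
    mkdir -p "$dst"
    if [ -d "$src" ] && [ ! -L "$src" ]; then
        # Move any existing contents, then replace the dir with a link.
        mv "$src"/* "$dst"/ 2>/dev/null || true
        rmdir "$src"
    fi
    ln -sfn "$dst" "$src"
}
```

A bind mount (`mount --bind /opt/docker /var/lib/docker`) would work equally well and avoids symlink-following surprises, at the cost of needing an fstab entry or remount on reboot.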

Regards,
-steve

-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-13 Thread Kai Qiang Wu
Hi Tom,


I did talk to infra; I think it is a resource issue, but they thought
it is a Nova issue.


When we boot a k8s bay, we use a baymodel with the flavor m1.small; you can
see the devstack flavor list below:



+-+---+---+--+---+--+---+-+---+
| ID  | Name  | Memory_MB | Disk | Ephemeral | Swap | VCPUs |
RXTX_Factor | Is_Public |
+-+---+---+--+---+--+---+-+---+
| 1   | m1.tiny   | 512   | 1| 0 |  | 1 | 1.0
| True  |
| 2   | m1.small  | 2048  | 20   | 0 |  | 1 | 1.0
| True  |
| 3   | m1.medium | 4096  | 40   | 0 |  | 2 | 1.0
| True  |
| 4   | m1.large  | 8192  | 80   | 0 |  | 4 | 1.0
| True  |
| 42  | m1.nano   | 64| 0| 0 |  | 1 | 1.0
| True  |
| 451 | m1.heat   | 512   | 0| 0 |  | 1 | 1.0
| True  |
| 5   | m1.xlarge | 16384 | 160  | 0 |  | 8 | 1.0
| True  |
| 84  | m1.micro  | 128   | 0| 0 |  | 1 | 1.0
| True  |
+-+---+---+--+---+--+---+-+---+



From the logs below:

[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602) ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable disk, it only has 17408.0 MB usable disk. host_passes /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
2015-08-13 08:26:15.218 INFO nova.filters [req-e

The flavor requires 20GB of disk space, so it failed for that.
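The scheduler's decision here boils down to a simple comparison. The sketch below reproduces the effect of Nova's DiskFilter (it is not Nova's actual code); the values in MB are taken from the log excerpt above.

```shell
#!/bin/sh
# Sketch of the comparison made by Nova's DiskFilter: a host passes only
# if its usable disk is at least the flavor's requested disk (both in MB).
host_passes() {
    usable_mb=$1
    requested_mb=$2
    [ "$usable_mb" -ge "$requested_mb" ]
}

# m1.small requests 20 GB (20480 MB); the worker had 17408 MB usable.
if host_passes 17408 20480; then
    echo "host passes"
else
    echo "no valid host"     # this branch is taken
fi
```

With every host filtered out this way, the scheduler has nothing left to pick, which surfaces as the "No valid host was found" error Heat reports.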


I think it is related to this: the Jenkins-allocated VM disk space is not
large. I am curious why it has failed so often recently. Has os-infra
changed something?




Thanks




Best Wishes,

Kai Qiang Wu (吴开强  Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
 No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China
100193

Follow your heart. You are miracle!



From:   Tom Cammann tom.camm...@hp.com
To: OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Date:   08/13/2015 06:24 PM
Subject:[openstack-dev] [Magnum] Consistent functional test failures



Hi Team,

Wanted to let you know why we are having consistent functional test
failures in the gate.

This is being caused by Nova returning "No valid host" to Heat:

2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE: Server kube_minion [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most recent call last):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 625, in _action_recorder
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 696, in _do_action
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/scheduler.py", line 320, in wrapper
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     step = next(subtask)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 670, in action_handler_task
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     while not check(handler_data):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py", line 759, in check_create_complete
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     return self.client_plugin()._check_active(server_id)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/clients/os/nova.py", line 232, in _check_active
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     'code': fault.get('code', _('Unknown'))
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource ResourceInError: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

And this in turn is being caused by the compute instance running out of
disk space:

2015-08-13 08:26:15.216 DEBUG nova.filters [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
2015-08-13 08:26:15.217 DEBUG nova.filters

Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)

2015-08-13 Thread Jeremy Stanley
On 2015-08-13 19:38:07 +0800 (+0800), Kai Qiang Wu wrote:
 I did talk to infra; I think it is a resource issue, but they
 thought it is a Nova issue,
[...]

No, I said the error was being raised by Nova, so was not an error
coming _from_ the infrastructure we manage. If your jobs are more
resource-intensive than a typical devstack/tempest job, you'll want
to look at ways to scale them back.

 The flavor requires 20GB of disk space, so it failed for that.

Correct, we run jobs on resources donated by public service
providers. Some of them only provide a 20GB root disk. There's
generally an ephemeral disk mounted at /opt with additional space if
you can modify your job to leverage that for whatever is running out
of space.

 I think it is related to this: the Jenkins-allocated VM disk
 space is not large. I am curious why it has failed so often recently.
 Has os-infra changed something?

Nothing has been intentionally changed with our disk space on job
workers as far as I'm aware. Different workers have varying root
disk sizes depending on the provider where they were booted, but
they could be as small as 20GB so your job will need to take that
into account.
-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev