Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)
On 2015-08-21 01:10:22 +0000 (+0000), Steven Dake (stdake) wrote:
[...]
> How large is /opt?
[...]

It appears at the moment HP Cloud gives us a 30GiB root filesystem (vda1) and a 0.5TiB ephemeral disk (vdb). Rackspace, on the other hand, provides a 40GB root filesystem (xvda1) and an 80GB ephemeral disk (xvde).

If your jobs are using devstack-gate, have a look at fix_disk_layout() in functions.sh for details on how we repartition, format and mount ephemeral disks. If your job is not based on devstack-gate, you should be able to implement similar routines to duplicate this.

--
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
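[Editor's note] The repartition/format/mount routine Jeremy points at can be approximated with a short script. This is a hypothetical sketch, not the actual fix_disk_layout() from devstack-gate's functions.sh; the device name (/dev/vdb on HP Cloud vs. /dev/xvde on Rackspace) and the /opt mount point vary by provider.

```shell
#!/bin/bash
# Hypothetical sketch modeled on devstack-gate's fix_disk_layout():
# format the provider-supplied ephemeral disk and mount it where the
# job needs the extra space. Device and mount point are assumptions.
set -eu

setup_ephemeral_disk() {
    local dev="$1" mnt="$2"
    if [ ! -b "$dev" ]; then
        # No ephemeral disk on this worker (e.g. a 20GB-root-only node).
        echo "no ephemeral disk at $dev; skipping"
        return 0
    fi
    sudo mkfs.ext4 -F "$dev"   # format the ephemeral disk
    sudo mkdir -p "$mnt"
    sudo mount "$dev" "$mnt"   # e.g. mount it at /opt
    echo "mounted $dev at $mnt"
}

# Usage (on a worker that has a vdb ephemeral disk):
#   setup_ephemeral_disk /dev/vdb /opt
```

A job not based on devstack-gate could run something like this early in its setup, then point its space-hungry directories at the mount.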
Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)
On 8/13/15, 6:13 AM, Jeremy Stanley <fu...@yuggoth.org> wrote:

> On 2015-08-13 19:38:07 +0800 (+0800), Kai Qiang Wu wrote:
>> I talked to infra; I think it is a resource issue, but they thought
>> it is a Nova issue.
> [...]
> No, I said the error was being raised by Nova, so it was not an error
> coming _from_ the infrastructure we manage. If your jobs are more
> resource-intensive than a typical devstack/tempest job, you'll want
> to look at ways to scale them back.
>
>> It is 20GB disk space, so it failed for that.
>
> Correct, we run jobs on resources donated by public service
> providers. Some of them only provide a 20GB root disk. There's
> generally an ephemeral disk mounted at /opt with additional space if
> you can modify your job to leverage that for whatever is running out
> of space.

How large is /opt?

>> I think it is related to this; the Jenkins-allocated VM disk space
>> is not large. I am curious why it has failed so often recently. Did
>> os-infra change something?
>
> Nothing has been intentionally changed with our disk space on job
> workers as far as I'm aware. Different workers have varying root disk
> sizes depending on the provider where they were booted, but they
> could be as small as 20GB so your job will need to take that into
> account.

20GB isn't enough for Magnum's CI jobs. We could link /var/lib/docker
to /opt if there is sufficient space there.

Regards,
-steve
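[Editor's note] Steve's suggestion of linking /var/lib/docker to /opt can be sketched as below. This is a minimal illustration, not an actual Magnum CI change: in a real job you would stop the docker daemon before moving its state, run the commands with sudo, and restart it afterwards; the /opt/docker target path is an assumption.

```shell
#!/bin/bash
# Minimal sketch: move a space-hungry directory (e.g. /var/lib/docker)
# onto the larger ephemeral disk mounted at /opt, leaving a symlink
# behind so existing paths keep working.
set -eu

relocate_dir() {
    local src="$1" dst="$2"
    mkdir -p "$dst"
    if [ -d "$src" ] && [ ! -L "$src" ]; then
        # Preserve existing contents, then replace the directory with
        # a symlink onto the roomier filesystem.
        cp -a "$src/." "$dst/"
        rm -rf "$src"
    fi
    ln -sfn "$dst" "$src"
}

# Usage (hypothetical paths; needs sudo and a stopped docker daemon):
#   relocate_dir /var/lib/docker /opt/docker
```

An alternative to the symlink is configuring the daemon's data directory directly (docker's data-root setting), which avoids surprising tools that dislike symlinked system paths.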
Re: [openstack-dev] [Magnum] Consistent functional test failures (seems infra not have enough resource)
Hi Tom,

I talked to infra; I think it is a resource issue, but they thought it is a Nova issue. When we boot a k8s bay, we use a baymodel with flavor m1.small; you can find the devstack flavors below:

+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| ID  | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| 1   | m1.tiny   | 512       | 1    | 0         |      | 1     | 1.0         | True      |
| 2   | m1.small  | 2048      | 20   | 0         |      | 1     | 1.0         | True      |
| 3   | m1.medium | 4096      | 40   | 0         |      | 2     | 1.0         | True      |
| 4   | m1.large  | 8192      | 80   | 0         |      | 4     | 1.0         | True      |
| 42  | m1.nano   | 64        | 0    | 0         |      | 1     | 1.0         | True      |
| 451 | m1.heat   | 512       | 0    | 0         |      | 1     | 1.0         | True      |
| 5   | m1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      |
| 84  | m1.micro  | 128       | 0    | 0         |      | 1     | 1.0         | True      |
+-----+-----------+-----------+------+-----------+------+-------+-------------+-----------+

From the logs below:

[req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602) ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable disk, it only has 17408.0 MB usable disk. host_passes /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
2015-08-13 08:26:15.218 INFO nova.filters [req-e

m1.small needs 20GB of disk space, so the boot failed for that. I think it is related to this; the Jenkins-allocated VM disk space is not large. I am curious why it has failed so often recently. Did os-infra change something?

Thanks

Best Wishes,
Kai Qiang Wu (吴开强 Kennan)
IBM China System and Technology Lab, Beijing
E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28 (Ring Building), ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing, P.R. China 100193
Follow your heart. You are miracle!

From: Tom Cammann <tom.camm...@hp.com>
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev@lists.openstack.org>
Date: 08/13/2015 06:24 PM
Subject: [openstack-dev] [Magnum] Consistent functional test failures

Hi Team,

Wanted to let you know why we are having consistent functional test failures in the gate.
This is being caused by Nova returning "No valid host" to Heat:

2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE: Server "kube_minion" [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack "testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet" [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most recent call last):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 625, in _action_recorder
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 696, in _do_action
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/scheduler.py", line 320, in wrapper
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     step = next(subtask)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resource.py", line 670, in action_handler_task
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     while not check(handler_data):
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py", line 759, in check_create_complete
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     return self.client_plugin()._check_active(server_id)
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource   File "/opt/stack/new/heat/heat/engine/clients/os/nova.py", line 232, in _check_active
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     'code': fault.get('code', _('Unknown'))
2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource ResourceInError: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

And this in turn is being caused by the compute instance running out of disk space:

2015-08-13 08:26:15.216 DEBUG nova.filters [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
2015-08-13 08:26:15.217 DEBUG nova.filters
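[Editor's note] The arithmetic behind the rejection is simple: m1.small requests a 20GB root disk (20480 MB), while the scheduler log shows the host reporting only 17408 MB usable. A back-of-envelope sketch of the check (mirroring the idea of nova's disk filter, not its actual code):

```shell
#!/bin/bash
# Reproduce the scheduler's decision from the log above: reject a host
# whose usable disk is smaller than the flavor's requested root disk.
usable_mb=17408                        # from the DiskFilter log line
requested_gb=20                        # m1.small root disk
requested_mb=$((requested_gb * 1024))  # 20480 MB

if [ "$usable_mb" -ge "$requested_mb" ]; then
    echo "host passes"
else
    echo "host rejected: needs ${requested_mb} MB, has only ${usable_mb} MB"
fi
# prints: host rejected: needs 20480 MB, has only 17408 MB
```

So on a 20GB-root worker, devstack's own footprint leaves too little for even a single m1.small instance, which is why moving bulky state onto the ephemeral disk (or shrinking the bay's flavor) is the practical fix.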