Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-21 Thread Jeremy Stanley
On 2015-12-20 12:35:34 -0800 (-0800), Clark Boylan wrote:
> Looking at the dstat logs for a recent fail [0], it did help in that
> more memory is available. You now have over 1GB available but still less
> than 2GB.
[...]

As Clark also pointed out in IRC, the workers where this is failing
lack a swap device. I have a feeling that if you run swapoff before
this point in the job, you'll see it fail everywhere.

For consistency, we probably should make sure that our workers have
similarly sized swap devices (using a swapfile on the rootfs if
necessary).
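For reference, a swapfile on the rootfs can be set up along these lines. This is a minimal sketch, not the actual infra change: the 2G size, the /swapfile path, and the APPLY guard (so it only acts when run as root with APPLY=1) are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch: add a swapfile on the rootfs when a worker has no swap device.
# Size and path are illustrative; guarded so it only acts as root with APPLY=1.
SWAPFILE=/swapfile
SIZE_MB=2048

if [ "$(id -u)" -ne 0 ] || [ "${APPLY:-0}" != "1" ]; then
    echo "dry run: would create ${SIZE_MB}MB swapfile at ${SWAPFILE} (run as root with APPLY=1)"
    exit 0
fi

dd if=/dev/zero of="$SWAPFILE" bs=1M count="$SIZE_MB"   # allocate the backing file
chmod 600 "$SWAPFILE"                                   # swapfiles must not be world-readable
mkswap "$SWAPFILE"                                      # write the swap signature
swapon "$SWAPFILE"                                      # enable it
free -m                                                 # the Swap row should now be non-zero
```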
-- 
Jeremy Stanley

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-20 Thread Clark Boylan
Looking at the dstat logs for a recent fail [0], it did help in that
more memory is available. You now have over 1GB available but still less
than 2GB. I would try using less memory. Can you use a 1GB flavor
instead of a 2GB flavor?

[0]
http://logs.openstack.org/58/251158/4/check/gate-functional-dsvm-magnum-swarm/6b022cc/logs/screen-dstat.txt.gz

On Sun, Dec 20, 2015, at 12:08 PM, Hongbin Lu wrote:
> Hi Clark,
> 
> Thanks for the fix. Unfortunately, it doesn't seem to help. The error
> still occurred [1] after you increased the memory restriction, and as
> before, most of them occurred on OVH hosts. Any further suggestions?
> 
> [1] http://status.openstack.org/elastic-recheck/#1521237
> 
> Best regards,
> Hongbin
> 
> -Original Message-
> From: Clark Boylan [mailto:cboy...@sapwetik.org] 
> Sent: December-15-15 5:41 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite
> often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"
> 
> On Sun, Dec 13, 2015, at 10:51 AM, Clark Boylan wrote:
> > On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> > > Hi,
> > > 
> > > As Kai Qiang mentioned, magnum gate recently had a bunch of random 
> > > failures, which occurred on creating a nova instance with 2G of RAM.
> > > According to the error message, it seems that the hypervisor tried 
> > > to allocate memory to the nova instance but couldn’t find enough 
> > > free memory in the host. However, by adding a few “nova 
> > > hypervisor-show XX” before, during, and right after the test, it 
> > > showed that the host has 6G of free RAM, which is far more than 2G. 
> > > Here is a snapshot of the output [1]. You can find the full log here [2].
> > If you look at the dstat log
> > http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnu
> > m-k8s/5305d7a/logs/screen-dstat.txt.gz
> > the host has nowhere near 6GB free memory and less than 2GB. I think 
> > you actually are just running out of memory.
> > > 
> > > Another observation is that most of the failure happened on a node 
> > > with name “devstack-trusty-ovh-*” (You can verify it by entering a 
> > > query [3] at http://logstash.openstack.org/ ). It seems that the 
> > > jobs will be fine if they are allocated to a node other than “ovh”.
> > I have just done a quick spot check of the total memory on 
> > devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free 
> > -m` and the results are 7480, 7732, and 6976 megabytes respectively. 
> > Despite using 8GB flavors in each case there is variation and OVH 
> > comes in on the low end for some reason. I am guessing that you fail 
> > here more often because the other hosts give you just enough extra 
> > memory to boot these VMs.
> To follow up on this we seem to have tracked this down to how the Linux
> kernel restricts memory at boot when you don't have a contiguous chunk of
> system memory. We have worked around this by increasing the memory
> restriction to 9023M at boot, which gets OVH in line with Rackspace and
> slightly increases available memory on HPCloud (because it actually has
> more of it).
> 
> You should see this fix in action after image builds complete tomorrow
> (they start at around 1400 UTC).
> > 
> > We will have to look into why OVH has less memory despite using 
> > flavors that should be roughly equivalent.
> > > 
> > > Any hints to debug this issue further? Suggestions are greatly 
> > > appreciated.
> > > 
> > > [1] http://paste.openstack.org/show/481746/
> > > [2]
> > > http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-mag
> > > num-swarm/56d79c3/console.html [3] 
> > > https://review.openstack.org/#/c/254370/2/queries/1521237.yaml
> 
> 



Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-20 Thread Hongbin Lu
Hi Clark,

Thanks for the fix. Unfortunately, it doesn't seem to help. The error still 
occurred [1] after you increased the memory restriction, and as before, most of 
them occurred on OVH hosts. Any further suggestions?

[1] http://status.openstack.org/elastic-recheck/#1521237

Best regards,
Hongbin

-Original Message-
From: Clark Boylan [mailto:cboy...@sapwetik.org] 
Sent: December-15-15 5:41 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often 
for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

On Sun, Dec 13, 2015, at 10:51 AM, Clark Boylan wrote:
> On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> > Hi,
> > 
> > As Kai Qiang mentioned, magnum gate recently had a bunch of random 
> > failures, which occurred on creating a nova instance with 2G of RAM.
> > According to the error message, it seems that the hypervisor tried 
> > to allocate memory to the nova instance but couldn’t find enough 
> > free memory in the host. However, by adding a few “nova 
> > hypervisor-show XX” before, during, and right after the test, it 
> > showed that the host has 6G of free RAM, which is far more than 2G. 
> > Here is a snapshot of the output [1]. You can find the full log here [2].
> If you look at the dstat log
> http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnu
> m-k8s/5305d7a/logs/screen-dstat.txt.gz
> the host has nowhere near 6GB free memory and less than 2GB. I think 
> you actually are just running out of memory.
> > 
> > Another observation is that most of the failure happened on a node 
> > with name “devstack-trusty-ovh-*” (You can verify it by entering a 
> > query [3] at http://logstash.openstack.org/ ). It seems that the 
> > jobs will be fine if they are allocated to a node other than “ovh”.
> I have just done a quick spot check of the total memory on 
> devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free 
> -m` and the results are 7480, 7732, and 6976 megabytes respectively. 
> Despite using 8GB flavors in each case there is variation and OVH 
> comes in on the low end for some reason. I am guessing that you fail 
> here more often because the other hosts give you just enough extra 
> memory to boot these VMs.
To follow up on this we seem to have tracked this down to how the Linux kernel 
restricts memory at boot when you don't have a contiguous chunk of system 
memory. We have worked around this by increasing the memory restriction to 
9023M at boot, which gets OVH in line with Rackspace and slightly increases 
available memory on HPCloud (because it actually has more of it).

You should see this fix in action after image builds complete tomorrow (they 
start at around 1400 UTC).
> 
> We will have to look into why OVH has less memory despite using 
> flavors that should be roughly equivalent.
> > 
> > Any hints to debug this issue further? Suggestions are greatly 
> > appreciated.
> > 
> > [1] http://paste.openstack.org/show/481746/
> > [2]
> > http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-mag
> > num-swarm/56d79c3/console.html [3] 
> > https://review.openstack.org/#/c/254370/2/queries/1521237.yaml




Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-15 Thread Egor Guz
Clark,


What about ephemeral storage on OVH VMs? I see many storage-related errors (see 
full output below) these days.
Basically it means Docker cannot create its storage device on the local drive.

-- Logs begin at Mon 2015-12-14 06:40:09 UTC, end at Mon 2015-12-14 07:00:38 UTC. --
Dec 14 06:45:50 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Stopped Docker Application Container Engine.
Dec 14 06:47:54 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Starting Docker Application Container Engine...
Dec 14 06:48:00 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: Warning: '-d' is deprecated, it will be removed soon. See usage.
Dec 14 06:48:00 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:00Z" level=warning msg="please use 'docker daemon' instead."
Dec 14 06:48:03 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:03.447936206Z" level=info msg="Listening for HTTP on unix (/var/run/docker.sock)"
Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:06.280086735Z" level=fatal msg="Error starting daemon: error initializing graphdriver: Non existing device docker-docker--pool"
Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Failed to start Docker Application Container Engine.
Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Unit docker.service entered failed state.
Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service failed.


http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz
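As a hypothetical debugging step for the fatal "Non existing device docker-docker--pool" line above (the pool name is taken from that log; how the pool is supposed to be set up is not shown here), one could check on an affected node whether the devicemapper thin pool Docker's graphdriver expects actually exists:

```shell
# Check for the devicemapper thin pool named in the fatal log line above.
# If dmsetup is missing or the pool is absent, Docker's devicemapper
# graphdriver will fail to start exactly as shown.
if command -v dmsetup >/dev/null 2>&1 && dmsetup ls 2>/dev/null | grep -q 'docker--pool'; then
    echo "thin pool present"
else
    echo "thin pool docker-docker--pool not found"
fi
```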


—
Egor




On 12/13/15, 10:51, "Clark Boylan" wrote:

>On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
>> Hi,
>> 
>> As Kai Qiang mentioned, magnum gate recently had a bunch of random
>> failures, which occurred on creating a nova instance with 2G of RAM.
>> According to the error message, it seems that the hypervisor tried to
>> allocate memory to the nova instance but couldn’t find enough free memory
>> in the host. However, by adding a few “nova hypervisor-show XX” before,
>> during, and right after the test, it showed that the host has 6G of free
>> RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
>> can find the full log here [2].
>If you look at the dstat log
>http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
>the host has nowhere near 6GB free memory and less than 2GB. I think you
>actually are just running out of memory.
>> 
>> Another observation is that most of the failure happened on a node with
>> name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
>> at http://logstash.openstack.org/ ). It seems that the 

Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-15 Thread Clark Boylan
There is no ephemeral drive; you will have to use the root disk. Grabbing
a random VM in OVH and running a quick df on it, you should have about
60GB of free space for the job to use there.
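That df spot check can be scripted along these lines (a sketch: the 60GB threshold just mirrors the figure above, and `--output=avail` assumes GNU df):

```shell
# Report free space on the root filesystem and warn when it drops below
# the ~60GB the job is expected to have available (threshold illustrative).
avail_kb=$(df --output=avail -k / | tail -n 1)
avail_gb=$((avail_kb / 1024 / 1024))
echo "root fs free: ${avail_gb} GB"
if [ "$avail_gb" -lt 60 ]; then
    echo "warning: less than 60GB free on /" >&2
fi
```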

Clark

On Tue, Dec 15, 2015, at 02:00 PM, Egor Guz wrote:
> Clark,
> 
> 
> What about ephemeral storage on OVH VMs? I see many storage-related errors
> (see full output below) these days.
> Basically it means Docker cannot create its storage device on the local drive.
> 
> -- Logs begin at Mon 2015-12-14 06:40:09 UTC, end at Mon 2015-12-14 07:00:38 UTC. --
> Dec 14 06:45:50 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Stopped Docker Application Container Engine.
> Dec 14 06:47:54 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Starting Docker Application Container Engine...
> Dec 14 06:48:00 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: Warning: '-d' is deprecated, it will be removed soon. See usage.
> Dec 14 06:48:00 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:00Z" level=warning msg="please use 'docker daemon' instead."
> Dec 14 06:48:03 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:03.447936206Z" level=info msg="Listening for HTTP on unix (/var/run/docker.sock)"
> Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:06.280086735Z" level=fatal msg="Error starting daemon: error initializing graphdriver: Non existing device docker-docker--pool"
> Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
> Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Failed to start Docker Application Container Engine.
> Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Unit docker.service entered failed state.
> Dec 14 06:48:06 te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service failed.
> 
> 
> http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz
> 
> 
> —
> Egor
> 
> 
> 
> 
> On 12/13/15, 10:51, "Clark Boylan" wrote:
> 
> >On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> >> Hi,
> >> 
> >> As Kai Qiang mentioned, magnum gate recently had a bunch of random
> >> failures, which occurred on creating a nova instance with 2G of RAM.
> >> According to the error message, it seems that the hypervisor tried to
> >> allocate memory to the nova instance but couldn’t find enough free memory
> >> in the host. However, by adding a few “nova hypervisor-show XX” before,
> >> during, and right after the test, it showed that the host has 6G of free
> >> RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
> >> can find the full log here [2].
> >If you look at the dstat log
> 

Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-15 Thread Clark Boylan
On Sun, Dec 13, 2015, at 10:51 AM, Clark Boylan wrote:
> On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> > Hi,
> > 
> > As Kai Qiang mentioned, magnum gate recently had a bunch of random
> > failures, which occurred on creating a nova instance with 2G of RAM.
> > According to the error message, it seems that the hypervisor tried to
> > allocate memory to the nova instance but couldn’t find enough free memory
> > in the host. However, by adding a few “nova hypervisor-show XX” before,
> > during, and right after the test, it showed that the host has 6G of free
> > RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
> > can find the full log here [2].
> If you look at the dstat log
> http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
> the host has nowhere near 6GB free memory and less than 2GB. I think you
> actually are just running out of memory.
> > 
> > Another observation is that most of the failure happened on a node with
> > name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
> > at http://logstash.openstack.org/ ). It seems that the jobs will be fine
> > if they are allocated to a node other than “ovh”.
> I have just done a quick spot check of the total memory on
> devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free -m`
> and the results are 7480, 7732, and 6976 megabytes respectively. Despite
> using 8GB flavors in each case there is variation and OVH comes in on
> the low end for some reason. I am guessing that you fail here more often
> because the other hosts give you just enough extra memory to boot these
> VMs.
To follow up on this we seem to have tracked this down to how the Linux
kernel restricts memory at boot when you don't have a contiguous chunk
of system memory. We have worked around this by increasing the memory
restriction to 9023M at boot, which gets OVH in line with Rackspace and
slightly increases available memory on HPCloud (because it actually has
more of it).

You should see this fix in action after image builds complete tomorrow
(they start at around 1400 UTC).
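The thread doesn't spell out the mechanism, but a boot-time memory restriction of this kind is typically expressed as a `mem=` kernel parameter. A hypothetical sketch for a GRUB-based image (the actual infra change may well differ):

```shell
# /etc/default/grub (hypothetical): raise the boot-time memory cap.
# 9023M matches the figure quoted above; apply with update-grub and reboot.
GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=9023M"
```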
> 
> We will have to look into why OVH has less memory despite using flavors
> that should be roughly equivalent.
> > 
> > Any hints to debug this issue further? Suggestions are greatly
> > appreciated.
> > 
> > [1] http://paste.openstack.org/show/481746/
> > [2]
> > http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
> > [3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml




Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-15 Thread Kai Qiang Wu
Thanks Clark and the infra folks for working around that.
We will keep track of it and see if the issue disappears.



Thanks

Best Wishes,

Kai Qiang Wu (吴开强  Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
 No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China
100193

Follow your heart. You are miracle!



From: Clark Boylan <cboy...@sapwetik.org>
To: openstack-dev@lists.openstack.org
Date: 16/12/2015 06:42 am
Subject: Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"



On Sun, Dec 13, 2015, at 10:51 AM, Clark Boylan wrote:
> On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> > Hi,
> >
> > As Kai Qiang mentioned, magnum gate recently had a bunch of random
> > failures, which occurred on creating a nova instance with 2G of RAM.
> > According to the error message, it seems that the hypervisor tried to
> > allocate memory to the nova instance but couldn’t find enough free memory
> > in the host. However, by adding a few “nova hypervisor-show XX” before,
> > during, and right after the test, it showed that the host has 6G of free
> > RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
> > can find the full log here [2].
> If you look at the dstat log
> http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
> the host has nowhere near 6GB free memory and less than 2GB. I think you
> actually are just running out of memory.
> >
> > Another observation is that most of the failure happened on a node with
> > name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
> > at http://logstash.openstack.org/ ). It seems that the jobs will be fine
> > if they are allocated to a node other than “ovh”.
> I have just done a quick spot check of the total memory on
> devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free -m`
> and the results are 7480, 7732, and 6976 megabytes respectively. Despite
> using 8GB flavors in each case there is variation and OVH comes in on
> the low end for some reason. I am guessing that you fail here more often
> because the other hosts give you just enough extra memory to boot these
> VMs.
To follow up on this we seem to have tracked this down to how the Linux
kernel restricts memory at boot when you don't have a contiguous chunk
of system memory. We have worked around this by increasing the memory
restriction to 9023M at boot, which gets OVH in line with Rackspace and
slightly increases available memory on HPCloud (because it actually has
more of it).

You should see this fix in action after image builds complete tomorrow
(they start at around 1400 UTC).
>
> We will have to look into why OVH has less memory despite using flavors
> that should be roughly equivalent.
> >
> > Any hints to debug this issue further? Suggestions are greatly
> > appreciated.
> >
> > [1] http://paste.openstack.org/show/481746/
> > [2]
> > http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
> > [3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml




Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-13 Thread Clark Boylan
On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> Hi,
> 
> As Kai Qiang mentioned, magnum gate recently had a bunch of random
> failures, which occurred on creating a nova instance with 2G of RAM.
> According to the error message, it seems that the hypervisor tried to
> allocate memory to the nova instance but couldn’t find enough free memory
> in the host. However, by adding a few “nova hypervisor-show XX” before,
> during, and right after the test, it showed that the host has 6G of free
> RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
> can find the full log here [2].
If you look at the dstat log
http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
the host has nowhere near 6GB of free memory; it has less than 2GB. I think
you are actually just running out of memory.
> 
> Another observation is that most of the failure happened on a node with
> name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
> at http://logstash.openstack.org/ ). It seems that the jobs will be fine
> if they are allocated to a node other than “ovh”.
I have just done a quick spot check of the total memory on
devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free -m`,
and the results are 7480, 7732, and 6976 megabytes respectively. Despite
using 8GB flavors in each case, there is variation, and OVH comes in on
the low end for some reason. I am guessing that you fail here more often
because the other hosts give you just enough extra memory to boot these
VMs.
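For reference, the figure that spot check compares boils down to something like this (a sketch; it reads /proc/meminfo, which is the same total that `free -m` reports in the Mem: row):

```shell
# Print total system memory in MB, as used for the per-provider comparison.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "total memory: $((total_kb / 1024)) MB"
```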

We will have to look into why OVH has less memory despite using flavors
that should be roughly equivalent.
> 
> Any hints to debug this issue further? Suggestions are greatly
> appreciated.
> 
> [1] http://paste.openstack.org/show/481746/
> [2]
> http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
> [3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml
> 
> Best regards,
> Hongbin

Clark



Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-13 Thread pcrews

Hi,

OVH is a new cloud provider for openstack-infra nodes:
http://www.openstack.org/blog/2015/12/announcing-a-new-cloud-provider-for-openstacks-ci-system-ovh/

It appears that which provider's nodes a job lands on is a matter of
luck:
"When a developer uploads a proposed change to an OpenStack project, 
available instances from any of our contributing cloud providers will be 
used interchangeably to test it."


You might want to ping people in #openstack-infra to find a point of 
contact for them (OVH) and/or to work with the infra folks directly to 
see about troubleshooting this further.



On 12/12/2015 02:16 PM, Hongbin Lu wrote:

Hi,

As Kai Qiang mentioned, magnum gate recently had a bunch of random
failures, which occurred on creating a nova instance with 2G of RAM.
According to the error message, it seems that the hypervisor tried to
allocate memory to the nova instance but couldn’t find enough free
memory in the host. However, by adding a few “nova hypervisor-show XX”
before, during, and right after the test, it showed that the host has 6G
of free RAM, which is far more than 2G. Here is a snapshot of the output
[1]. You can find the full log here [2].

Another observation is that most of the failure happened on a node with
name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
at http://logstash.openstack.org/ ). It seems that the jobs will be fine
if they are allocated to a node other than “ovh”.

Any hints to debug this issue further? Suggestions are greatly appreciated.

[1] http://paste.openstack.org/show/481746/

[2]
http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html

[3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml

Best regards,

Hongbin

*From:* Kai Qiang Wu [mailto:wk...@cn.ibm.com]
*Sent:* December-09-15 7:23 AM
*To:* openstack-dev@lists.openstack.org
*Subject:* [openstack-dev] [Infra][nova][magnum] Jenkins failed quite
often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

Hi All,

I am not sure what changed these days, but we have found quite often now
that the Jenkins jobs fail with:


http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/libvirt/libvirtd.txt.gz

2015-12-09 08:52:27.892+: 22957: debug : qemuMonitorJSONCommandWithFd:264 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2015-12-09 08:52:27.892+: 22957: debug : qemuMonitorSend:959 : QEMU_MONITOR_SEND_MSG: mon=0x7fa66400c6f0 msg={"execute":"qmp_capabilities","id":"libvirt-1"} fd=-1
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:28.070+: 22951: error : qemuMonitorIORead:554 : Unable to read from monitor: Connection reset by peer
2015-12-09 08:52:28.070+:

Re: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"

2015-12-12 Thread Hongbin Lu
Hi,

As Kai Qiang mentioned, magnum gate recently had a bunch of random failures, 
which occurred on creating a nova instance with 2G of RAM. According to the 
error message, it seems that the hypervisor tried to allocate memory to the 
nova instance but couldn’t find enough free memory in the host. However, by 
adding a few “nova hypervisor-show XX” before, during, and right after the 
test, it showed that the host has 6G of free RAM, which is far more than 2G. 
Here is a snapshot of the output [1]. You can find the full log here [2].

Another observation is that most of the failure happened on a node with name 
“devstack-trusty-ovh-*” (You can verify it by entering a query [3] at 
http://logstash.openstack.org/ ). It seems that the jobs will be fine if they 
are allocated to a node other than “ovh”.

Any hints to debug this issue further? Suggestions are greatly appreciated.

[1] http://paste.openstack.org/show/481746/
[2] 
http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
[3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml

Best regards,
Hongbin

From: Kai Qiang Wu [mailto:wk...@cn.ibm.com]
Sent: December-09-15 7:23 AM
To: openstack-dev@lists.openstack.org
Subject: [openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for 
"Cannot set up guest memory 'pc.ram': Cannot allocate memory"


Hi All,

I am not sure what changed these days, but we have found quite often now that 
the Jenkins jobs fail with:


http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/libvirt/libvirtd.txt.gz

2015-12-09 08:52:27.892+0000: 22957: debug : qemuMonitorJSONCommandWithFd:264 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2015-12-09 08:52:27.892+0000: 22957: debug : qemuMonitorSend:959 : QEMU_MONITOR_SEND_MSG: mon=0x7fa66400c6f0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
 fd=-1
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:347 : dispatching to max 0 clients, called from event watch 6
2015-12-09 08:52:27.941+0000: 22951: debug : virNetlinkEventCallback:360 : event not handled.
2015-12-09 08:52:28.070+0000: 22951: error : qemuMonitorIORead:554 : Unable to read from monitor: Connection reset by peer
2015-12-09 08:52:28.070+0000: 22951: error : qemuMonitorIO:690 : internal error: early end of file from monitor: possible problem:
Cannot set up guest memory 'pc.ram': Cannot allocate memory




We did not hit such a resource issue before. I am not sure if the Infra or nova 
folks know about it?


Thanks

Best Wishes,

Kai Qiang Wu (吴开强 Kennan)
IBM China System and Technology Lab, Beijing

E-mail: wk...@cn.ibm.com<mailto:wk...@cn.ibm.com>
Tel: 86-10-82451647
Address: Building 28(Ring Building), ZhongGuanCun Software Park,
No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193

Follow your heart. You are miracle!

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev