Re: [Openstack] [nova] Database not delete PCI info after device is removed from host and nova.conf

2017-07-06 Thread Jay Pipes
Hmm, very odd indeed. Any way you can save the nova-compute logs from 
when you removed the GPU and restarted the nova-compute service and 
paste those logs to paste.openstack.org? Would be useful in tracking 
down this buggy behaviour...


Best,
-jay

On 07/06/2017 08:54 PM, Eddie Yen wrote:

Hi Jay,

The status of the "removed" GPU still shows as "Available" in 
pci_devices table.


2017-07-07 8:34 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:


Hi again, Eddie :) Answer inline...

On 07/06/2017 08:14 PM, Eddie Yen wrote:

Hi everyone,

I'm using OpenStack Mitaka version (deployed from Fuel 9.2)

At present, I have installed two different models of GPU card,

and wrote their information into pci_alias and
pci_passthrough_whitelist in nova.conf on the Controller and the Compute
node (the node where the GPUs are installed).
Then I restarted nova-api, nova-scheduler, and nova-compute.
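For reference, the Mitaka-era options in nova.conf look roughly like the lines below. This is only a sketch: the vendor_id/product_id values are hypothetical and must match the actual cards as reported by "lspci -nn", and the alias name is arbitrary.

[DEFAULT]
# whitelist the device for passthrough on the compute node
pci_passthrough_whitelist = {"vendor_id": "10de", "product_id": "13f2"}
# alias referenced by flavors, on the controller (and compute) node
pci_alias = {"vendor_id": "10de", "product_id": "13f2", "name": "gpu1", "device_type": "type-PCI"}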

When I checked the database, both GPUs were registered in the
pci_devices table.

Now I have removed one of the GPUs from the compute node, removed its
information from nova.conf, and restarted the services.

But when I check the database again, the information for the removed card
still exists in the pci_devices table.

How can I fix this problem?


So, when you removed the GPU from the compute node and restarted the
nova-compute service, it *should* have noticed you had removed the
GPU and marked that PCI device as deleted. At least, according to
this code in the PCI manager:

https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183


Question for you: what is the value of the status field in the
pci_devices table for the GPU that you removed?

Best,
-jay

p.s. If you really want to get rid of that device, simply remove
that record from the pci_devices table. But, again, it *should* be
removed automatically...
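If you do end up cleaning it up by hand, something along these lines works against the Nova database (a sketch only; the PCI address and compute_node_id below are placeholders, so check the row with the SELECT before deleting anything):

$ mysql nova -e "SELECT id, compute_node_id, address, status, deleted \
    FROM pci_devices WHERE address = '0000:05:00.0';"
$ mysql nova -e "DELETE FROM pci_devices \
    WHERE address = '0000:05:00.0' AND compute_node_id = 1;"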





___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack][nova] Changes-Since parameter in Nova API not working as expected

2017-06-26 Thread Jay Pipes

On 06/26/2017 12:58 PM, Jose Renato Santos wrote:

Hi

I am accessing the nova api using the gophercloud SDK 
https://github.com/rackspace/gophercloud


I am running Openstack Newton installed with Openstack Ansible

I am accessing the “List Servers” call of the Nova API with the
Changes-Since parameter for efficient polling:


https://developer.openstack.org/api-guide/compute/polling_changes-since_parameter.html

However, the API is not working as I expected.

When I stop or start a server instance, the API successfully detects the 
change in the server state and returns the server in the next call to 
ListServers with the Changes-Since parameter, as expected.
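For reference, the polling call being described boils down to something like this (a sketch; the endpoint, token and timestamp are placeholders):

# return only servers whose updated_at is at or after the given time
$ curl -s -H "X-Auth-Token: $TOKEN" \
    "http://controller:8774/v2.1/servers/detail?changes-since=2017-06-26T12:00:00Z"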


But when I attach a new security group to the server, the API does not 
detect any change in the state of the server and does not return the 
server in the next call  to ListServers with the Changes-Since parameter.


I would expect that changing the list of security groups attached to a 
server would be considered a change in the server state and reported 
when using the Changes-Since parameter, but that is not the behavior 
that I am seeing.


Can someone please let me know if this is a known bug?


Changes to an instance's security group rules are not considered when 
listing servers by updated_at field value. This is mostly because the 
security group [rules] are Neutron objects and are not one-to-one 
associated with a Nova instance.


I'm not sure it's a bug per-se, but I suppose we could entertain a 
feature request to set the updated_at timestamp column for all instances 
associated with a security group when that security group's rules are 
changed.


But that would probably open up a can of worms that Nova developers may 
not be willing to deal with. For instance, should we update the 
instances.updated_at column every time a volume is changed? A network port 
that an instance is associated with? A heat stack that launched the 
volume? Etc., etc.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack][nova] Changes-Since parameter in Nova API not working as expected

2017-06-26 Thread Jay Pipes

On 06/26/2017 02:27 PM, Jose Renato Santos wrote:

Jay,
Thanks for your response

Let me clarify my point.
I am not expecting to see a change in the updated_at column of a server when 
the rules of its security group change.
I agree that would be a change to be handled by the Neutron API, and it would be 
too much to ask for Nova to keep track of that.
But I would expect to see a change in the updated_at column of a server instance 
when I associate (attach) a new security group to that server.
To me that is a change to the server and not to the security group. The 
security group was not changed, but the server was, as it is now associated 
with a different set of security groups.
I hope that clarifies my question.


I think that's a pretty reasonable request actually. Care to create a 
bug on Launchpad for it?


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack-dev] nova - nova-manage db sync - Issue

2017-06-14 Thread Jay Pipes
You have installed a really old version of Nova on that server. What are 
you using to install OpenStack?
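A quick way to confirm which Nova is actually installed on that host is something like the following (a sketch; the package name assumes the Ubuntu packages):

$ dpkg -l | grep nova-common     # distro package version
$ nova-manage --version          # version reported by the tool itself
$ python -c "import nova; print(nova.__file__)"   # which copy of the code is being used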


Best,
-jay

On 06/14/2017 12:13 PM, SGopinath s.gopinath wrote:

Hi,

I'm trying to install Openstack Ocata in
Ubuntu 16.04.2 LTS.

During installation of nova  at this step
su -s /bin/sh -c "nova-manage db sync" nova

I get the following error

An error has occurred:
Traceback (most recent call last):
   File "/usr/lib/python2.7/dist-packages/nova/cmd/manage.py", line 
1594, in main

 ret = fn(*fn_args, **fn_kwargs)
   File "/usr/lib/python2.7/dist-packages/nova/cmd/manage.py", line 644, 
in sync

 return migration.db_sync(version)
   File "/usr/lib/python2.7/dist-packages/nova/db/migration.py", line 
26, in db_sync
 return IMPL.db_sync(version=version, database=database, 
context=context)
   File 
"/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/migration.py", line 
53, in db_sync

 current_version = db_version(database, context=context)
   File 
"/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/migration.py", line 
84, in db_version

 _("Upgrade DB using Essex release first."))
NovaException: Upgrade DB using Essex release first.


Request the help for solving this issue.

Thanks,
S.Gopinath






___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack][nova] Changes-Since parameter in Nova API not working as expected

2017-06-27 Thread Jay Pipes

Awesome, thanks Jose!

On 06/26/2017 11:12 PM, Jose Renato Santos wrote:

Jay

I created a bug report as you suggested:
https://bugs.launchpad.net/nova/+bug/1700684

Thanks for your help
Best
Renato

-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Monday, June 26, 2017 2:32 PM
To: Jose Renato Santos <santos.joseren...@gmail.com>; 
openstack@lists.openstack.org
Subject: Re: [Openstack] [openstack][nova] Changes-Since parameter in Nova API 
not working as expected

On 06/26/2017 02:27 PM, Jose Renato Santos wrote:

Jay,
Thanks for your response

Let me clarify my point.
I am not expecting to see a change in the updated_at column of a server when 
the rules of its security group change.
I agree that would be a change to be handled by the Neutron API, and it
would be too much to ask for Nova to keep track of that. But I would expect to 
see a change in the updated_at column of a server instance when I 
associate (attach) a new security group to that server.
To me that is a change to the server and not to the security group.
The security group was not changed, but the server was, as it is now associated 
with a different set of security groups. I hope that clarifies my question.


I think that's a pretty reasonable request actually. Care to create a bug on 
Launchpad for it?

Best,
-jay



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-06 Thread Jay Pipes

On 10/06/2017 10:18 AM, Ramu, MohanX wrote:

Hi Jay,

I am able to create custom traits without any issue. I want to associate some 
value with those traits.


Like I mentioned in the previous email, that's not how traits work :)

A trait *is* the value that is associated with a resource provider.

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-04 Thread Jay Pipes

Rock on :)

On 10/04/2017 09:33 AM, Ramu, MohanX wrote:

Thank you so much Jay. After adding this header, working fine.

-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 11:36 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

Against the Pike placement API endpoint, make sure you send the following HTTP 
header:

OpenStack-API-Version: placement 1.10

Best,
-jay

On 10/03/2017 02:01 PM, Ramu, MohanX wrote:

Please refer attached original one.


-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 10:03 PM
To: Ramu, MohanX <mohanx.r...@intel.com>;
openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 12:12 PM, Ramu, MohanX wrote:

Thanks for reply Jay.

No Jay,

I have installed Pike. There also I face the same problem.


No, you haven't installed Pike (or at least not properly). Otherwise, the 
max_version returned from the Pike placement API would be 1.10, not 1.4.

Best,
-jay


Thanks & Regards,

Mohan Ramu
-Original Message-----
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 9:26 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 11:34 AM, Ramu, MohanX wrote:

Hi,

We have implemented the OpenStack Ocata and Pike releases, and are able to
consume the Placement resource providers API, but not able to consume the
resource class APIs.

I tried to run the Traits API in the Pike set up too. I am not able to run
any Traits API.

As per the OpenStack doc, the Placement API URL is a base URL for
Traits also. I am able to run the Placement API as per the given doc, but
not able to run/access the Traits APIs. Getting a 404 (Not Found) error.


The /traits REST endpoint is part of the Placement API, yes.


As mentioned in the link below, the placement-manage os-traits
sync command is not working; it says command not found.


This means you have not installed (or updated) packages for Pike.


https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/resource-provider-traits.html

Pike – Placement API version is 1.0 to 1.10

Ocata – Placement API version is 1.0 to 1.4

We got 404 only. It seems there is a disconnect between Placement and
Traits. Need to understand whether we are missing any configuration.


You do not have Pike installed. You have Ocata installed. You need to upgrade 
to Pike.

Best,
-jay




___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-03 Thread Jay Pipes
Against the Pike placement API endpoint, make sure you send the 
following HTTP header:


OpenStack-API-Version: placement 1.10

Best,
-jay

On 10/03/2017 02:01 PM, Ramu, MohanX wrote:

Please refer attached original one.


-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 10:03 PM
To: Ramu, MohanX <mohanx.r...@intel.com>; openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 12:12 PM, Ramu, MohanX wrote:

Thanks for reply Jay.

No Jay,

I have installed Pike. There also I face the same problem.


No, you haven't installed Pike (or at least not properly). Otherwise, the 
max_version returned from the Pike placement API would be 1.10, not 1.4.

Best,
-jay


Thanks & Regards,

Mohan Ramu
-Original Message-----
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 9:26 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 11:34 AM, Ramu, MohanX wrote:

Hi,

We have implemented the OpenStack Ocata and Pike releases, and are able to
consume the Placement resource providers API, but not able to consume the
resource class APIs.

I tried to run the Traits API in the Pike set up too. I am not able to run
any Traits API.

As per the OpenStack doc, the Placement API URL is a base URL for
Traits also. I am able to run the Placement API as per the given doc, but
not able to run/access the Traits APIs. Getting a 404 (Not Found) error.


The /traits REST endpoint is part of the Placement API, yes.


As mentioned in the link below, the placement-manage os-traits
sync command is not working; it says command not found.


This means you have not installed (or updated) packages for Pike.


https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/resource-provider-traits.html

Pike – Placement API version is 1.0 to 1.10

Ocata – Placement API version is 1.0 to 1.4

We got 404 only. It seems there is a disconnect between Placement and
Traits. Need to understand whether we are missing any configuration.


You do not have Pike installed. You have Ocata installed. You need to upgrade 
to Pike.

Best,
-jay




___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-05 Thread Jay Pipes

On 10/05/2017 05:26 AM, Ramu, MohanX wrote:

Hi Jay,

I want to create a custom trait with a property value attached to it.

For example:

I want to have a CUSTOM_xyz trait with "status": "true".


That's not a trait :) That's a status indicator.

Traits are simple string tags that represent a single-valued thing or 
capability. A status is a multi-valued field.



When the CUSTOM_xyz trait is associated with a resource provider, I should be 
able to see whether the status value is true or not.


A trait is either associated with a resource provider or it isn't. When 
you do a call to `GET /resource_providers/{rp_uuid}/traits` what is 
returned is a list of the traits the resource provider with UUID 
{rp_uuid} has associated with it.



I referred to the link below to create custom traits, but was not able to create them.

https://specs.openstack.org/openstack/nova-specs/specs/pike/implemented/resource-provider-traits.html

PUT /resource_providers/{uuid}/traits

This API is to associate traits with specified resource provider. All the 
associated traits will be replaced by the traits specified in the request body. 
Nova-compute will report the compute node traits through this API.

The body of the request must match the following JSONSchema document:

{
 "type": "object",
 "properties": {
 "traits": {
 "type": "array",
 "items": CUSTOM_TRAIT
 },
 "resource_provider_generation": {
 "type": "integer"
 }
 },
 'required': ['traits', 'resource_provider_generation'],
 'additionalProperties': False
}


I suspect the issue you're having is that you need to create the custom 
trait first and *then* associate that trait with one or more resource 
providers.


To create the trait, do:

PUT /traits/CUSTOM_XYZ

and then associate it to a resource provider by doing:

PUT /resource_providers/{rp_uuid}/traits
{
  "resource_provider_generation": 1,
  "traits": [
 "CUSTOM_XYZ"
  ]
}
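As a concrete usage sketch of those two calls (the endpoint, token and $RP_UUID are placeholders):

# create the custom trait; returns 201 on creation, 204 if it already exists
$ curl -X PUT -H "X-Auth-Token: $TOKEN" \
    -H "OpenStack-API-Version: placement 1.10" \
    http://controller:8778/traits/CUSTOM_XYZ

# associate it with a resource provider (the generation must match the provider's current one)
$ curl -X PUT -H "X-Auth-Token: $TOKEN" \
    -H "OpenStack-API-Version: placement 1.10" \
    -H "Content-Type: application/json" \
    -d '{"resource_provider_generation": 1, "traits": ["CUSTOM_XYZ"]}' \
    http://controller:8778/resource_providers/$RP_UUID/traits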

BTW, a great place to see examples of both good and bad API usage is to 
check out the Gabbit functional API tests for the placement API. Here is 
the set of tests for the traits functionality:


https://github.com/openstack/nova/blob/master/nova/tests/functional/api/openstack/placement/gabbits/traits.yaml

Best,
-jay



Thanks & Regards,

Mohan Ramu
-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Wednesday, October 4, 2017 7:06 PM
To: Ramu, MohanX <mohanx.r...@intel.com>; openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

Rock on :)

On 10/04/2017 09:33 AM, Ramu, MohanX wrote:

Thank you so much Jay. After adding this header, working fine.

-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 11:36 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

Against the Pike placement API endpoint, make sure you send the following HTTP 
header:

OpenStack-API-Version: placement 1.10

Best,
-jay

On 10/03/2017 02:01 PM, Ramu, MohanX wrote:

Please refer attached original one.


-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 10:03 PM
To: Ramu, MohanX <mohanx.r...@intel.com>;
openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 12:12 PM, Ramu, MohanX wrote:

Thanks for reply Jay.

No Jay,

I have installed Pike. There also I face the same problem.


No, you haven't installed Pike (or at least not properly). Otherwise, the 
max_version returned from the Pike placement API would be 1.10, not 1.4.

Best,
-jay


Thanks & Regards,

Mohan Ramu
-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 9:26 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 11:34 AM, Ramu, MohanX wrote:

Hi,

We have implemented the OpenStack Ocata and Pike releases, and are able to
consume the Placement resource providers API, but not able to consume the
resource class APIs.

I tried to run the Traits API in the Pike set up too. I am not able to run
any Traits API.

As per the OpenStack doc, the Placement API URL is a base URL for
Traits also. I am able to run the Placement API as per the given doc, but
not able to run/access the Traits APIs. Getting a 404 (Not Found) error.


The /traits REST endpoint is part of the Placement API, yes.


As mentioned in the link below, the placement-manage os-traits
sync command is not working; it says command not found.


This means you have not installed (or updated) packages for Pike.


https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/resource-provider-traits.html

Pike – Placement API version is 1.0 to 1.10

Ocata – Placement API version is 1.0 to 1.4

We got 404 only. It seems there is a disconnect between Placement and Traits.

Re: [Openstack] [OpenStack] Can Mitaka RamFilter support free hugepages?

2017-09-06 Thread Jay Pipes

On 09/06/2017 01:21 AM, Weichih Lu wrote:

Thanks for your response.

Does this mean that if I want to create an instance with a flavor of 16G memory 
(hw:mem_page_size=large), I need to reserve more than 16GB of memory?

This instance consumes hugepages resources.


You need to reserve fewer than 50 1GB huge pages if you want to launch a 
16GB instance on a host with 64GB of RAM. Try reserving 32 1GB huge pages.
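For example, reserving 32 1GB pages at boot and checking what is left for normal memory afterwards looks roughly like this (a sketch; the exact GRUB handling depends on the distribution):

# kernel command line, e.g. via GRUB_CMDLINE_LINUX in /etc/default/grub
default_hugepagesz=1G hugepagesz=1G hugepages=32

# after rebooting, verify the split between huge pages and normal memory
$ grep -i hugepages /proc/meminfo
$ free -g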


Best,
-jay

2017-09-06 1:47 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:


Please remember to add a topic [nova] marker to your subject line.
Answer below.

On 09/05/2017 04:45 AM, Weichih Lu wrote:

Dear all,

I have a compute node with 64GB RAM. And I set 50 hugepages with
1GB hugepage size. I used the command "free"; it shows free memory
is about 12GB. And free hugepages is 50.


Correct. By assigning hugepages, you use the memory allocated to the
hugepages.

Then I launch an instance with 16GB memory, with the flavor tag
hw:mem_page_size=large. It shows Error: No valid host was found.
There are not enough hosts available.


Right, because you have only 12G of RAM available after
creating/allocating 50G out of your 64G.

Huge pages are entirely separate from the normal memory that a
flavor consumes. The 16GB memory in your flavor is RAM consumed on
the host. The huge pages are individual things that are consumed by
the NUMA topology that your instance will take. RAM != huge pages.
Totally different things.

And I check the nova-scheduler log. My compute node is removed by
RamFilter. I can launch an instance with 8GB memory successfully,
or I can launch an instance with 16GB memory successfully by
removing RamFilter.


That's because RamFilter doesn't deal with huge pages. Because huge
pages are a different resource than memory. The page itself is the
resource.

The NUMATopologyFilter is the scheduler filter that evaluates the
huge page resources on a compute host and determines if there
are enough *pages* available for the instance. Note that I say
*pages* because the unit of resource consumption for huge pages is
not MB of RAM. It's a single memory page.

Please read this excellent article by Steve Gordon for information
on what NUMA and huge pages are and how to use them in Nova:


http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/


Best,
-jay

Does RamFilter only check free memory but not free hugepages?
How can I solve this problem?

I use openstack mitaka version.

thanks

WeiChih, Lu.

Best Regards.









___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [OpenStack] Can Mitaka RamFilter support free hugepages?

2017-09-06 Thread Jay Pipes

Sahid, Stephen, what are your thoughts on this?

On 09/06/2017 10:17 PM, Yaguang Tang wrote:
I think the fact that RamFilter can't deal with huge pages is a bug. Due to 
this limit, we have to set a balance between normal memory and 
huge pages to use RamFilter and NUMATopologyFilter. What do you think, Jay?



On Wed, Sep 6, 2017 at 9:22 PM, Jay Pipes <jaypi...@gmail.com> wrote:


On 09/06/2017 01:21 AM, Weichih Lu wrote:

Thanks for your response.

Is this mean if I want to create an instance with flavor: 16G
memory (hw:mem_page_size=large), I need to preserve memory more
than 16GB ?
This instance consume hugepages resource.


You need to reserve fewer 1GB huge pages than 50 if you want to
launch a 16GB instance on a host with 64GB of RAM. Try reserving 32
1GB huge pages.

Best,
-jay

2017-09-06 1:47 GMT+08:00 Jay Pipes <jaypi...@gmail.com
<mailto:jaypi...@gmail.com> <mailto:jaypi...@gmail.com
<mailto:jaypi...@gmail.com>>>:


 Please remember to add a topic [nova] marker to your
subject line.
 Answer below.

 On 09/05/2017 04:45 AM, Weichih Lu wrote:

 Dear all,

 I have a compute node with 64GB ram. And I set 50
hugepages wiht
 1GB hugepage size. I used command "free", it shows free
memory
 is about 12GB. And free hugepages is 50.


 Correct. By assigning hugepages, you use the memory
allocated to the
 hugepages.

 Then I launch an instance with 16GB memory, set flavor
tag :
 hw:mem_page_size=large. It show Error: No valid host
was found.
 There are not enough hosts available.


 Right, because you have only 12G of RAM available after
 creating/allocating 50G out of your 64G.

 Huge pages are entirely separate from the normal memory that a
 flavor consumes. The 16GB memory in your flavor is RAM
consumed on
 the host. The huge pages are individual things that are
consumed by
 the NUMA topology that your instance will take. RAM != huge
pages.
 Totally different things.

   And I check nova-scheduler log. My

 compute is removed by RamFilter. I can launch an
instance with
 8GB memory successfully, or I can launch an instance
with 16GB
 memory sucessfully by remove RamFilter.


 That's because RamFilter doesn't deal with huge pages.
Because huge
 pages are a different resource than memory. The page itself
is the
 resource.

 The NUMATopologyFilter is the scheduler filter that
evaluates the
 huge page resources on a compute host and determines if the
there
 are enough *pages* available for the instance. Note that I say
 *pages* because the unit of resource consumption for huge
pages is
 not MB of RAM. It's a single memory page.

 Please read this excellent article by Steve Gordon for
information
 on what NUMA and huge pages are and how to use them in Nova:


http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/

<http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/>

<http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/


<http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/>>

 Best,
 -jay

 Does RamFilter only check free memory but not free
hugepages?
 How can I solve this problem?

 I use openstack mitaka version.

 thanks

 WeiChih, Lu.

 Best Regards.


 ___
 Mailing list:
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
<http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack>

<http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

<http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack>>
 Post to : openstack@lists.openstack.org
<mailto:openstack@lists.openstack.org>
 <mailto:openstack@lists.openstack.org
<mailto:openstack@lists.openstack.org>>
 Unsubscribe :
http://lists.openstack.org/cgi-b

Re: [Openstack] [OpenStack] Can Mitaka RamFilter support free hugepages?

2017-09-05 Thread Jay Pipes
Please remember to add a topic [nova] marker to your subject line. 
Answer below.


On 09/05/2017 04:45 AM, Weichih Lu wrote:

Dear all,

I have a compute node with 64GB RAM. And I set 50 hugepages with 1GB 
hugepage size. I used the command "free"; it shows free memory is about 
12GB. And free hugepages is 50.


Correct. By assigning hugepages, you use the memory allocated to the 
hugepages.


Then I launch an instance with 16GB memory, with the flavor tag 
hw:mem_page_size=large. It shows Error: No valid host was found. There 
are not enough hosts available.


Right, because you have only 12G of RAM available after 
creating/allocating 50G out of your 64G.


Huge pages are entirely separate from the normal memory that a flavor 
consumes. The 16GB memory in your flavor is RAM consumed on the host. 
The huge pages are individual things that are consumed by the NUMA 
topology that your instance will take. RAM != huge pages. Totally 
different things.


And I check the nova-scheduler log. My
compute node is removed by RamFilter. I can launch an instance with 8GB 
memory successfully, or I can launch an instance with 16GB memory 
successfully by removing RamFilter.


That's because RamFilter doesn't deal with huge pages. Because huge 
pages are a different resource than memory. The page itself is the resource.


The NUMATopologyFilter is the scheduler filter that evaluates the huge 
page resources on a compute host and determines if there are enough 
*pages* available for the instance. Note that I say *pages* because the 
unit of resource consumption for huge pages is not MB of RAM. It's a 
single memory page.
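For completeness, the flavor side of a huge-pages request looks roughly like this (a sketch; the flavor name is a placeholder, and the explicit page size is expressed in KB):

$ openstack flavor set m1.hugepages --property hw:mem_page_size=large
# or request 1GB pages explicitly:
$ openstack flavor set m1.hugepages --property hw:mem_page_size=1048576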


Please read this excellent article by Steve Gordon for information on 
what NUMA and huge pages are and how to use them in Nova:


http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/

Best,
-jay


Does RamFilter only check free memory but not free hugepages?
How can I solve this problem?

I use openstack mitaka version.

thanks

WeiChih, Lu.

Best Regards.





___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] extend attached volumes

2017-09-26 Thread Jay Pipes

Detach the volume, then resize it, then re-attach.
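In CLI terms that workflow is roughly (a sketch; the server/volume names and the new size are placeholders):

$ openstack server remove volume my-server my-volume
$ openstack volume set --size 40 my-volume
$ openstack server add volume my-server my-volume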

Best,
-jay

On 09/26/2017 09:22 AM, Volodymyr Litovka wrote:

Colleagues,

can't find ways to resize attached volume. I'm on Pike.

As far as I understand, it required to be supported in Nova, because 
Cinder need to check with Nova whether it's possible to extend this volume.


Well,
- Nova's API microversion is 2.51, which seems to be enough to support 
"volume-extended" API call
- Properties of image are *hw_disk_bus='scsi'* and 
*hw_scsi_model='virtio-scsi'*, type bare/raw, located in Cinder

- hypervisor is KVM
- volume is bootable, mounted as root, created as snapshot from Cinder 
volume

- Cinder's backend is CEPH/Bluestore

and both "cinder extend" and "openstack volume set --size" returns 
"Volume status must be '{'status': 'available'}' to extend, currently 
in-use".


I did not find any configuration options neither in nova nor in cinder 
config files, which can help with this functionality.


What I'm doing wrong?

Thank you.

--
Volodymyr Litovka
   "Vision without Execution is Hallucination." -- Thomas Edison






___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] extend attached volumes

2017-09-26 Thread Jay Pipes

On 09/26/2017 10:20 AM, Volodymyr Litovka wrote:

Hi Jay,

I know about this way :-) but Pike introduced ability to resize attached 
volumes:


"It is now possible to signal and perform an online volume size change 
as of the 2.51 microversion using the|volume-extended|external event. 
Nova will perform the volume extension so the host can detect its new 
size. It will also resize the device in QEMU so instance can detect the 
new disk size without rebooting." -- 
https://docs.openstack.org/releasenotes/nova/pike.html


Apologies, Volodymyr, I wasn't aware of that ability!
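For anyone else landing here: the online path also appears to need a newer Cinder API microversion; something like the following should exercise it (a sketch, assuming python-cinderclient and Cinder API 3.42 or later, which is what allows extending an in-use volume):

$ cinder --os-volume-api-version 3.42 extend <volume-id> 40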

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-03 Thread Jay Pipes

On 10/03/2017 12:12 PM, Ramu, MohanX wrote:

Thanks for reply Jay.

No Jay,

I have installed Pike. There also I face the same problem.


No, you haven't installed Pike (or at least not properly). Otherwise, 
the max_version returned from the Pike placement API would be 1.10, not 1.4.


Best,
-jay


Thanks & Regards,

Mohan Ramu
-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, October 3, 2017 9:26 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] Traits is not working

On 10/03/2017 11:34 AM, Ramu, MohanX wrote:

Hi,

We have implemented the OpenStack Ocata and Pike releases, and are able to consume
the Placement resource providers API, but not able to consume the resource class APIs.

I tried to run the Traits API in the Pike set up too. I am not able to run any
Traits API.

As per the OpenStack doc, the Placement API URL is a base URL for
Traits also. I am able to run the Placement API as per the given doc, but not
able to run/access the Traits APIs. Getting a 404 (Not Found) error.


The /traits REST endpoint is part of the Placement API, yes.


As mentioned in the link below, the placement-manage os-traits
sync command is not working; it says command not found.


This means you have not installed (or updated) packages for Pike.


https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/resource-provider-traits.html

Pike – Placement API version is 1.0 to 1.10

Ocata – Placement API version is 1.0 to 1.4

We got 404 only. It seems there is a disconnect between Placement and
Traits. Need to understand whether we are missing any configuration.


You do not have Pike installed. You have Ocata installed. You need to upgrade 
to Pike.

Best,
-jay




___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits is not working

2017-10-03 Thread Jay Pipes

On 10/03/2017 11:34 AM, Ramu, MohanX wrote:

Hi,

We have implemented the OpenStack Ocata and Pike releases, and are able to consume 
the Placement resource providers API, but not able to consume the resource class APIs.


I tried to run the Traits API in the Pike set up too. I am not able to run any 
Traits API.


As per the OpenStack doc, the Placement API URL is a base URL for 
Traits also. I am able to run the Placement API as per the given doc, but not 
able to run/access the Traits APIs. Getting a 404 (Not Found) error.


The /traits REST endpoint is part of the Placement API, yes.

As mentioned in the link below, the placement-manage os-traits sync command 
is not working; it says command not found.


This means you have not installed (or updated) packages for Pike.


https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/resource-provider-traits.html

Pike – Placement API version is 1.0 to 1.10

Ocata – Placement API version is 1.0 to 1.4

We got 404 only. It seems there is a disconnect between Placement and 
Traits. Need to understand whether we are missing any configuration.


You do not have Pike installed. You have Ocata installed. You need to 
upgrade to Pike.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Ocata The placement API endpoint not found on Ubuntu

2017-08-21 Thread Jay Pipes

On 08/18/2017 08:50 AM, Divneet Singh wrote:
Hello, I have been trying to install Ocata on Ubuntu 16.04; for the time 
being I have 2 nodes. I just can't figure this out.


I have set up the Placement API, but I get an error after restarting the
nova services or rebooting:


" 017-08-18 08:27:41.496 1422 WARNING nova.scheduler.client.report 
[req-17911703-827e-402d-85e8-a0bb25003fe3 - - - - -] The placement API 
endpoint not found. Placement is optional in Newton, but required in 
Ocata. Please enable the placement service before upgrading.  "


And on the controller node when I run the command .
openstack@controller:~$ sudo nova-status upgrade check
+--------------------------------------------------------------------+
| Upgrade Check Results                                              |
+--------------------------------------------------------------------+
| Check: Cells v2                                                    |
| Result: Success                                                    |
| Details: None                                                      |
+--------------------------------------------------------------------+
| Check: Placement API                                               |
| Result: Failure                                                    |
| Details: Placement API endpoint not found.                         |
+--------------------------------------------------------------------+
| Check: Resource Providers                                          |
| Result: Warning                                                    |
| Details: There are no compute resource providers in the Placement  |
|   service but there are 1 compute nodes in the deployment.         |
|   This means no compute nodes are reporting into the               |
|   Placement service and need to be upgraded and/or fixed.          |
|   See http://docs.openstack.org/developer/nova/placement.html      |
|   for more details.                                                |
+--------------------------------------------------------------------+

I followed the Ocata guide given in the documentation to the letter.

After a feedback i got , just to make sure placement service configured 
in the service catalog:

$ openstack catalog show placement
+-----------+------------------------------------+
| Field     | Value                              |
+-----------+------------------------------------+
| endpoints | RegionOne                          |
|           |   admin: http://controller:8778    |
|           | RegionOne                          |
|           |   public: http://controller:8778   |
|           | RegionOne                          |
|           |   internal: http://controller:8778 |
|           |                                    |
| id        | 825f1a56d9a4438d9f54d893a7b227c0   |
| name      | placement                          |
| type      | placement                          |
+-----------+------------------------------------+

$ export TOKEN=$(openstack token issue -f value -c id)
$ curl -H "x-auth-token: $TOKEN" $PLACEMENT
{"versions": [{"min_version": "1.0", "max_version": "1.4", "id": "v1.0"}]}

I think this means that the Placement service is configured correctly.

Do I need to configure a web server on the compute node?


No, you definitely do not need to configure a web server on the compute 
node.


My guess is that the [keystone_authtoken] section of your nova.conf file 
on either or both of the controller and compute nodes is not correct or 
doesn't match what you have in your rc file for the openstack client.


The nova-status command and the service daemons in Nova do not get their 
connection information from the rc file that the openstack client uses. 
Instead, they look in the [keystone_authtoken] section of the nova.conf 
files.


So, make sure that your [keystone_authtoken] section of nova.conf files 
contain proper information according to this documentation:


https://docs.openstack.org/ocata/config-reference/compute/nova-conf-samples.html
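As a rough illustration only (host names and passwords are placeholders; follow the linked samples for the authoritative layout), the relevant sections usually end up looking like:

[keystone_authtoken]
auth_uri = http://controller:5000
auth_url = http://controller:35357
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS

[placement]
os_region_name = RegionOne
auth_type = password
auth_url = http://controller:35357/v3
project_domain_name = Default
user_domain_name = Default
project_name = service
username = placement
password = PLACEMENT_PASS

The nova-compute and nova-scheduler services look up the placement endpoint through the [placement] section, so that is worth double-checking alongside [keystone_authtoken].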

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits for filter

2017-11-16 Thread Jay Pipes

On 11/16/2017 12:06 AM, Ramu, MohanX wrote:

Hi All,

I have a use case where I need to apply a filter (custom traits) while the 
Placement API fetches the resource providers for launching an instance.


This is so that I can have a list of resource providers which meet my 
condition/filter/validation. The validation is nothing but trust about 
the host (compute node) where I am going to launch the instances.


The link below says that it is possible, but I don't have an idea of how to 
implement/test this scenario.


https://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/resource-providers-scheduler-db-filters.html

we would rather make a HTTP call to the placement API on a specific REST 
resource with a request that would return the list of resource 
providers’ UUIDs that would match requested resources and traits 
criterias based on the original RequestSpec object.


Unfortunately, you're going to need to wait for this to be possible with 
the placement API. We're making progress here, but it's not complete yet.


You won't be using a custom filter (or any filter at all in the 
nova-scheduler). Rather, you'll simply have the required trait in the 
image or flavor and nova-scheduler will ask placement API for all 
providers that have the required traits and requested resource amounts.


We're probably 3-4 weeks away from having this code merged.

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Traits for filter

2017-11-17 Thread Jay Pipes

On 11/17/2017 01:09 AM, Ramu, MohanX wrote:

Thank you Jay.

I am trying to understand the usage of custom traits. What I mean is:

I have a custom trait called "CUSTOM_ABC" which is associated with resource provider 
"Resource provider-1", so I can launch an instance whose flavor/image is associated with the 
same custom trait (CUSTOM_ABC) only on the resource provider "Resource provider-1".


As mentioned in my response below, we're currently working on adding 
this functionality to Nova for the Queens release. The work is in this 
patch series:


https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:validate_provider_summaries

You will need to wait for the Queens release for the complete 
traits-based scheduling functionality to be operational.


Best,
-jay


-----Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Thursday, November 16, 2017 9:10 PM
To: Ramu, MohanX <mohanx.r...@intel.com>
Cc: openstack@lists.openstack.org
Subject: Re: Traits for filter

On 11/16/2017 12:06 AM, Ramu, MohanX wrote:

Hi All,

I have a use case that I  need to apply some filter (Custom traits)
while Placement API fetch the resource providers for launching instance.

So that I can have list of resource provided which meets my
condition/filter/validation. The validation is nothing but trust about
the Host(compute node) where I am going to launch the instances.

The below link says that it is possible, don't have idea how to
implement/test this scenario.

https://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/resource-providers-scheduler-db-filters.html

we would rather make a HTTP call to the placement API on a specific
REST resource with a request that would return the list of resource
providers' UUIDs that would match requested resources and traits
criterias based on the original RequestSpec object.


Unfortunately, you're going to need to wait for this to be possible with the 
placement API. We're making progress here, but it's not complete yet.

You won't be using a custom filter (or any filter at all in the 
nova-scheduler). Rather, you'll simply have the required trait in the image or 
flavor and nova-scheduler will ask placement API for all providers that have 
the required traits and requested resource amounts.

We're probably 3-4 weeks away from having this code merged.

Best,
-jay



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Flavor metadata quota doesn't work

2017-12-01 Thread Jay Pipes

On 12/01/2017 08:57 AM, si...@turka.nl wrote:

Hi,

I have created a flavor with the following metadata:
quota:disk_write_bytes_sec='10240'

This should limit writes to the disk to 10240 bytes per second (10 KB/s). I also tried it
with a higher number (100 MB/s).

Using the flavor I have launched an instance and ran a write speed test.

For an unknown reason, the metadata seems to be ingored, since I can write
with 500+ MB/s to the disk:

[centos@vmthresholdtest ~]$ dd if=/dev/zero of=file.bin bs=100M count=15
conv=fdatasync
15+0 records in
15+0 records out
1572864000 bytes (1,6 GB) copied, 2,78904 s, 564 MB/s
[centos@vmthresholdtest ~]$

Running Newton.


Yeah, that functionality doesn't work. Really, not sure if it ever did.

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Compute node on bare metal?

2018-07-02 Thread Jay Pipes

On 07/02/2018 09:45 AM, Houssam ElBouanani wrote:

Hi,

I have recently finished installing a minimal OpenStack Queens 
environment for a school project, and was asked whether it is possible 
to deploy an additional compute node on bare metal, aka without an 
underlying operating system, in order to eliminate the operating system 
overhead and thus to maximize performance.


Whomever asked you about this must be confusing a *hypervisor* with an 
operating system. Using baremetal means you eliminate the overhead of 
the *hypervisor* (virtualization). It doesn't mean you eliminate the 
operating system. You can't do much of anything with a baremetal machine 
that has no operating system on it.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread Jay Pipes

On 05/02/2018 02:43 PM, Torin Woltjer wrote:

I am working on setting up Openstack for HA and one of the last orders of
business is getting HA behavior out of the compute nodes.


There is no HA behaviour for compute nodes.


Is there a project that will automatically evacuate instances from a
downed or failed compute host, and automatically reboot them on their
new host?

Check out Masakari:

https://wiki.openstack.org/wiki/Masakari


I'm curious what suggestions people have about this, or whatever
advice you might have. Is there a best way of getting this
functionality, or anything else I should be aware of?


You are referring to HA of workloads running on compute nodes, not HA of 
compute nodes themselves.


My advice would be to install Kubernetes on one or more VMs (with the 
VMs acting as Kubernetes nodes) and use that project's excellent 
orchestrator for daemonsets/statefulsets which is essentially the use 
case you are describing.


The OpenStack Compute API (implemented in Nova) is not an orchestration 
API. It's a low-level infrastructure API for executing basic actions on 
compute resources.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [masakari] HA Compute & Instance Evacuation

2018-05-02 Thread Jay Pipes

On 05/02/2018 04:39 PM, Torin Woltjer wrote:

 > There is no HA behaviour for compute nodes.
 >
 > You are referring to HA of workloads running on compute nodes, not HA of
 > compute nodes themselves.
It was a mistake for me to say HA when referring to compute and 
instances. Really I want to avoid a situation where one of my compute 
hosts gives up the ghost, and all of the instances are offline until 
someone reboots them on a different host. I would like them to 
automatically reboot on a healthy compute node.


 > Check out Masakari:
 >
 > https://wiki.openstack.org/wiki/Masakari
This looks like the kind of thing I'm searching for.

I'm seeing 3 components here, I'm assuming one goes on compute hosts and 
one or both of the others go on the control nodes?


I don't believe anything goes on the compute nodes, no. I'm pretty sure 
the Masakari API service and engine workers live on controller nodes.



Is there any documentation outlining the procedure for deploying
this? Will there be any problem running the Masakari API service on 2
machines simultaneously, sitting behind HAProxy?
Not sure. I'll leave it up to the Masakari developers to help out here. 
I've added [masakari] topic to the subject line.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Meaning of each field of 'hypervisor stats show' command.

2018-01-17 Thread Jay Pipes

On 01/17/2018 12:46 PM, Jorge Luiz Correa wrote:
Hi, I would like some help understanding what each field means in the 
output of the command 'openstack hypervisor stats show':


it's an amalgamation of legacy information that IMHO should be 
deprecated from the Compute API.


FWIW, the "implementation" for this API response is basically just a 
single SQL statement issued against each Nova cell DB:


https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L755


$ openstack hypervisor stats show
+--+-+
| Field| Value   |
+--+-+
| count| 5   |


number of hypervisor hosts in the system that are not disabled.


| current_workload | 0   |


The SUM of active boot/reboot/migrate/resize operations going on for all 
the hypervisor hosts.


What actions represent "workload"? See here:

https://github.com/openstack/nova/blob/master/nova/compute/stats.py#L45


| disk_available_least | 1848|


who knows? it's dependent on the virt driver and the disk image backing 
file and about as reliable as a one-armed guitar player.



| free_disk_gb | 1705|


theoretically should be sum(local_gb - local_gb_used) for all hypervisor 
hosts.



| free_ram_mb  | 2415293 |


theoretically should be sum(memory_mb - memory_mb_used) for all 
hypervisor hosts.



| local_gb | 2055|


amount of space, in GB, available for ephemeral disk images on the 
hypervisor hosts. if shared storage is used, this value is as useful as 
having two left feet.



| local_gb_used| 350 |


the amount of storage used for ephemeral disk images of instances on the 
hypervisor hosts. if the instances are boot-from-volume, this number is 
about as valuable as a three-dollar bill.



| memory_mb| 2579645 |


the total amount of RAM the hypervisor hosts have. this does not take 
into account the amount of reserved memory the host might have configured.



| memory_mb_used   | 164352  |


the total amount of memory allocated to guest VMs on the hypervisor hosts.


| running_vms  | 13  |


the total number of VMs on all the hypervisor hosts that are NOT in the 
DELETED or SHELVED_OFFLOADED states.


https://github.com/openstack/nova/blob/master/nova/compute/vm_states.py#L78


| vcpus| 320 |


total amount of physical CPU core-threads across all hypervisor hosts.


| vcpus_used   | 75  |
+--+-+


total number of vCPUs allocated to guests (regardless of VM state) 
across the hypervisor hosts.


Best,
-jay



Could anyone indicate the documentation that explains each one? Some of 
them are clear but others are not.


Thanks!

- JLC





___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Production deployment mirantis vs tripleO

2018-01-15 Thread Jay Pipes

On 01/15/2018 12:58 PM, Satish Patel wrote:

But Fuel is an active project, isn't it?

https://docs.openstack.org/fuel-docs/latest/


No, it is no longer developed or supported.

-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack] [ironic] Does Ironic support that different nova-compute map to different ironic endpoint?

2018-01-02 Thread Jay Pipes

On 01/02/2018 06:09 AM, Guo James wrote:

Hi guys
I know that Ironic has support multi-nova-compute.
But I am not sure whether OpenStack support the situation than every 
nova-compute has a unshare ironic
And these ironic share a nova and a neutron


I'm not quite following you... what do you mean by "has a unshare ironic"?

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [openstack] [ironic] Does Ironic support that different nova-compute map to different ironic endpoint?

2018-01-02 Thread Jay Pipes

On 01/02/2018 09:10 AM, Guo James wrote:

I mean that there are two nova-computes in an OpenStack environment.
Each nova-compute is configured to map to bare metal.
They communicate with different Ironic endpoints.


I see. So, two different ironic-api service endpoints.


That means there are two Ironics, one Nova, and one Neutron in an OpenStack environment.

Does everything go well?


Sure, that should work just fine.

Best,
-jay


Thanks


-Original Message-
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday, January 02, 2018 8:59 PM
To: openstack@lists.openstack.org
Subject: Re: [Openstack] [openstack] [ironic] Does Ironic support that different
nova-compute map to different ironic endpoint?

On 01/02/2018 06:09 AM, Guo James wrote:

Hi guys
I know that Ironic has support multi-nova-compute.
But I am not sure whether OpenStack support the situation than every
nova-compute has a unshare ironic And these ironic share a nova and a
neutron


I'm not quite following you... what do you mean by "has a unshare ironic"?

Best,
-jay



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Log files on exceeding cpu allocation limit

2018-08-08 Thread Jay Pipes

On 08/08/2018 09:37 AM, Cody wrote:

On 08/08/2018 07:19 AM, Bernd Bausch wrote:

I would think you don't even reach the scheduling stage. Why bother
looking for a suitable compute node if you exceeded your quota anyway?

The message is in the conductor log because it's the conductor that does
most of the work. The others are just slackers (like nova-api) or wait
for instructions from the conductor.

The above is my guess, of course, but IMHO a very educated one.

Bernd.


Thank you, Bernd. I didn't know the inner workflow in this case.
Initially, I thought it was for the scheduler to discover that no more
resource was left available, hence I expected to see something from
the scheduler log. My understanding now is that the quota get checked
in the database prior to the deployment. That would explain why the
clue was in the nova-conductor.log, not the nova-scheduler.log.


Quota is checked in the nova-api node, not the nova-conductor.
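For completeness, both sides can be inspected from the CLI (a sketch):

$ openstack limits show --absolute   # per-project usage against quota (enforced by nova-api)
$ openstack quota show               # configured quota limits
$ openstack hypervisor stats show    # rough capacity view: vcpus vs. vcpus_used across hosts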

As I said in my previous message, unless you paste what the logs are 
that you are referring to, it's not possible to know what you are 
referring to.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Log files on exceeding cpu allocation limit

2018-08-08 Thread Jay Pipes
py", line
139, in select_destinations
2018-08-08 09:28:35.974 1648 ERROR nova.conductor.manager raise
exception.NoValidHost(reason="")
2018-08-08 09:28:35.974 1648 ERROR nova.conductor.manager
2018-08-08 09:28:35.974 1648 ERROR nova.conductor.manager NoValidHost:
No valid host was found.
2018-08-08 09:28:35.974 1648 ERROR nova.conductor.manager
2018-08-08 09:28:35.974 1648 ERROR nova.conductor.manager
2018-08-08 09:28:36.328 1648 WARNING nova.scheduler.utils
[req-ef0d8ea1-e801-483e-b913-9148a6ac5d90
2499343cbc7a4ca5a7f14c43f9d9c229 3850596606b7459d8802a72516991a19 -
default default] Failed to compute_task_build_instances: No valid host
was found.
Traceback (most recent call last):

   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py",
line 226, in inner
 return func(*args, **kwargs)

   File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py",
line 139, in select_destinations
 raise exception.NoValidHost(reason="")

NoValidHost: No valid host was found.
: NoValidHost_Remote: No valid host was found.
2018-08-08 09:28:36.331 1648 WARNING nova.scheduler.utils
[req-ef0d8ea1-e801-483e-b913-9148a6ac5d90
2499343cbc7a4ca5a7f14c43f9d9c229 3850596606b7459d8802a72516991a19 -
default default] [instance: b466a974-06ba-459b-bc04-2ccb2b3ee720]
Setting instance to ERROR state.: NoValidHost_Remote: No valid host
was found.
### END ###
On Wed, Aug 8, 2018 at 9:45 AM Jay Pipes  wrote:


On 08/08/2018 09:37 AM, Cody wrote:

On 08/08/2018 07:19 AM, Bernd Bausch wrote:

I would think you don't even reach the scheduling stage. Why bother
looking for a suitable compute node if you exceeded your quota anyway?

The message is in the conductor log because it's the conductor that does
most of the work. The others are just slackers (like nova-api) or wait
for instructions from the conductor.

The above is my guess, of course, but IMHO a very educated one.

Bernd.


Thank you, Bernd. I didn't know the inner workflow in this case.
Initially, I thought it was for the scheduler to discover that no more
resource was left available, hence I expected to see something from
the scheduler log. My understanding now is that the quota get checked
in the database prior to the deployment. That would explain why the
clue was in the nova-conductor.log, not the nova-scheduler.log.


Quota is checked in the nova-api node, not the nova-conductor.

As I said in my previous message, unless you paste what the logs are
that you are referring to, it's not possible to know what you are
referring to.

Best,
-jay


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Log files on exceeding cpu allocation limit

2018-08-07 Thread Jay Pipes

On 08/07/2018 10:57 AM, Cody wrote:

Hi everyone,

I intentionally triggered an error by launching more instances than is 
allowed by the ‘cpu_allocation_ratio’ set on a compute node. When it 
comes to logs, the only place that contained a clue to explain the launch 
failure was the nova-conductor.log on a controller node. Why is there 
no trace in the nova-scheduler.log (or any other logs) for this type of 
error?


Because it's not an error.

You exceeded the capacity of your resources, that's all.

Are you asking why there isn't a way to *check* to see whether a 
particular request to launch a VM (or multiple VMs) will exceed the 
capacity of your deployment?


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Log files on exceeding cpu allocation limit

2018-08-08 Thread Jay Pipes

On 08/08/2018 07:19 AM, Bernd Bausch wrote:

I would think you don't even reach the scheduling stage. Why bother
looking for a suitable compute node if you exceeded your quota anyway?

The message is in the conductor log because it's the conductor that does
most of the work. The others are just slackers (like nova-api) or wait
for instructions from the conductor.

The above is my guess, of course, but IMHO a very educated one.

Bernd.

On 8/8/2018 1:35 AM, Cody wrote:

Hi Jay,

Thank you for getting back to my question.

I agree that it is not an error; only a preset limit is reached. I
just wonder why this incident only got recorded in the
nova-conductor.log, but not in other files such as nova-scheduler.log,
which would make more sense to me. :-)


I gave up trying to answer this because the original poster did not 
include any information about an "error" in either the original post [1] 
or his reply.


So I have no idea what got recorded in the nova-conductor log at all.

Until I get some details I have no idea how to further answer the 
question (or even if there *is* a question still?).


[1] http://lists.openstack.org/pipermail/openstack/2018-August/046804.html


By the way, I am using the Queens release.

Regards,





___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova]Capacity discrepancy between command line and MySQL query

2018-08-28 Thread Jay Pipes

On 08/27/2018 09:40 AM, Risto Vaaraniemi wrote:

Hi,

I tried to migrate a guest to another host but it failed with a
message saying there's not enough capacity on the target host even
though the server should be nearly empty. The guest I'm trying to
move needs 4 cores, 4 GB of memory and 50 GB of disk. Each compute
node should have 20 cores, 128 GB RAM & 260 GB HD space.

When I check it with "openstack host show compute1" I see that there's
plenty of free resources. However, when I check it directly in MariaDB
nova_api or using Placement API calls I see different results i.e. not
enough cores & disk.

Is there a safe way to make the different registries / databases to
match? Can I just overwrite it using the Placement API?

I'm using Pike.

BR,
Risto

PS
I did make a few attempts to resize the guest that now runs on
compute1 but for some reason they failed and by default the resize
tries to restart the resized guest on a different host (compute1).
In the end I was able to do the resize on the same host (compute2).
I was wondering if the resize attempts messed up the compute1 resource
management.


Very likely, yes.

It's tough to say what exact sequence of resize and migrate commands 
have caused your inventory and allocation records in placement to become 
corrupted.


Have you tried restarting the nova-compute services on both compute 
nodes and seeing whether the placement service tries to adjust 
allocations upon restart?


Also, please check the logs on the nova-compute workers looking for any 
warnings or errors related to communication with placement.
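
If it helps with cross-checking, the inventory and usage records that placement 
holds for the compute node can also be read directly over its HTTP API and 
compared against what "openstack host show" reports. This is only a rough 
sketch; the placement endpoint URL, port and the resource provider UUID are 
placeholders for your deployment:

    # Placeholders: adjust the placement endpoint and compute1's
    # resource provider UUID to match your deployment.
    TOKEN=$(openstack token issue -f value -c id)
    PLACEMENT=http://controller:8778/placement
    RP_UUID=<compute1-resource-provider-uuid>

    # Inventories: totals, reserved amounts and allocation ratios
    curl -s -H "X-Auth-Token: $TOKEN" \
        "$PLACEMENT/resource_providers/$RP_UUID/inventories"

    # Usages: what placement believes is currently consumed
    curl -s -H "X-Auth-Token: $TOKEN" \
        "$PLACEMENT/resource_providers/$RP_UUID/usages"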


Best,
-jay


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] NUMA some of the time?

2018-07-16 Thread Jay Pipes

On 07/16/2018 10:30 AM, Toni Mueller wrote:


Hi Jay,

On Fri, Jul 06, 2018 at 12:46:04PM -0400, Jay Pipes wrote:

There is no current way to say "On this dual-Xeon compute node, put all
workloads that don't care about dedicated CPUs on this socket and all
workloads that DO care about dedicated CPUs on the other socket.".


it turned out that this is not what I should want to say. What I should
say instead is:

"Run all VMs on all cores, but if certain VMs suddenly spike, give them
all they ask for at the expense of everyone else, and also avoid moving
them around between cores, if possible."

The idea is that these high priority VMs are (probably) idle most of the
time, but at other times need high performance. It was thus deemed to be
a huge waste to reserve cores for them.


You're looking for something like VMWare DRS, then:

https://www.vmware.com/products/vsphere/drs-dpm.html

This isn't something Nova is looking to implement.

Best,
-jay


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Nova-scheduler: when are filters applied?

2018-08-30 Thread Jay Pipes

On 08/30/2018 10:54 AM, Eugen Block wrote:

Hi Jay,

You need to set your ram_allocation_ratio nova.CONF option to 1.0 if 
you're running into OOM issues. This will prevent overcommit of memory 
on your compute nodes.


I understand that, the overcommitment works quite well most of the time.

It just has been an issue twice when I booted an instance that had been 
shutdown a while ago. In the meantime there were new instances created 
on that hypervisor, and this old instance caused the OOM.


I would expect that with a ratio of 1.0 I would experience the same 
issue, wouldn't I? As far as I understand the scheduler only checks at 
instance creation, not when booting existing instances. Is that a 
correct assumption?


To echo what cfriesen said, if you set your allocation ratio to 1.0, the 
system will not overcommit memory. Shut down instances consume memory 
from an inventory management perspective. If you don't want any danger 
of an instance causing an OOM, you must set your ram_allocation_ratio to 1.0.


The scheduler doesn't really have anything to do with this.

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] NUMA some of the time?

2018-07-06 Thread Jay Pipes

Hi Tony,

The short answer is that you cannot do that today. Today, each Nova 
compute node is either "all in" for NUMA and CPU pinning or it's not.


This means that for resource-constrained environments like "The Edge!", 
there are not very good ways to finely divide up a compute node and make 
the most efficient use of its resources.


There is no current way to say "On this dual-Xeon compute node, put all 
workloads that don't care about dedicated CPUs on this socket and all 
workloads that DO care about dedicated CPUs on the other socket.".


That said, we have had lengthy discussions about tracking dedicated 
guest CPU resources and dividing up the available logical host 
processors into buckets for "shared CPU" and "dedicated CPU" workloads 
on the following spec:


https://review.openstack.org/#/c/555081/

It is not going to land in Rocky. However, we should be able to make 
good progress towards the goals in that spec in early Stein.


Best,
-jay

On 07/04/2018 11:08 AM, Toni Mueller wrote:


Hi,

I am still trying to figure how to best utilise the small set of
hardware, and discovered the NUMA configuration mechanism. It allows me
to configure reserved cores for certain VMs, but it does not seem to
allow me to say "you can share these cores, but VMs of, say, appropriate
flavour take precedence and will throw you off these cores in case they
need more power".

How can I achieve that, dynamically?

TIA!


Thanks,
Toni


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [diskimage-builder] Element pip-and-virtualenv failed to install pip

2018-10-08 Thread Jay Pipes

On 09/07/2018 03:46 PM, Hang Yang wrote:

Hi there,

I'm new to the DIB tool and ran into an issue when using the 2.16.0 DIB tool 
to build a CentOS-based image with the pip-and-virtualenv element. It failed 
at 
https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/pip-and-virtualenv/install.d/pip-and-virtualenv-source-install/04-install-pip#L78 
because it could not find the pip command.


I found that /tmp/get_pip.py was there but totally empty. I had to 
manually add a wget step to retrieve get_pip.py right before the 
failing step, and then it worked. But shouldn't get_pip.py be downloaded 
automatically by 
https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/pip-and-virtualenv/source-repository-pip-and-virtualenv 
? Does anyone know how this issue could happen? Thanks in advance for 
any help.


Hi Hang,

Are you using a package or a source-based installation for your dib? The 
reason I ask is because from the docs it seems that the installation 
procedure for pip is quite different depending on whether you're using a 
package or source-based install.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] VMs cannot fetch metadata

2018-11-06 Thread Jay Pipes

https://bugs.launchpad.net/neutron/+bug/1777640

Best,
-jay

On 11/06/2018 08:21 AM, Terry Lundin wrote:

Hi all,

I've been struggling with instances suddenly not being able to fetch 
metadata from Openstack Queens (this has worked fine earlier).


Newly created VMs fail to connect to the magic ip, eg. 
http://169.254.169.254/, and won't initialize properly. Subsequently ssh 
login will fail since no key is uploaded.


The symptom is failed requests in the log

*Cirros:*
Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending select for 10.0.0.18...
Lease of 10.0.0.18 obtained, lease time 86400
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "10.0.0.1"
cirros-ds 'net' up at 0.94
checkinghttp://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 0.94. request failed
failed 2/20: up 3.01. request failed
failed 3/20: up 5.03. request failed
failed 4/20: up 7.04. request failed

*...and on CentOS 6:*
ci-info: | Route |   Destination   | Gateway  |     Genmask     | Interface | Flags |
ci-info: +-------+-----------------+----------+-----------------+-----------+-------+
ci-info: |   0   | 169.254.169.254 | 10.0.0.1 | 255.255.255.255 |    eth0   |  UGH  |
ci-info: |   1   |     10.0.0.0    | 0.0.0.0  |  255.255.255.0  |    eth0   |   U   |
ci-info: |   2   |     0.0.0.0     | 10.0.0.1 |     0.0.0.0     |    eth0   |   UG  |
ci-info: +-------+-----------------+----------+-----------------+-----------+-------+
2018-11-06 08:10:07,892 - url_helper.py[WARNING]: Calling 
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: 
unexpected error ['NoneType' object has no attribute 'status_code']
2018-11-06 08:10:08,906 - url_helper.py[WARNING]: Calling 
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: 
unexpected error ['NoneType' object has no attribute 'status_code']
2018-11-06 08:10:09,925 - url_helper.py[WARNING]: Calling 
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2/120s]: 
unexpected error ['NoneType' object has no attribute
...

Using curl manually, e.g. 'curl http://169.254.169.254/openstack/', one 
gets:


curl: (52) Empty reply from server

*At the same time this error is showing up in the syslog on the controller:*

Nov  6 12:51:01 controller neutron-metadata-agent[3094]:   File 
"/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, 
in fire_timers

Nov  6 12:51:01 controller neutron-metadata-agent[3094]: timer()
Nov  6 12:51:01 controller neutron-metadata-agent[3094]:   File 
"/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 
59, in __call__

Nov  6 12:51:01 controller neutron-metadata-agent[3094]: cb(*args, **kw)
Nov  6 12:51:01 controller neutron-metadata-agent[3094]:   File 
"/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 
219, in main
Nov  6 12:51:01 controller neutron-metadata-agent[3094]: result = 
function(*args, **kwargs)
Nov  6 12:51:01 controller neutron-metadata-agent[3094]:   File 
"/usr/local/lib/python2.7/dist-packages/eventlet/wsgi.py", line 793, in 
process_request
Nov  6 12:51:01 controller neutron-metadata-agent[3094]: 
proto.__init__(conn_state, self)
Nov  6 12:51:01 controller neutron-metadata-agent[3094]: TypeError: 
__init__() takes exactly 4 arguments (3 given)


*Neither rebooting the controller, reinstalling neutron, or restarting 
the services will do anything top fix this.*


Has anyone else seen this? We are using Queens with a single controller.

Kind Regards

Terje Lundin






___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova]

2018-08-30 Thread Jay Pipes

On 08/30/2018 10:19 AM, Eugen Block wrote:

When does Nova apply its filters (Ram, CPU, etc.)?
Of course at instance creation and (live-)migration of existing 
instances. But what about existing instances that have been shutdown and 
in the meantime more instances on the same hypervisor have been launched?


When you start one of the pre-existing instances and even with RAM 
overcommitment you can end up with an OOM-Killer resulting in forceful 
shutdowns if you reach the limits. Is there something I've been missing 
or maybe a bad configuration of my scheduler filters? Or is it the 
admin's task to keep an eye on the load?


I'd appreciate any insights or pointers to something I've missed.


You need to set your ram_allocation_ratio nova.CONF option to 1.0 if 
you're running into OOM issues. This will prevent overcommit of memory 
on your compute nodes.
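
A minimal nova.conf sketch of that setting, assuming it lives in the [DEFAULT] 
section on each compute node (and that nova-compute is restarted afterwards):

    [DEFAULT]
    ram_allocation_ratio = 1.0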


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] The problem of how to update resource allocation ratio dynamically.

2018-08-31 Thread Jay Pipes

On 08/23/2018 11:01 PM, 余婷婷 wrote:

Hi:
    Sorry for bothering everyone. I have now upgraded my OpenStack to Queens and 
use the nova-placement-api to provide resources.
   When I use "/resource_providers/{uuid}/inventories/MEMORY_MB" to 
update the MEMORY_MB allocation_ratio, it succeeds. But after a few minutes 
it reverts to the old value automatically. I then found that nova-compute 
automatically reports the value from compute_node, and the allocation_ratio 
of compute_node comes from nova.conf. Does that mean we can't update the 
allocation_ratio without updating nova.conf? I would like to update the 
allocation_ratio dynamically rather than by editing nova.conf, but I don't 
know how to do that.


We are attempting to determine what is going on with the allocation 
ratios being improperly set on the following bug:


https://bugs.launchpad.net/nova/+bug/1789654

Please bear with us as we try to fix it.

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] [nova] Nova-scheduler: when are filters applied?

2018-09-03 Thread Jay Pipes

On 09/03/2018 07:27 AM, Eugen Block wrote:

Hi,

To echo what cfriesen said, if you set your allocation ratio to 1.0, 
the system will not overcommit memory. Shut down instances consume 
memory from an inventory management perspective. If you don't want any 
danger of an instance causing an OOM, you must set your 
ram_allocation_ratio to 1.0.


let's forget about the scheduler, I'll try to make my question a bit 
clearer.


Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24 GB 
of RAM available, ignoring the OS for a moment. Now I launch 6 
instances, each with a flavor requesting 4 GB of RAM, that would leave 
no space for further instances, right?
Then I shutdown two instances (freeing 8 GB RAM) and create a new one 
with 8 GB of RAM, the compute node is full again (assuming all instances 
actually consume all of their RAM).
Now I boot one of the shutdown instances again, the compute node would 
require additional 4 GB of RAM for that instance, and this would lead to 
OOM, isn't that correct? So a ratio of 1.0 would not prevent that from 
happening, would it?


I'm not entirely sure what you mean by "shut down an instance". Perhaps 
this is what is leading to confusion. I consider "shutting down an 
instance" to be stopping or suspending an instance.


As I mentioned below, shutdown instances consume memory from an 
inventory management perspective. If you stop or suspend an instance on 
your host, that instance is still consuming the same amount of memory in 
the placement service. You will *not* be able to launch a new instance 
on that same compute host *unless* your allocation ratio is >1.0.


Now, if by "shut down an instance", you actually mean "terminate an 
instance" or possibly "shelve and then offload an instance", then that 
is a different thing, and in both of *those* cases, resources are 
released on the compute host.


Best,
-jay


Zitat von Jay Pipes :


On 08/30/2018 10:54 AM, Eugen Block wrote:

Hi Jay,

You need to set your ram_allocation_ratio nova.CONF option to 1.0 if 
you're running into OOM issues. This will prevent overcommit of 
memory on your compute nodes.


I understand that, the overcommitment works quite well most of the time.

It just has been an issue twice when I booted an instance that had 
been shutdown a while ago. In the meantime there were new instances 
created on that hypervisor, and this old instance caused the OOM.


I would expect that with a ratio of 1.0 I would experience the same 
issue, wouldn't I? As far as I understand the scheduler only checks 
at instance creation, not when booting existing instances. Is that a 
correct assumption?


To echo what cfriesen said, if you set your allocation ratio to 1.0, 
the system will not overcommit memory. Shut down instances consume 
memory from an inventory management perspective. If you don't want any 
danger of an instance causing an OOM, you must set your 
ram_allocation_ratio to 1.0.


The scheduler doesn't really have anything to do with this.

Best,
-jay

___
Mailing list: 
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Post to : openstack@lists.openstack.org
Unsubscribe : 
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] unexpected distribution of compute instances in queens

2018-12-03 Thread Jay Pipes

On 11/30/2018 05:52 PM, Mike Carden wrote:


Have you set the placement_randomize_allocation_candidates CONF option
and are still seeing the packing behaviour?


No I haven't. Where would be the place to do that? In a nova.conf 
somewhere that the nova-scheduler containers on the controller hosts 
could pick it up?


Just about to deploy for realz with about forty x86 compute nodes, so it 
would be really nice to sort this first. :)


Presuming you are deploying Rocky or Queens,

It goes in the nova.conf file under the [placement] section:

randomize_allocation_candidates = true

The nova.conf file should be the one used by nova-scheduler.
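
In other words, roughly the following; exactly which copy of nova.conf the 
scheduler reads depends on how your deployment tool lays out the nova-scheduler 
service, so treat the location as a placeholder:

    [placement]
    randomize_allocation_candidates = true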

Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Re: [Openstack] unexpected distribution of compute instances in queens

2018-11-30 Thread Jay Pipes

On 11/30/2018 02:53 AM, Mike Carden wrote:

I'm seeing a similar issue in Queens deployed via tripleo.

Two x86 compute nodes and one ppc64le node and host aggregates for 
virtual instances and baremetal (x86) instances. Baremetal on x86 is 
working fine.


All VMs get deployed to compute-0. I can live migrate VMs to compute-1 
and all is well, but I tire of being the 'meatspace scheduler'.


LOL, I love that term and will have to remember to use it in the future.

I've looked at the nova.conf in the various nova-xxx containers on the 
controllers, but I have failed to discern the root of this issue.


Have you set the placement_randomize_allocation_candidates CONF option 
and are still seeing the packing behaviour?


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Re: [Openstack] unexpected distribution of compute instances in queens

2018-11-28 Thread Jay Pipes

On 11/28/2018 02:50 AM, Zufar Dhiyaulhaq wrote:

Hi,

Thank you. I was able to fix this issue by adding this configuration to the 
nova configuration file on the controller node.


driver=filter_scheduler


That's the default:

https://docs.openstack.org/ocata/config-reference/compute/config-options.html

So that was definitely not the solution to your problem.

My guess is that Sean's suggestion to randomize the allocation 
candidates fixed your issue.


Best,
-jay

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Re: [Openstack] [nova][cinder] Migrate instances between regions or between clusters?

2018-09-17 Thread Jay Pipes

On 09/17/2018 09:39 AM, Peter Penchev wrote:

Hi,

So here's a possibly stupid question - or rather, a series of such :)
Let's say a company has two (or five, or a hundred) datacenters in
geographically different locations and wants to deploy OpenStack in both.
What would be a deployment scenario that would allow relatively easy
migration (cold, not live) of instances from one datacenter to another?

My understanding is that for servers located far away from one another
regions would be a better metaphor than availability zones, if only
because it would be faster for the various storage, compute, etc.
services to communicate with each other for the common case of doing
actions within the same datacenter.  Is this understanding wrong - is it
considered all right for groups of servers located in far away places to
be treated as different availability zones in the same cluster?

If the groups of servers are put in different regions, though, this
brings me to the real question: how can an instance be migrated across
regions?  Note that the instance will almost certainly have some
shared-storage volume attached, and assume (not quite the common case,
but still) that the underlying shared storage technology can be taught
about another storage cluster in another location and can transfer
volumes and snapshots to remote clusters.  From what I've found, there
are three basic ways:

- do it pretty much by hand: create snapshots of the volumes used in
   the underlying storage system, transfer them to the other storage
   cluster, then tell the Cinder volume driver to manage them, and spawn
   an instance with the newly-managed newly-transferred volumes


Yes, this is a perfectly reasonable solution. In fact, when I was at 
AT, this was basically how we allowed tenants to spin up instances in 
multiple regions: snapshot the instance, it gets stored in the Swift 
storage for the region, tenant starts the instance in a different 
region, and Nova pulls the image from the Swift storage in the other 
region. It's slow the first time it's launched in the new region, of 
course, since the bits need to be pulled from the other region's Swift 
storage, but after that, local image caching speeds things up quite a bit.


This isn't migration, though. Namely, the tenant doesn't keep their 
instance ID, their instance's IP addresses, or anything like that.


I've heard some users care about that stuff, unfortunately, which is why 
we have shelve [offload]. There's absolutely no way to perform a 
cross-region migration that keeps the instance ID and instance IP addresses.
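
For what it's worth, that snapshot-and-relaunch flow looks roughly like the 
following with the openstack CLI. This is only a sketch: the region names, 
image/file names, flavor and network are placeholders, and it assumes an 
image-backed (ephemeral) instance, since, as noted further down in the thread, 
a volume-backed instance produces an empty image from this command.

    # Source region: snapshot the instance and download the resulting image
    openstack --os-region-name RegionOne server image create --name web1-snap web1
    openstack --os-region-name RegionOne image save --file web1-snap.qcow2 web1-snap

    # Destination region: upload the image and boot a new instance from it
    openstack --os-region-name RegionTwo image create --disk-format qcow2 \
        --container-format bare --file web1-snap.qcow2 web1-snap
    openstack --os-region-name RegionTwo server create --flavor m1.small \
        --image web1-snap --network private web1-copy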



- use Cinder to backup the volumes from one region, then restore them to
   the other; if this is combined with a storage-specific Cinder backup
   driver that knows that "backing up" is "creating a snapshot" and
   "restoring to the other region" is "transferring that snapshot to the
   remote storage cluster", it seems to be the easiest way forward (once
   the Cinder backup driver has been written)


Still won't have the same instance ID and IP address, which is what 
certain users tend to complain about needing with move operations.



- use Nova's "server image create" command, transfer the resulting
   Glance image somehow (possibly by downloading it from the Glance
   storage in one region and simulateneously uploading it to the Glance
   instance in the other), then spawn an instance off that image


Still won't have the same instance ID and IP address :)

Best,
-jay


The "server image create" approach seems to be the simplest one,
although it is a bit hard to imagine how it would work without
transferring data unnecessarily (the online articles I've seen
advocating it seem to imply that a Nova instance in a region cannot be
spawned off a Glance image in another region, so there will need to be
at least one set of "download the image and upload it to the other
side", even if the volume-to-image and image-to-volume transfers are
instantaneous, e.g. using glance-cinderclient).  However, when I tried
it with a Nova instance backed by a StorPool volume (no ephemeral image
at all), the Glance image was zero bytes in length and only its metadata
contained some information about a volume snapshot created at that
point, so this seems once again to go back to options 1 and 2 for the
different ways to transfer a Cinder volume or snapshot to the other
region.  Or have I missed something, is there a way to get the "server
image create / image download / image create" route to handle volumes
attached to the instance?

So... have I missed something else, too, or are these the options for
transferring a Nova instance between two distant locations?

Thanks for reading this far, and thanks in advance for your help!

Best regards,
Peter



___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : 

Re: [openstack-dev] [Neutron][policy] Group Based Policy - Renaming

2014-08-08 Thread Jay Pipes

On 08/07/2014 01:17 PM, Ronak Shah wrote:

Hi,
Following a very interesting and vocal thread on GBP for last couple of
days and the GBP meeting today, GBP sub-team proposes following name
changes to the resource.


policy-point for endpoint
policy-group for endpointgroup (epg)

Please reply if you feel that it is not ok with reason and suggestion.


Thanks Ronak and Sumit for sharing. I, too, wasn't able to attend the 
meeting (was in other meetings yesterday and today).


I'm very happy with the change from endpoint-group - policy-group.

policy-point is better than endpoint, for sure. The only other 
suggestion I might have would be to use policy-target instead of 
policy-point, since the former clearly delineates what the object is 
used for (a target for a policy).


But... I won't raise a stink about this. Sorry for sparking long and 
tangential discussions on GBP topics earlier this week. And thanks to 
the folks who persevered and didn't take too much offense to my questioning.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: FW: [Neutron] Group Based Policy and the way forward

2014-08-08 Thread Jay Pipes

On 08/08/2014 08:55 AM, Kevin Benton wrote:

The existing constructs will not change.


A followup question on the above...

If GPB API is merged into Neutron, the next logical steps (from what I 
can tell) will be to add drivers that handle policy-based payloads/requests.


Some of these drivers, AFAICT, will *not* be deconstructing these policy 
requests into the low-level port, network, and subnet 
creation/attachment/detachment commands, but instead will be calling out 
as-is to hardware that speaks the higher-level abstraction API [1], not 
the lower-level port/subnet/network APIs. The low-level APIs would 
essentially be consumed entirely within the policy-based driver, which 
would effectively mean that the only way a system would be able to 
orchestrate networking in systems using these drivers would be via the 
high-level policy API.


Is that correct? Very sorry if I haven't explained clearly my 
question... this is a tough question to frame eloquently :(


Thanks,
-jay

[1] 
http://www.cisco.com/c/en/us/solutions/data-center-virtualization/application-centric-infrastructure/index.html



On Aug 8, 2014 9:49 AM, CARVER, PAUL pc2...@att.com
mailto:pc2...@att.com wrote:

Wuhongning [mailto:wuhongn...@huawei.com
mailto:wuhongn...@huawei.com] wrote:

 Does it make sense to move all advanced extension out of ML2, like
security
 group, qos...? Then we can just talk about advanced service
itself, without
 bothering basic neutron object (network/subnet/port)

A modular layer 3 (ML3) analogous to ML2 sounds like a good idea. I
still
think it's too late in the game to be shooting down all the work
that the
GBP team has put in unless there's a really clean and effective way of
running AND iterating on GBP in conjunction with Neutron without being
part of the Juno release. As far as I can tell they've worked really
hard to follow the process and accommodate input. They shouldn't have
to wait multiple more releases on a hypothetical refactoring of how
L3+ vs
L2 is structured.

But, just so I'm not making a horrible mistake, can someone reassure me
that GBP isn't removing the constructs of network/subnet/port from
Neutron?

I'm under the impression that GBP is adding a higher level abstraction
but that it's not ripping basic constructs like network/subnet/port out
of the existing API. If I'm wrong about that I'll have to change my
opinion. We need those fundamental networking constructs to be present
and accessible to users that want/need to deal with them. I'm viewing
GBP as just a higher level abstraction over the top.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: FW: [Neutron] Group Based Policy and the way forward

2014-08-08 Thread Jay Pipes

On 08/08/2014 12:29 PM, Sumit Naiksatam wrote:

Hi Jay, To extend Ivar's response here, the core resources and core
plugin configuration does not change with the addition of these
extensions. The mechanism to implement the GBP extensions is via a
service plugin. So even in a deployment where a GBP service plugin is
deployed with a driver which interfaces with a backend that perhaps
directly understands some of the GBP constructs, that system would
still need to have a core plugin configured that honors Neutron's core
resources. Hence my earlier comment that GBP extensions are
complementary to the existing core resources (in much the same way as
the existing extensions in Neutron).


OK, thanks Sumit. That clearly explains things for me.

Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Image upload/download bandwidth cap

2014-08-08 Thread Jay Pipes

On 08/08/2014 08:49 AM, Tomoki Sekiyama wrote:

Hi all,

I'm considering how I can apply image download/upload bandwidth limit for
glance for network QoS.

There was a review for the bandwidth limit, however it is abandoned.

* Download rate limiting
   https://review.openstack.org/#/c/21380/

Was there any discussion at a past summit about why this was not merged?
Or, is there alternative way to cap the bandwidth consumed by Glance?

I appreciate any information about this.


Hi Tomoki :)

Would it be possible to integrate traffic control into the network 
configuration between the Glance endpoints and the nova-compute nodes 
over the control plane network?


http://www.lartc.org/lartc.html#LARTC.RATELIMIT.SINGLE
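
As a crude sketch of what that could look like on the Glance API host, assuming 
the control-plane traffic leaves via eth1 and a 100 Mbit cap is acceptable (the 
interface name and rate are placeholders):

    # Cap egress on the control-plane interface with a token bucket filter
    tc qdisc add dev eth1 root tbf rate 100mbit burst 256kbit latency 400ms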

Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Is network ordering of vNICs guaranteed?

2014-08-09 Thread Jay Pipes
Paul, does this friend of a friend have a reproducible test script for 
this?


Thanks!
-jay

On 08/08/2014 04:42 PM, Kevin Benton wrote:

If this is true, I think the issue is not on Neutron side but the Nova
side.
Neutron just receives and handles individual port requests. It has no
notion of the order in which they are attached to the VM.

Can you add the Nova tag to get some visibility to the Nova devs?


On Fri, Aug 8, 2014 at 11:32 AM, CARVER, PAUL pc2...@att.com
mailto:pc2...@att.com wrote:

I’m hearing “friend of a friend” that people have looked at the code
and determined that the order of networks on a VM is not guaranteed.
Can anyone confirm whether this is true? If it is true, is there any
reason why this is not considered a bug? I’ve never seen it happen
myself.


To elaborate, I’m being told that if you create some VMs with
several vNICs on each and you want them to be, for example:

1) Management Network

2) Production Network

3) Storage Network

You can’t count on all the VMs having eth0 connected to the
management network, eth1 on the production network, eth2 on the
storage network.


I’m being told that they will come up like that most of the time,
but sometimes you will see, for example, a VM might wind up with
eth0 connected to the production network, eth1 to the storage
network, and eth2 connected to the storage network (or some other
permutation.)



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--
Kevin Benton


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack][InstanceGroup] metadetails and metadata in instance_group.py

2014-08-11 Thread Jay Pipes

On 08/10/2014 10:36 PM, Jay Lau wrote:

I was asking this because I got a -2 for
https://review.openstack.org/109505 , just want to know why this new
term metadetails was invented when we already have details,
metadata, system_metadata, instance_metadata, and properties (on
images and volumes).


As the person who -2'd the review, I'm thankful you raised this issue on 
the ML, Jay. Much appreciated.


Eagerly awaiting answers,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Is network ordering of vNICs guaranteed?

2014-08-11 Thread Jay Pipes

Thanks, Paul!

On 08/11/2014 10:10 AM, CARVER, PAUL wrote:

Armando M. [mailto:arma...@gmail.com] wrote:

On 9 August 2014 10:16, Jay Pipes jaypi...@gmail.com wrote:

Paul, does this friend of a friend have a reproducible test
script for this?

We would also need to know the OpenStack release where this issue manifests
itself. A number of bugs have been raised in the past around this type of
issue, and the last fix I recall is this one:

https://bugs.launchpad.net/nova/+bug/1300325

It's possible that this might have regressed, though.

The reason I called it friend of a friend is because I think the info
has filtered through a series of people and is not firsthand observation.
I'll ask them to track back to who actually observed the behavior, how
long ago, and with what version.

It could be a regression, or it could just be old info that people have
continued to assume is true without realizing it was considered a bug
all along and has been fixed.

Thanks! The moment I first heard it my first reaction was that it was
almost certainly a bug and had probably already been fixed.



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack][InstanceGroup] metadetails and metadata in instance_group.py

2014-08-11 Thread Jay Pipes

On 08/11/2014 11:06 AM, Dan Smith wrote:

As the person who -2'd the review, I'm thankful you raised this issue on
the ML, Jay. Much appreciated.


The metadetails term isn't being invented in this patch, of course. I
originally complained about the difference when this was being added:

https://review.openstack.org/#/c/109505/1/nova/api/openstack/compute/contrib/server_groups.py,cm

As best I can tell, the response in that patch set about why it's being
translated is wrong (backwards). I expect that the API extension at the
time called it metadetails and they decided to make the object the
same and do the translation there.

 From what I can tell, the actual server_group API extension that made it
into the tree never got the ability to set/change/etc the
metadata/metadetails anyway, so there's no reason (AFAICT) to add it in
wrongly.

If we care to have this functionality, then I propose we change the
attribute on the object (we can handle this with versioning) and reflect
it as metadata in the API.

However, I have to ask: do we really need another distinct metadata
store attached to server_groups?


No.

 If not, how about we just remove it

from the database and the object, clean up the bit of residue that is
still in the API extension and be done with it?


+1

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation

2014-08-11 Thread Jay Pipes

Hi Li, comments inline.

On 08/08/2014 12:03 AM, Li Ma wrote:

Getting a massive amount of information from data storage to be displayed is
where most of the activity happens in OpenStack. The two activities of reading
data and writing (creating, updating and deleting) data are fundamentally
different.

The optimization for these two opposite database activities can be done by
physically separating the databases that service these two different
activities. All the writes go to database servers, which then replicates the
written data to the database server(s) dedicated to servicing the reads.

Currently, AFAIK, many OpenStack deployment in production try to take
advantage of MySQL (includes Percona or MariaDB) multi-master Galera cluster.
It is possible to design and implement a read/write separation schema
for such a DB cluster.


The above does not really make sense for MySQL Galera/PXC clusters *if 
only Galera nodes are used in the cluster*. Since Galera is 
synchronously replicated, there's no real point in segregating writers 
from readers, IMO. Better to just spread the write AND read load equally 
among all Galera cluster nodes.


However, if you have a Galera cluster that then slaves off to one or 
more standard MySQL slaves, then certainly doing writer/reader 
segregation could be useful, especially for directing readers of 
aggregate or report-type data to the read-only slaves.
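
For reference, the existing hook Li mentions below boils down to two connection 
strings in the [database] section; a sketch with placeholder hosts and 
credentials:

    [database]
    connection = mysql://nova:secret@galera-vip/nova
    slave_connection = mysql://nova:secret@readonly-replica/nova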



Actually, OpenStack has a method for read scalability via defining
master_connection and slave_connection in configuration, but this method
lacks flexibility because the decision between master and slave is made in the
logical context (code). It's not transparent to the application developer.
As a result, it is not widely used in all the OpenStack projects.

So, I'd like to propose a transparent read/write separation method
for oslo.db that every project may happily takes advantage of it
without any code modification.


I've never seen a writer/reader segregation proxy or middleware piece 
that was properly able to send the right reads to the slaves. 
Unfortunately, determining what are the right reads to send to the 
slaves is highly application-dependent, since the application knows when 
it can tolerate slave lags.



Moreover, I'd like to put it in the mailing list in advance to
make sure it is acceptable for oslo.db.


I think oslo.db is not the right place for this. I believe the efforts 
that Mike Wilson has been doing in the slavification blueprints are 
the more appropriate place to add this slave-aware code.


Best,
-jay


I'd appreciate any comments.

br.
Li Ma


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack][InstanceGroup] metadetails and metadata in instance_group.py

2014-08-11 Thread Jay Pipes

On 08/11/2014 05:58 PM, Jay Lau wrote:

I think the metadata in server group is an important feature and it
might be used by
https://blueprints.launchpad.net/nova/+spec/soft-affinity-for-server-group

Actually, we are now doing an internal development for above bp and want
to contribute this back to community later. We are now setting hard/soft
flags in server group metadata to identify if the server group want
hard/soft affinity.

I prefer Dan's first suggestion, what do you think?
=
If we care to have this functionality, then I propose we change the
attribute on the object (we can handle this with versioning) and reflect
it as metadata in the API.
=


-1

If hard and soft is something that really needs to be supported, then 
this should be a field in the instance_groups table, not some JSON blob 
in a random metadata field.


Better yet, get rid of the instance_groups table altogether and have 
near, not-near, hard, and soft be launch modifiers similar to 
the instance type. IMO, there's really no need to store a named group at 
all, but that goes back to my original ML post about the server groups 
topic:


https://www.mail-archive.com/openstack-dev@lists.openstack.org/msg23055.html

Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Retrospective veto revert policy

2014-08-12 Thread Jay Pipes

On 08/12/2014 10:56 AM, Mark McLoughlin wrote:

Hey

(Terrible name for a policy, I know)

 From the version_cap saga here:

   https://review.openstack.org/110754

I think we need a better understanding of how to approach situations
like this.

Here's my attempt at documenting what I think we're expecting the
procedure to be:

   https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy

If it sounds reasonably sane, I can propose its addition to the
Development policies doc.


Eminently reasonable. +1

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] 9 days until feature proposal freeze

2014-08-12 Thread Jay Pipes

On 08/12/2014 04:13 AM, Michael Still wrote:

Hi,

this is just a friendly reminder that we are now 9 days away from
feature proposal freeze for nova. If you think your blueprint isn't
going to make it in time, then now would be a good time to let me know
so that we can defer it until Kilo. That will free up reviewer time
for other blueprints.

Some people have more than one blueprint still under development...
Perhaps they could defer some of those to Kilo?


I removed 
https://blueprints.launchpad.net/nova/+spec/allocation-ratio-to-resource-tracker 
from the Juno cycle, and noted reasons why in the whiteboard (ongoing 
discussions around scheduler separation and the scope of the resource 
tracker in regards to claim processing).


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread Jay Pipes

On 08/13/2014 06:35 PM, Russell Bryant wrote:

On 08/13/2014 06:23 PM, Mark McLoughlin wrote:

On Wed, 2014-08-13 at 12:05 -0700, James E. Blair wrote:

cor...@inaugust.com (James E. Blair) writes:


Sean Dague s...@dague.net writes:


This has all gone far enough that someone actually wrote a Grease Monkey
script to purge all the 3rd Party CI content out of Jenkins UI. People
are writing mail filters to dump all the notifications. Dan Berange
filters all them out of his gerrit query tools.


I should also mention that there is a pending change to do something
similar via site-local Javascript in our Gerrit:

   https://review.openstack.org/#/c/95743/

I don't think it's an ideal long-term solution, but if it works, we may
have some immediate relief without all having to install greasemonkey
scripts.


You may have noticed that this has merged, along with a further change
that shows the latest results in a table format.  (You may need to
force-reload in your browser to see the change.)


Beautiful! Thank you so much to everyone involved.


+1!  Love this.


Indeed. Amazeballs.

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Is there a way to let nova schedule plugin fetch glance image metadata

2014-08-13 Thread Jay Pipes

On 08/13/2014 08:31 PM, zhiwei wrote:

Hi all,

We wrote a nova schedule plugin that need to fetch image metadata by
image_id, but encountered one thing, we did not have the glance context.

Our solution is to configure OpenStack admin user and password to
nova.conf, as you know this is not good.

So, I want to ask if there are any other ways to do this?


You should not have to do a separate fetch of image metadata in a 
scheduler filter (which is what I believe you meant by plugin above?).


The filter object's host_passes() method has a filter_properties 
parameter that contains the request_spec, that in turn contains the 
image, which in turn contains the image metadata. You can access it 
like so:


    def host_passes(self, host_state, filter_properties):
        request_spec = filter_properties['request_spec']
        image_info = request_spec['image']
        # Certain image attributes are accessed via top-level keys, like
        # size, disk_format, container_format and checksum
        image_size = image_info['size']
        # Other attributes can be accessed in the properties collection
        # of key/value pairs
        image_props = image_info.get('properties', {})
        for key, value in image_props.items():
            # do something...
            pass
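
For completeness, a custom filter along these lines would then be wired into 
nova.conf. This is only a sketch: the module path 
myfilters.image_meta.ImageMetadataFilter is a made-up placeholder, and the 
default filter list varies by release.

    [DEFAULT]
    scheduler_available_filters = nova.scheduler.filters.all_filters
    scheduler_available_filters = myfilters.image_meta.ImageMetadataFilter
    scheduler_default_filters = RetryFilter,ComputeFilter,ImageMetadataFilter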

Best,
-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Is there a way to let nova schedule plugin fetch glance image metadata

2014-08-13 Thread Jay Pipes

On 08/13/2014 10:22 PM, zhiwei wrote:

Thanks Jay.

The scheduler plugin is not a scheduler filter.

We implemented a scheduler instead of using nova native scheduler.


OK. Any reason why you did this? Without any details on what your 
scheduler does, it's tough to give advice on how to solve your problems.



One of our scheduler components needs to fetch image metadata by image_id
(at this point, there is no instance yet).


Why? Again, the request_spec contains all the information you need about 
the image...


Best,
-jay


On Thu, Aug 14, 2014 at 9:29 AM, Jay Pipes jaypi...@gmail.com
mailto:jaypi...@gmail.com wrote:

On 08/13/2014 08:31 PM, zhiwei wrote:

Hi all,

We wrote a nova schedule plugin that need to fetch image metadata by
image_id, but encountered one thing, we did not have the glance
context.

Our solution is to configure OpenStack admin user and password to
nova.conf, as you know this is not good.

So, I want to ask if there are any other ways to do this?


You should not have to do a separate fetch of image metadata in a
scheduler filter (which is what I believe you meant by plugin above?).

The filter object's host_passes() method has a filter_properties
parameter that contains the request_spec, that in turn contains the
image, which in turn contains the image metadata. You can access
it like so:

    def host_passes(self, host_state, filter_properties):
        request_spec = filter_properties['request_spec']
        image_info = request_spec['image']
        # Certain image attributes are accessed via top-level keys, like
        # size, disk_format, container_format and checksum
        image_size = image_info['size']
        # Other attributes can be accessed in the properties collection
        # of key/value pairs
        image_props = image_info.get('properties', {})
        for key, value in image_props.items():
            # do something...
            pass

Best,
-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Jay Pipes

On 08/12/2014 06:57 PM, Michael Still wrote:

Hi.

One of the action items from the nova midcycle was that I was asked to
make nova's expectations of core reviews more clear. This email is an
attempt at that.

Nova expects a minimum level of sustained code reviews from cores. In
the past this has been generally held to be in the order of two code
reviews a day, which is a pretty low bar compared to the review
workload of many cores. I feel that existing cores understand this
requirement well, and I am mostly stating it here for completeness.

Additionally, there is increasing levels of concern that cores need to
be on the same page about the criteria we hold code to, as well as the
overall direction of nova. While the weekly meetings help here, it was
agreed that summit attendance is really important to cores. Its the
way we decide where we're going for the next cycle, as well as a
chance to make sure that people are all pulling in the same direction
and trust each other.

There is also a strong preference for midcycle meetup attendance,
although I understand that can sometimes be hard to arrange. My stance
is that I'd like core's to try to attend, but understand that
sometimes people will miss one. In response to the increasing
importance of midcycles over time, I commit to trying to get the dates
for these events announced further in advance.

Given that we consider these physical events so important, I'd like
people to let me know if they have travel funding issues. I can then
approach the Foundation about funding travel if that is required.


Just wanted to quickly weigh in with my thoughts on this important 
topic. I very much valued the face-to-face interaction that came from 
the mid-cycle meetup in Beaverton (it was the only one I've ever been to).


That said, I do not believe it should be a requirement that cores make 
it to the face-to-face meetings in-person. A number of folks have 
brought up very valid concerns about personal/family time, travel costs 
and burnout.


I believe that the issue raised about furthering the divide between core 
and non-core folks is actually the biggest reason I don't support a 
mandate to have cores at the face-to-face meetings, and I think we 
should make our best efforts to support quality virtual meetings that 
can be done on a more frequent basis than the face-to-face meetings that 
would be optional.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Bug: resize on same node with allow_resize_to_same_host=True

2014-08-13 Thread Jay Pipes

On 08/13/2014 11:50 PM, Manickam, Kanagaraj wrote:

Hi,

Nova provides a flag ‘allow_resize_to_same_host’ to resize the given
instance on the same hypervisor where it is currently residing. When
this flag is set to True, the nova.compute.api: resize() method does not
set the scheduler hint with ‘force_nodes’, where as its set the
‘ignored_hosts’ properly when this flag is set to False.

So I have filed following defect to fix the logic when this flag is set
to True.

https://bugs.launchpad.net/nova/+bug/1356309

I felt this defect is import to fix, when cloud admin wants the resize
to be happen on the same hypervisor (compute node). So could you please
let me know whether I can fix this for Juno-3? Thanks.


Hi Kanagaraj,

AFAICT, there is no bug here.

if not CONF.allow_resize_to_same_host:
    filter_properties['ignore_hosts'].append(instance['host'])
else:
    filter_properties['force_nodes'] = [instance['node']]

When allow_resize_to_same_host is True, then 
filter_properties['force_nodes'] will be set to a list with only one node 
(the compute node that the instance is currently on), and therefore the 
resize will happen on the original host.
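
To make that concrete, here is a rough, standalone sketch of that branch 
-- a hypothetical helper for illustration, not the actual 
nova.compute.api code path:

# Simplified, illustrative sketch of the branch above; the real logic
# lives in nova.compute.api during resize.
def build_resize_filter_properties(allow_resize_to_same_host, instance):
    filter_properties = {'ignore_hosts': []}
    if not allow_resize_to_same_host:
        # Exclude the host the instance currently runs on.
        filter_properties['ignore_hosts'].append(instance['host'])
    else:
        # Pin the scheduler to the instance's current node.
        filter_properties['force_nodes'] = [instance['node']]
    return filter_properties

# With the flag enabled, only the current node remains eligible:
props = build_resize_filter_properties(
    True, {'host': 'compute-1', 'node': 'compute-1.example.com'})
assert props['force_nodes'] == ['compute-1.example.com']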


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Is there a way to let nova schedule plugin fetch glance image metadata

2014-08-13 Thread Jay Pipes

On 08/13/2014 11:06 PM, zhiwei wrote:

Hi Jay.

The case is: When heat create a stack, it will first call our
scheduler(will pass image_id), our scheduler will get image metadata by
image_id.

Our scheduler will build a placement policy through image metadata, then
start booting VM.


How exactly is Heat calling your scheduler? The Nova scheduler does not 
have a public REST API, so I'm unsure how you are calling it.


More details needed, thanks! :)

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo][db] Nominating Mike Bayer for the oslo.db core reviewers team

2014-08-15 Thread Jay Pipes

On 08/15/2014 04:21 AM, Roman Podoliaka wrote:

Hi Oslo team,

I propose that we add Mike Bayer (zzzeek) to the oslo.db core reviewers team.

Mike is an author of SQLAlchemy, Alembic, Mako Templates and some
other stuff we use in OpenStack. Mike has been working on OpenStack
for a few months contributing a lot of good patches and code reviews
to oslo.db [1]. He has also been revising the db patterns in our
projects and prepared a plan how to solve some of the problems we have
[2].

I think, Mike would be a good addition to the team.


Uhm, yeah... +10 :)

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Picking a Name for the Tempest Library

2014-08-15 Thread Jay Pipes

On 08/15/2014 03:14 PM, Matthew Treinish wrote:

Hi Everyone,

So as part of splitting out common functionality from tempest into a library [1]
we need to create a new repository. Which means we have the fun task of coming
up with something to name it. I'm personally thought we should call it:

  - mesocyclone

Which has the advantage of being a cloud/weather thing, and the name sort of
fits because it's a precursor to a tornado. Also, it's an available namespace on
both launchpad and pypi. But there has been expressed concern that both it is a
bit on the long side (which might have 80 char line length implications) and
it's unclear from the name what it does.

During the last QA meeting some alternatives were also brought up:

  - tempest-lib / lib-tempest
  - tsepmet
  - blackstorm
  - calm
  - tempit
  - integration-test-lib

(although I'm not entirely sure I remember which ones were serious suggestions
or just jokes)

So as a first step I figured that I'd bring it up on the ML to see if anyone had
any other suggestions. (or maybe get a consensus around one choice) I'll take
the list, check if the namespaces are available, and make a survey so that
everyone can vote and hopefully we'll have a clear choice for a name from that.


I suggest that tempest should be the name of the import'able library, 
and that the integration tests themselves should be what is pulled out 
of the current Tempest repository, into their own repo called 
openstack-integration-tests or os-integration-tests.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Picking a Name for the Tempest Library

2014-08-16 Thread Jay Pipes

On 08/16/2014 12:27 PM, Marc Koderer wrote:

Hi all,

On 15.08.2014 at 23:31, Jay Pipes jaypi...@gmail.com wrote:


I suggest that tempest should be the name of the import'able library, and that the
integration tests themselves should be what is pulled out of the current Tempest
repository, into their own repo called openstack-integration-tests or
os-integration-tests.


why not keeping it simple:

tempest: importable test library
tempest-tests: all the test cases

Simple, obvious and clear ;)


++


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-17 Thread Jay Pipes

On 08/17/2014 05:11 AM, Stan Lagun wrote:


On Fri, Aug 15, 2014 at 7:17 PM, Sandy Walsh sandy.wa...@rackspace.com
mailto:sandy.wa...@rackspace.com wrote:

I recently suggested that the Ceilometer API (and integration tests)
be separated from the implementation (two repos) so others might
plug in a different implementation while maintaining compatibility,
but that wasn't well received.

Personally, I'd like to see that model extended for all OpenStack
projects. Keep compatible at the API level and welcome competing
implementations.


Brilliant idea I'd vote for


The problem is when the API is the worst part of the project.

We have a number of projects (some that I work on) that one of the 
weakest parts of the project is the design, inconsistency, and 
efficiency of the API constructs are simply terrible.


The last thing I would want to do is say here, everyone go build 
multiple implementations on top of this crappy API. :(


As for the idea of letting the market flush out competing 
implementations, I'm all for that ... with some caveats. A couple of 
those caveats would include:


 a) Must be Python if it is to be considered as a part of OpenStack's 
integrated release [1]
 b) The API must be simple, efficient, and consistent, possibly having 
signoff by some working group focused on API standards


All the best,
-jay

[1] This isn't saying other programming languages aren't perfectly 
fine*, just that our integration and CI systems are focused on Python, 
and non-Python projects are a non-starter at this point.


* except Java, of course. That goes without saying.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-19 Thread Jay Pipes
Caution: words below may cause discomfort. I ask that folks read *all* 
of my message before reacting to any piece of it. Thanks!


On 08/19/2014 02:41 AM, Robert Collins wrote:

On 18 August 2014 09:32, Clint Byrum cl...@fewbar.com wrote:

I can see your perspective but I don't think its internally consistent...


Here's why folk are questioning Ceilometer:

Nova is a set of tools to abstract virtualization implementations.


With a big chunk of local things - local image storage (now in
glance), scheduling, rebalancing, ACLs and quotas. Other
implementations that abstract over VM's at various layers already
existed when Nova started - some bad ( some very bad!) and others
actually quite ok.


Neutron is a set of tools to abstract SDN/NFV implementations.


And implements a DHCP service, DNS service, overlay networking : its
much more than an abstraction-over-other-implementations.


Cinder is a set of tools to abstract block-device implementations.
Trove is a set of tools to simplify consumption of existing databases.
Sahara is a set of tools to simplify Hadoop consumption.
Swift is a feature-complete implementation of object storage, none of
which existed when it was started.


Swift was started in 2009; Eucalyptus goes back to 2007, with Walrus
part of that - I haven't checked precise dates, but I'm pretty sure
that it existed and was usable by the start of 2009. There may well be
other object storage implementations too - I simply haven't checked.


Keystone supports all of the above, unifying their auth.


And implementing an IdP (which I know they want to stop doing ;)). And
in fact lots of OpenStack projects, for various reasons support *not*
using Keystone (something that bugs me, but thats a different
discussion).


Horizon supports all of the above, unifying their GUI.

Ceilometer is a complete implementation of data collection and alerting.
There is no shortage of implementations that exist already.

I'm also core on two projects that are getting some push back these
days:

Heat is a complete implementation of orchestration. There are at least a
few of these already in existence, though not as many as their are data
collection and alerting systems.

TripleO is an attempt to deploy OpenStack using tools that OpenStack
provides. There are already quite a few other tools that _can_ deploy
OpenStack, so it stands to reason that people will question why we
don't just use those. It is my hope we'll push more into the unifying
the implementations space and withdraw a bit from the implementing
stuff space.

So, you see, people are happy to unify around a single abstraction, but
not so much around a brand new implementation of things that already
exist.


If the other examples we had were a lot purer, this explanation would
make sense. I think there's more to it than that though :).

What exactly, I don't know, but its just too easy an answer, and one
that doesn't stand up to non-trivial examination :(.


I actually agree with Robert about this; that Clint may have 
oversimplified whether or not certain OpenStack projects may have 
reimplemented something that previously existed. Everything is a grey 
area, after all. I'm sure each project can go back in time and point to 
some existing piece of software -- good, bad or Java -- and truthfully 
say that there was prior art that could have been used.


The issue that I think needs to be addressed more directly in this 
thread and the ongoing conversation on the TC is this:


By graduating an incubated project into the integrated release, the 
Technical Committee is blessing the project as the OpenStack way to do 
some thing. If there are projects that are developed *in the OpenStack 
ecosystem* that are actively being developed to serve the purpose that 
an integrated project serves, then I think it is the responsibility of 
the Technical Committee to take another look at the integrated project 
and answer the following questions definitively:


 a) Is the Thing that the project addresses something that the 
Technical Committee believes the OpenStack ecosystem benefits from by 
the TC making a judgement on what is the OpenStack way of addressing 
that Thing.


and IFF the decision of the TC on a) is YES, then:

 b) Is the Vision and Implementation of the currently integrated 
project the one that the Technical Committee wishes to continue to 
bless as the the OpenStack way of addressing the Thing the project does.


If either of the above answers is NO, then I believe the Technical 
Committee should recommend that the integrated project be removed from 
the integrated release.


HOWEVER, I *also* believe that the previously-integrated project should 
not just be cast away back to Stackforge. I think the project should 
remain in its designated Program and should remain in the openstack/ 
code namespace. Furthermore, active, competing visions and 
implementations of projects that address the Thing the 
previously-integrated project addressed should be 

Re: [openstack-dev] [TripleO][Nova] Specs and approvals

2014-08-19 Thread Jay Pipes

On 08/19/2014 11:23 AM, Russell Bryant wrote:

On 08/19/2014 05:31 AM, Robert Collins wrote:

Hey everybody - https://wiki.openstack.org/wiki/TripleO/SpecReviews
seems pretty sane as we discussed at the last TripleO IRC meeting.

I'd like to propose that we adopt it with the following tweak:

19:46:34 lifeless so I propose that +2 on a spec is a commitment to
review it over-and-above the core review responsibilities
19:47:05 lifeless if its not important enough for a reviewer to do
that thats a pretty strong signal
19:47:06 dprince lifeless: +1, I thought we already agreed to that
at the meetup
19:47:17 slagle yea, sounds fine to me
19:47:20 bnemec +1
19:47:30 lifeless dprince: it wasn't clear whether it was
part-of-responsibility, or additive, I'm proposing we make it clearly
additive
19:47:52 lifeless and separately I think we need to make surfacing
reviews-for-themes a lot better

That is - +1 on a spec review is 'sure, I like it', +2 is specifically
I will review this *over and above* my core commitment - the goal
here is to have some very gentle choke on concurrent WIP without
needing the transition to a managed pull workflow that Nova are
discussing - which we didn't have much support for during the meeting.

Obviously, any core can -2 for any of the usual reasons - this motion
is about opening up +A to the whole Tripleo core team on specs.

Reviewers, and other interested kibbitzers, please +1 / -1 as you feel fit :)


+1

I really like this.  In fact, I like it a lot more than the current
proposal for Nova.  I think the Nova team should consider this, as well.

It still rate limits code reviews by making core reviewers explicitly
commit to reviewing things.  This is like our previous attempt at
sponsoring blueprints, but the use of gerrit I think would make it more
successful.

It also addresses my primary concerns with the tensions between group
will and small groups no longer being able to self organize and push
things to completion without having to haggle through yet another process.


+1

Me likee.
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron]Performance of security group

2014-08-20 Thread Jay Pipes

On 08/20/2014 07:34 AM, Miguel Angel Ajo Pelayo wrote:

I couldn't resist making a little benchmark test of the new RPC implementation
shihanzhang wrote:

http://www.ajo.es/post/95269040924/neutron-security-group-rules-for-devices-rpc-rewrite

The results are awesome :-)


Indeed, fantastic news. ++

-jay


We yet need to polish the tests a bit, and it's ready.

Best regards,
Miguel Ángel.

- Original Message -

On Thu, Jul 10, 2014 at 4:30 AM, shihanzhang ayshihanzh...@126.com wrote:


With the deployment 'nova + neutron + openvswitch', when we bulk create
about 500 VM with a default security group, the CPU usage of neutron-server
and openvswitch agent is very high, especially the CPU usage of openvswitch
agent will be 100%, this will cause creating VMs failed.

With the method discussed in mailist:

1) ipset optimization   (https://review.openstack.org/#/c/100761/)

3) sg rpc optimization (with fanout)
(https://review.openstack.org/#/c/104522/)

I have implement  these two scheme in my deployment,  when we again bulk
create about 500 VM with a default security group, the CPU usage of
openvswitch agent will reduce to 10%, even lower than 10%, so I think the
iprovement of these two options are very efficient.

Who can help us to review our spec?


This is great work! These are on my list of things to review in detail
soon, but given the Neutron sprint this week, I haven't had time yet.
I'll try to remedy that by the weekend.

Thanks!
Kyle


Best regards,
shihanzhang





At 2014-07-03 10:08:21, Ihar Hrachyshka ihrac...@redhat.com wrote:


Oh, so you have the enhancement implemented? Great! Any numbers that
shows how much we gain from that?

/Ihar

On 03/07/14 02:49, shihanzhang wrote:

Hi, Miguel Angel Ajo! Yes, the ipset implementation is ready, today
I will modify my spec, when the spec is approved, I will commit the
codes as soon as possilbe!





At 2014-07-02 10:12:34, Miguel Angel Ajo majop...@redhat.com
wrote:


Nice Shihanzhang,

Do you mean the ipset implementation is ready, or just the
spec?.


For the SG group refactor, I don't worry about who does it, or
who takes the credit, but I believe it's important we address
this bottleneck during Juno trying to match nova's scalability.

Best regards, Miguel Ángel.


On 07/02/2014 02:50 PM, shihanzhang wrote:

hi Miguel Ángel and Ihar Hrachyshka, I agree with you that
split  the work in several specs, I have finished the work (
ipset optimization), you can do 'sg rpc optimization (without
fanout)'. as the third part(sg rpc optimization (with fanout)),
I think we need talk about it, because just using ipset to
optimize security group agent codes does not bring the best
results!

Best regards, shihanzhang.








At 2014-07-02 04:43:24, Ihar Hrachyshka ihrac...@redhat.com
wrote:

On 02/07/14 10:12, Miguel Angel Ajo wrote:


Shihazhang,



I really believe we need the RPC refactor done for this cycle,
and given the close deadlines we have (July 10 for spec
submission and July 20 for spec approval).



Don't you think it's going to be better to split the work in
several specs?



1) ipset optimization   (you) 2) sg rpc optimization (without
fanout) (me) 3) sg rpc optimization (with fanout) (edouard, you
, me)




This way we increase the chances of having part of this for the
Juno cycle. If we go for something too complicated is going to
take more time for approval.



I agree. And it not only increases chances to get at least some of
those highly demanded performance enhancements to get into Juno,
it's also the right thing to do (c). It's counterproductive to
put multiple vaguely related enhancements in single spec. This
would dim review focus and put us into position of getting
'all-or-nothing'. We can't afford that.

Let's leave one spec per enhancement. @Shihazhang, what do you
think?



Also, I proposed the details of 2, trying to bring awareness
on the topic, as I have been working with the scale lab in Red
Hat to find and understand those issues, I have a very good
knowledge of the problem and I believe I could make a very fast
advance on the issue at the RPC level.



Given that, I'd like to work on this specific part, whether or
not we split the specs, as it's something we believe critical
for neutron scalability and thus, *nova parity*.



I will start a separate spec for 2, later on, if you find it
ok, we keep them as separate ones, if you believe having just 1
spec (for 1  2) is going be safer for juno-* approval, then we
can incorporate my spec in yours, but then
add-ipset-to-security is not a good spec title to put all this
together.




Best regards, Miguel Ángel.




On 07/02/2014 03:37 AM, shihanzhang wrote:


hi Miguel Angel Ajo Pelayo! I agree with you and modify my
spes, but I will also optimization the RPC from security group
agent to neutron server. Now the modle is
'port[rule1,rule2...], port...', I will change it to 'port[sg1,
sg2..]', this can reduce the size of RPC 

Re: [openstack-dev] [Nova] Scheduler split wrt Extensible Resource Tracking

2014-08-20 Thread Jay Pipes

On 08/20/2014 04:48 AM, Nikola Đipanov wrote:

On 08/20/2014 08:27 AM, Joe Gordon wrote:

On Aug 19, 2014 10:45 AM, Day, Phil philip@hp.com
mailto:philip@hp.com wrote:



-Original Message-
From: Nikola Đipanov [mailto:ndipa...@redhat.com

mailto:ndipa...@redhat.com]

Sent: 19 August 2014 17:50
To: openstack-dev@lists.openstack.org

mailto:openstack-dev@lists.openstack.org

Subject: Re: [openstack-dev] [Nova] Scheduler split wrt Extensible

Resource

Tracking

On 08/19/2014 06:39 PM, Sylvain Bauza wrote:

On the other hand, ERT discussion is decoupled from the scheduler
split discussion and will be delayed until Extensible Resource Tracker
owner (Paul Murray) is back from vacation.
In the mean time, we're considering new patches using ERT as
non-acceptable, at least until a decision is made about ERT.



Even though this was not officially agreed I think this is the least

we can do

under the circumstances.

A reminder that a revert proposal is up for review still, and I

consider it fair

game to approve, although it would be great if we could hear from

Paul first:


   https://review.openstack.org/115218


Given the general consensus seemed to be to wait some before deciding

what to do here, isn't putting the revert patch up for approval a tad
premature ?


There was a recent discussion about reverting patches, and from that
(but not only) my understanding is that we should revert whenever in doubt.


Right.

http://lists.openstack.org/pipermail/openstack-dev/2014-August/042728.html


Putting the patch back in is easy, and if proven wrong I'd be the first
to +2 it. As scary as they sound - I don't think reverts are a big deal.


Neither do I. I think it's more appropriate to revert quickly and then 
add it back after any discussions, per the above revert policy.




The RT may be not able to cope with all of the new and more complex

resource types we're now trying to schedule, and so it's not surprising
that the ERT can't fix that.  It does however address some specific use
cases that the current RT can't cope with,  the spec had a pretty
through review under the new process, and was discussed during the last
2 design summits.   It worries me that we're continually failing to make
even small and useful progress in this area.


Sylvain's approach of leaving the ERT in place so it can be used for

the use cases it was designed for while holding back on doing some of
the more complex things than might need either further work in the ERT,
or some more fundamental work in the RT (which feels like as L or M
timescales based on current progress) seemed pretty pragmatic to me.

++, I really don't like the idea of rushing the revert of a feature that
went through significant design discussion especially when the author is
away and cannot defend it.


Fair enough - I will WIP the revert until Phil is back. It's the right
thing to do seeing that he is away.


Well, it's as much (or more?) Paul Murray and Andrea Rosa :)


However - I don't agree with using the length of discussion around the
feature as a valid argument against reverting.


Neither do I.


I've supplied several technical arguments on the original thread to why
I think we should revert it, and would expect a discussion that either
refutes them, or provides alternative ways forward.

Saying 'but we talked about it at length' is the ultimate appeal to
imaginary authority and frankly not helping at all.


Agreed. Perhaps it's just my provocative nature, but I hear a lot of 
we've already decided/discussed this talk especially around the 
scheduler and RT stuff, and I don't think the argument holds much water. 
We should all be willing to reconsider design decisions and discussions 
when appropriate, and in the case of the RT, this discussion is timely 
and appropriate due to the push to split the scheduler out of Nova 
(prematurely IMO).


Best,
-jay


N.




Phil

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-20 Thread Jay Pipes

Hi Thierry, thanks for the reply. Comments inline. :)

On 08/20/2014 06:32 AM, Thierry Carrez wrote:

Jay Pipes wrote:

[...] If either of the above answers is NO, then I believe the
Technical Committee should recommend that the integrated project be
removed from the integrated release.

HOWEVER, I *also* believe that the previously-integrated project
should not just be cast away back to Stackforge. I think the
project should remain in its designated Program and should remain
in the openstack/ code namespace. Furthermore, active, competing
visions and implementations of projects that address the Thing the
previously-integrated project addressed should be able to apply to
join the same Program, and *also* live in the openstack/
namespace.

All of these projects should be able to live in the Program, in
the openstack/ code namespace, for as long as the project is
actively developed, and let the contributor communities in these
competing projects *naturally* work to do any of the following:

* Pick a best-of-breed implementation from the projects that
address the same Thing * Combine code and efforts to merge the good
bits of multiple projects into one * Let multiple valid choices of
implementation live in the same Program with none of them being
blessed by the TC to be part of the integrated release


That would work if an OpenStack Program was just like a category
under which you can file projects. However, OpenStack programs are
not a competition category where we could let multiple competing
implementations fight it out for becoming the solution; they are
essentially just a team of people working toward a common goal,
having meetings and sharing/electing the same technical lead.

I'm not convinced you would set competing solutions for a fair
competition by growing them inside the same team (and under the same
PTL!) as the current mainstream/blessed option. How likely is the
Orchestration PTL to make the decision to drop Heat in favor of a
new contender ?


I don't believe the Programs are needed, as they are currently
structured. I don't really believe they serve any good purposes, and
actually serve to solidify positions of power, slanted towards existing
power centers, which is antithetical to a meritocratic community.

Furthermore, the structures we've built into the OpenStack community
governance has resulted in perverse incentives. There is this constant
struggle to be legitimized by being included in a Program, incubated,
and then included in the integrated release. Projects, IMO, should be
free to innovate in *any* area of OpenStack, including areas with
existing integrated projects. We should be more open, not less.


I'm also concerned with making a program a collection of competing
teams, rather than a single team sharing the same meetings and
electing the same leadership, working all together. I don't want the
teams competing to get a number of contributors that would let them
game the elections and take over the program leadership. I think such
a setup would just increase the political tension inside programs,
and we have enough of it already.


By prohibiting competition within a Program, you don't magically get rid
of the competition, though. :) The competition will continue to exist,
and divisions will continue to be increased among the people working on
the same general area. You can't force people to get in-line with a
project whose vision or architectural design principles they don't share.


If we want to follow your model, we probably would have to dissolve
programs as they stand right now, and have blessed categories on one
side, and teams on the other (with projects from some teams being
blessed as the current solution).


Why do we have to have blessed categories at all? I'd like to think of
a day when the TC isn't picking winners or losers at all. Level the
playing field and let the quality of the projects themselves determine
the winner in the space. Stop the incubation and graduation madness and 
change the role of the TC to instead play an advisory role to upcoming 
(and existing!) projects on the best ways to integrate with other 
OpenStack projects, if integration is something that is natural for the 
project to work towards.



That would leave the horizontal programs like Docs, QA or Infra,
where the team and the category are the same thing, as outliers again
(like they were before we did programs).


What is the purpose of having these programs, though? If it's just to 
have a PTL, then I think we need to reconsider the whole concept of 
Programs. We should not be putting in place structures that just serve 
to create centers of power. *Projects* will naturally find/elect/choose 
not to have one or more technical leads. Why should we limit entire 
categories of projects to having a single Lead person? What purpose does 
the role fill that could not be filled in a looser, more natural 
fashion? Since the TC is no longer composed of each integrated project 
PTL along

Re: [openstack-dev] [all] The future of the integrated release

2014-08-20 Thread Jay Pipes

On 08/20/2014 11:41 AM, Zane Bitter wrote:

On 19/08/14 10:37, Jay Pipes wrote:


By graduating an incubated project into the integrated release, the
Technical Committee is blessing the project as the OpenStack way to do
some thing. If there are projects that are developed *in the OpenStack
ecosystem* that are actively being developed to serve the purpose that
an integrated project serves, then I think it is the responsibility of
the Technical Committee to take another look at the integrated project
and answer the following questions definitively:

  a) Is the Thing that the project addresses something that the
Technical Committee believes the OpenStack ecosystem benefits from by
the TC making a judgement on what is the OpenStack way of addressing
that Thing.

and IFF the decision of the TC on a) is YES, then:

  b) Is the Vision and Implementation of the currently integrated
project the one that the Technical Committee wishes to continue to
bless as the the OpenStack way of addressing the Thing the project
does.


I disagree with part (b); projects are not code - projects, like Soylent
Green, are people.


Hey! Don't steal my slide content! :P

http://bit.ly/navigating-openstack-community (slide 3)

 So it's not critical that the implementation is the

one the TC wants to bless, what's critical is that the right people are
involved to get to an implementation that the TC would be comfortable
blessing over time. For example, everyone agrees that Ceilometer has
room for improvement, but any implication that the Ceilometer is not
interested in or driving towards those improvements (because of NIH or
whatever) is, as has been pointed out, grossly unfair to the Ceilometer
team.


I certainly have not made such an implication about Ceilometer. What I 
see in the Ceilometer space, though, is that there are clearly a number 
of *active* communities of OpenStack engineers developing code that 
crosses similar problem spaces. I think the TC blessing one of those 
communities before the market has had a chance to do a bit more 
natural filtering of quality is a barrier to innovation. I think having 
all of those separate teams able to contribute code to an openstack/ 
code namespace and naturally work to resolve differences and merge 
innovation is a better fit for a meritocracy.



I think the rest of your plan is a way of recognising this
appropriately, that the current implementation is actually not the
be-all and end-all of how the TC should view a project.


Yes, quite well said.

Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-20 Thread Jay Pipes

On 08/20/2014 05:06 PM, Chris Friesen wrote:

On 08/20/2014 07:21 AM, Jay Pipes wrote:

Hi Thierry, thanks for the reply. Comments inline. :)

On 08/20/2014 06:32 AM, Thierry Carrez wrote:

If we want to follow your model, we probably would have to dissolve
programs as they stand right now, and have blessed categories on one
side, and teams on the other (with projects from some teams being
blessed as the current solution).


Why do we have to have blessed categories at all? I'd like to think of
a day when the TC isn't picking winners or losers at all. Level the
playing field and let the quality of the projects themselves determine
the winner in the space. Stop the incubation and graduation madness and
change the role of the TC to instead play an advisory role to upcoming
(and existing!) projects on the best ways to integrate with other
OpenStack projects, if integration is something that is natural for the
project to work towards.


It seems to me that at some point you need to have a recommended way of
doing things, otherwise it's going to be *really hard* for someone to
bring up an OpenStack installation.


Why can't there be multiple recommended ways of setting up an OpenStack 
installation? Matter of fact, in reality, there already are multiple 
recommended ways of setting up an OpenStack installation, aren't there?


There's multiple distributions of OpenStack, multiple ways of doing 
bare-metal deployment, multiple ways of deploying different message 
queues and DBs, multiple ways of establishing networking, multiple open 
and proprietary monitoring systems to choose from, etc. And I don't 
really see anything wrong with that.



We already run into issues with something as basic as competing SQL
databases.


If the TC suddenly said Only MySQL will be supported, that would not 
mean that the greater OpenStack community would be served better. It 
would just unnecessarily take options away from deployers.


 If every component has several competing implementations and

none of them are official how many more interaction issues are going
to trip us up?


IMO, OpenStack should be about choice. Choice of hypervisor, choice of 
DB and MQ infrastructure, choice of operating systems, choice of storage 
vendors, choice of networking vendors.


If there are multiple actively-developed projects that address the same 
problem space, I think it serves our OpenStack users best to let the 
projects work things out themselves and let the cream rise to the top. 
If the cream ends up being one of those projects, so be it. If the cream 
ends up being a mix of both projects, so be it. The production community 
will end up determining what that cream should be based on what it 
deploys into its clouds and what input it supplies to the teams working 
on competing implementations.


And who knows... what works or is recommended by one deployer may not be 
what is best for another type of deployer and I believe we (the 
TC/governance) do a disservice to our user community by picking a winner 
in a space too early (or continuing to pick a winner in a clearly 
unsettled space).


Just my thoughts on the topic, as they've evolved over the years from 
being a pure developer, to doing QA, then deploy/ops work, and back to 
doing development on OpenStack...


Best,
-jay






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread Jay Pipes

On 08/21/2014 07:58 AM, Chris Dent wrote:

On Thu, 21 Aug 2014, Sean Dague wrote:


By blessing one team what we're saying is all the good ideas pool for
tackling this hard problem can only come from that one team.


This is a big part of this conversation that really confuses me. Who is
that one team?

I don't think it is that team that is being blessed, it is that
project space. That project space ought, if possible, have a team
made up of anyone who is interested. Within that umbrella both
the competition and cooperation that everyone wants can happen.

You're quite right Sean, there is a lot of gravity that comes from
needing to support and slowly migrate the existing APIs. That takes
up quite a lot of resources. It doesn't mean, however, that other
resources can't work on substantial improvements in cooperation with
the rest of the project. Gnocchi and the entire V3 concept in
ceilometer are a good example of this. Some folk are working on that
and some folk are working on maintaining and improving the old
stuff.

Some participants in this thread seem to be saying give some else a
chance. Surely nobody needs to be given the chance, they just need
to join the project and make some contributions? That is how this is
supposed to work isn't it?


Specifically for Ceilometer, many of the folks working on alternate 
implementations have contributed or are actively contributing to 
Ceilometer. Some have stopped contributing because of fundamental 
disagreements about the appropriateness of the Ceilometer architecture. 
Others have begun working on Gnocchi to address design issues, and 
others have joined efforts on Monasca, and others have continued work on 
Stacktach. Eoghan has done an admirable job of informing the TC about 
goings on in the Ceilometer community and being forthright about the 
efforts around Gnocchi. And there isn't any perceived animosity between 
the aforementioned contributor subteams. The point I've been making is 
that by the TC continuing to bless only the Ceilometer project as the 
OpenStack Way of Metering, I think we do a disservice to our users by 
picking a winner in a space that is clearly still unsettled.


Specifically for Triple-O, by making the Deployment program == Triple-O, 
the TC has picked the disk-image-based deployment of an undercloud 
design as The OpenStack Way of Deployment. And as I've said previously 
in this thread, I believe that the deployment space is similarly 
unsettled, and that it would be more appropriate to let the Chef 
cookbooks and Puppet modules currently sitting in the stackforge/ code 
namespace live in the openstack/ code namespace.


I recommended getting rid of the formal Program concept because I didn't 
think it was serving any purpose other than solidifying existing power 
centers and was inhibiting innovation by sending the signal of blessed 
teams/projects, instead of sending a signal of inclusion.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread Jay Pipes

On 08/20/2014 11:54 PM, Clint Byrum wrote:

Excerpts from Jay Pipes's message of 2014-08-20 14:53:22 -0700:

On 08/20/2014 05:06 PM, Chris Friesen wrote:

On 08/20/2014 07:21 AM, Jay Pipes wrote:

...snip

We already run into issues with something as basic as competing SQL
databases.


If the TC suddenly said Only MySQL will be supported, that would not
mean that the greater OpenStack community would be served better. It
would just unnecessarily take options away from deployers.


This is really where supported becomes the mutex binding us all. The
more supported options, the larger the matrix, the more complex a
user's decision process becomes.


I don't believe this is necessarily true.

A large chunk of users of OpenStack will deploy their cloud using one of 
the OpenStack distributions -- RDO, Ubuntu OpenStack, MOS, or one of the 
OpenStack appliances. For these users, they will select the options that 
their distribution offers (or makes for them).


Another chunk of users of OpenStack will deploy their cloud using things 
like the Chef cookbooks or Puppet modules on stackforge. For these 
users, they will select the options that the writers of those Puppet 
modules or Chef cookbooks have wired into the module or cookbook.


Another chunk of users of OpenStack will deploy their cloud by following 
the upstream installation documentation. This documentation currently 
focuses on the integrated projects, and so these users would only be 
deploying the projects that contributed excellent documentation and 
worked with distributors and packagers to make the installation and use 
of their project as easy as possible.


So, I think there is an argument to be made that packagers and deployers 
would have more decisions to make, but not necessarily end-users of 
OpenStack.



   If every component has several competing implementations and

none of them are official how many more interaction issues are going
to trip us up?


IMO, OpenStack should be about choice. Choice of hypervisor, choice of
DB and MQ infrastructure, choice of operating systems, choice of storage
vendors, choice of networking vendors.


Err, uh. I think OpenStack should be about users. If having 400 choices
means users are just confused, then OpenStack becomes nothing and
everything all at once. Choices should be part of the whole not when 1%
of the market wants a choice, but when 20%+ of the market _requires_
a choice.


I believe by picking winners in unsettled spaces, we add more to the 
confusion of users than having 1 option for doing something.



What we shouldn't do is harm that 1%'s ability to be successful. We should
foster it and help it grow, but we don't just pull it into the program and
say You're ALSO in OpenStack now!


I haven't been proposing that these competing projects would be in 
OpenStack now. I have been proposing that these projects live in the 
openstack/ code namespace, as these projects are 100% targeting 
OpenStack installations and users, and they are offering options to 
OpenStack deployers.


I hate the fact that the TC is deciding what is OpenStack.

IMO, we should be instead answering questions like does project X solve 
problem Y for OpenStack users? and can the design of project A be 
adapted to pull in good things from project B? and where can we advise 
project M to put resources that would most benefit OpenStack users?.


 and we also don't want to force those

users to make a hard choice because the better solution is not blessed.


But users are *already* forced to make these choices. They make these 
choices by picking an OpenStack distribution, or by necessity of a 
certain scale, or by their experience and knowledge base of a particular 
technology. Blessing one solution when there are multiple valid 
solutions does not suddenly remove the choice for users.



If there are multiple actively-developed projects that address the same
problem space, I think it serves our OpenStack users best to let the
projects work things out themselves and let the cream rise to the top.
If the cream ends up being one of those projects, so be it. If the cream
ends up being a mix of both projects, so be it. The production community
will end up determining what that cream should be based on what it
deploys into its clouds and what input it supplies to the teams working
on competing implementations.


I'm really not a fan of making it a competitive market. If a space has a
diverse set of problems, we can expect it will have a diverse set of
solutions that overlap. But that doesn't mean they both need to drive
toward making that overlap all-encompassing. Sometimes that happens and
it is good, and sometimes that happens and it causes horrible bloat.


Yes, I recognize the danger that choice brings. I just am more 
optimistic than you about our ability to handle choice. :)



And who knows... what works or is recommended by one deployer may not be
what is best for another type of deployer and I believe we (the
TC/governance) do a disservice to our user community by picking a winner 
in a space too early (or continuing to pick a winner in a clearly 
unsettled space).

Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-22 Thread Jay Pipes

On 08/22/2014 08:33 AM, Thierry Carrez wrote:

Hi everyone,

We all know being a project PTL is an extremely busy job. That's because
in our structure the PTL is responsible for almost everything in a project:

- Release management contact
- Work prioritization
- Keeping bugs under control
- Communicate about work being planned or done
- Make sure the gate is not broken
- Team logistics (run meetings, organize sprints)
- ...

They end up being completely drowned in those day-to-day operational
duties, miss the big picture, can't help in development that much
anymore, get burnt out. Since you're either the PTL or not the PTL,
you're very alone and succession planning is not working that great either.

There have been a number of experiments to solve that problem. John
Garbutt has done an incredible job at helping successive Nova PTLs
handling the release management aspect. Tracy Jones took over Nova bug
management. Doug Hellmann successfully introduced the concept of Oslo
liaisons to get clear point of contacts for Oslo library adoption in
projects. It may be time to generalize that solution.

The issue is one of responsibility: the PTL is ultimately responsible
for everything in a project. If we can more formally delegate that
responsibility, we can avoid getting up to the PTL for everything, we
can rely on a team of people rather than just one person.

Enter the Czar system: each project should have a number of liaisons /
official contacts / delegates that are fully responsible to cover one
aspect of the project. We need to have Bugs czars, which are responsible
for getting bugs under control. We need to have Oslo czars, which serve
as liaisons for the Oslo program but also as active project-local oslo
advocates. We need Security czars, which the VMT can go to to progress
quickly on plugging vulnerabilities. We need release management czars,
to handle the communication and process with that painful OpenStack
release manager. We need Gate czars to serve as first-line-of-contact
getting gate issues fixed... You get the idea.

Some people can be czars of multiple areas. PTLs can retain some czar
activity if they wish. Czars can collaborate with their equivalents in
other projects to share best practices. We just need a clear list of
areas/duties and make sure each project has a name assigned to each.

Now, why czars ? Why not rely on informal activity ? Well, for that
system to work we'll need a lot of people to step up and sign up for
more responsibility. Making them czars makes sure that effort is
recognized and gives them something back. Also if we don't formally
designate people, we can't really delegate and the PTL will still be
directly held responsible. The Release management czar should be able to
sign off release SHAs without asking the PTL. The czars and the PTL
should collectively be the new project drivers.

At that point, why not also get rid of the PTL ? And replace him with a
team of czars ? If the czar system is successful, the PTL should be
freed from the day-to-day operational duties and will be able to focus
on the project health again. We still need someone to keep an eye on the
project-wide picture and coordinate the work of the czars. We need
someone to pick czars, in the event multiple candidates sign up. We also
still need someone to have the final say in case of deadlocked issues.

People say we don't have that many deadlocks in OpenStack for which the
PTL ultimate power is needed, so we could get rid of them. I'd argue
that the main reason we don't have that many deadlocks in OpenStack is
precisely *because* we have a system to break them if they arise. That
encourages everyone to find a lazy consensus. That part of the PTL job
works. Let's fix the part that doesn't work (scaling/burnout).


I think the czars approach is sensible and seems to have worked pretty 
well in a couple projects so far.


And, since I work for a software company with Russian origin, I support 
the term czar as well ;)


On the topic of whether a PTL is still needed once a czar system is put 
in place, I think that should be left up to each individual project to 
decide.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic] [TripleO] How to gracefully quiesce a box?

2014-08-22 Thread Jay Pipes

On 08/22/2014 01:48 PM, Clint Byrum wrote:

It has been brought to my attention that Ironic uses the biggest hammer
in the IPMI toolbox to control chassis power:

https://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ipminative.py#n142

Which is

 ret = ipmicmd.set_power('off', wait)

This is the most abrupt form, where the system power should be flipped
off at a hardware level. The short press on the power button would be
'shutdown' instead of 'off'.

I also understand that this has been brought up before, and that the
answer given was SSH in and shut it down yourself. I can respect that
position, but I have run into a bit of a pickle using it. Observe:

- ssh box.ip poweroff
- poll ironic until power state is off.
   - This is a race. Ironic is asserting the power. As soon as it sees
 that the power is off, it will turn it back on.

- ssh box.ip halt
   - NO way to know that this has worked. Once SSH is off and the network
 stack is gone, I cannot actually verify that the disks were
 unmounted properly, which is the primary area of concern that I
 have.

This is particulary important if I'm issuing a rebuild + preserve
ephemeral, as it is likely I will have lots of I/O going on, and I want
to make sure that it is all quiesced before I reboot to replace the
software and reboot.

Perhaps I missed something. If so, please do educate me on how I can
achieve this without hacking around it. Currently my workaround is to
manually unmount the state partition, which is something system shutdown
is supposed to do and may become problematic if system processes are
holding it open.

It seems to me that Ironic should at least try to use the graceful
shutdown. There can be a timeout, but it would need to be something a user
can disable so if graceful never works we never just dump the power on the
box. Even a journaled filesystem will take quite a bit to do a full fsck.

The inability to gracefully shutdown in a reasonable amount of time
is an error state really, and I need to go to the box and inspect it,
which is precisely the reason we have ERROR states.


What about placing a runlevel script in /etc/init.d/ and symlinking it 
to run on shutdown -- i.e. /etc/rc0.d/? You could run fsync or unmount 
the state partition in that script which would ensure disk state was 
quiesced, no?
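
As a very rough sketch (purely illustrative -- the script name, rc 
symlink and mount point are assumptions for the example, not anything 
Ironic or TripleO ships), such a hook might look like:

#!/usr/bin/env python
# Illustrative shutdown hook only. Something like this could live at
# /etc/init.d/quiesce-state with a symlink at /etc/rc0.d/K01quiesce-state
# so it runs while the system is going down.
import subprocess
import sys

STATE_MOUNTPOINT = '/mnt/state'  # hypothetical state/ephemeral partition


def main():
    # Flush dirty pages to disk first.
    subprocess.call(['sync'])
    # Then cleanly unmount the state partition before power is cut.
    return subprocess.call(['umount', STATE_MOUNTPOINT])


if __name__ == '__main__':
    sys.exit(main())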


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-24 Thread Jay Pipes

On 08/23/2014 06:35 PM, Clint Byrum wrote:

 I agree as well. PTL is a servant of the community, as any good leader
is. If the PTL feels they have to drop the hammer, or if an impasse is
reached where they are asked to, it is because they have failed to get
everyone communicating effectively, not because that's their job.


The problem isn't really that teams are not communicating effectively, 
nor is the problem to do with some deficit of a PTL in either putting 
the hammer down or failing to figure out common ground.


The issue in my opinion and my experience is that there are multiple 
valid ways of doing something (say, deployment or metering or making 
toast) and the TC and our governing structure has decided to pick 
winners in spaces instead of having a big tent and welcoming different 
solutions and projects into the OpenStack fold. We pick winners and by 
doing so, we are exclusionary, and this exclusivity does not benefit our 
user community, but rather just gives it fewer options.


IMHO, the TC should become an advisory team that recommends to 
interested project teams ways in which they can design and architect 
their projects to integrate well with other projects in the OpenStack 
community, and design their projects for the scale, stability and 
requirements (such as multi-tenancy) that an open cloud software 
ecosystem demands.


Just my two cents,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-08-25 Thread Jay Pipes

On 08/25/2014 11:10 AM, Joe Cropper wrote:

Hello,

Is our long-term vision to allow a VMs to be dynamically added/removed
from a group?  That is, unless I'm overlooking something, it appears
that you can only add a VM to a server group at VM boot time and
effectively remove it by deleting the VM?

Just curious if this was a design point, or merely an approach at a
staged implementation [that might welcome some additions]?  :)


See here:

http://lists.openstack.org/pipermail/openstack-dev/2014-April/033746.html

If I had my druthers, I would revert the whole extension.

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-08-25 Thread Jay Pipes

On 08/25/2014 11:31 AM, Joe Cropper wrote:

Thanks Jay.  Those are the same types of questions I was pondering as
well when debating how someone might use this.  I think what we have
is fine for a first pass, but that's what I was poking at... whether
some of the abilities to add/remove members dynamically could exist
(e.g., I no longer want this VM to have an anti-affinity policy
relative to the others, etc.).


I guess what I was getting at is that I think the whole interface is 
flawed and it's not worth putting in the effort to make it slightly less 
flawed.


Best,
-jay


- Joe

On Mon, Aug 25, 2014 at 10:16 AM, Jay Pipes jaypi...@gmail.com wrote:

On 08/25/2014 11:10 AM, Joe Cropper wrote:


Hello,

Is our long-term vision to allow a VMs to be dynamically added/removed
from a group?  That is, unless I'm overlooking something, it appears
that you can only add a VM to a server group at VM boot time and
effectively remove it by deleting the VM?

Just curious if this was a design point, or merely an approach at a
staged implementation [that might welcome some additions]?  :)



See here:

http://lists.openstack.org/pipermail/openstack-dev/2014-April/033746.html

If I had my druthers, I would revert the whole extension.

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] What does NASA not using OpenStack mean to OS's future

2014-08-25 Thread Jay Pipes

On 08/25/2014 12:08 PM, Aryeh Friedman wrote:

http://www.quora.com/Why-would-the-creators-of-OpenStack-the-market-leader-in-cloud-computing-platforms-refuse-to-use-it-and-use-AWS-instead


Would you mind please not posting to the developer's mailing list 
inflammatory random web pages?


Thanks,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-26 Thread Jay Pipes

On 08/25/2014 03:50 PM, Adam Lawson wrote:

I recognize I'm joining the discussion late but I've been following the
dialog fairly closely and want to offer my perspective FWIW. I have a
lot going through my head, not sure how to get it all out there so I'll
do a brain dump, get some feedback and apologize in advance.

One the things I like most about Openstack is its incredible flexibility
- a modular architecture where certain programs/capabilities can be
leveraged for a specific install - or not, and ideally the rest of the
feature suite remains functional irrespective of a program status. When
it comes to a program being approved as part of Openstack Proper (pardon
my stepping over that discussion), I think a LOT of what is being
discussed here touches on what Openstack will ultimately be about and
what it won't.

With products like Cloudstack floating around consuming market share,
all I see is Citrix. A product billed as open source but so closely
aligned with one vendor that it almost doesn't matter. They have matured
decision structure, UI polish and organized support but they don't have
community. Not like us anyway. With Openstack the moral authority to
call ourselves the champions of open cloud and with that, we have
competing interests that make our products better. We don't have a
single vendor (yet) that dictates whether something will happen or not.
The maturity of the Openstack products themselves are driven by a
community of consumers where the needs are accommodated rather than sold.

A positive that comes with such a transparent design pipeline is the
increased capability for design agility and accommodating changes when a
change is needed. But I'm becoming increasingly disappointed at the
amount of attention being given to whether one product is blessed by
Openstack or not. In a modular design, these programs should be
interchangeable with only a couple exceptions. Does being blessed really
matter? The consensus I've garnered in this thread is the desperate need
for the consuming community's continued involvement. What I
/haven't/ heard much about is how Openstack can standardize how these
programs - blessed or not - can interact with the rest of the suite to
the extent they adhere to the correct inputs/outputs which makes them
functional. Program status is irrelevant.

I guess when it comes right down to it, I love what Openstack is and
where we're going and I especially appreciate these discussions. But I'm
disappointed at the number of concerns I've been reading about things
that ultimately don't matter (like being blessed, about who has the
power, etc) and I have concerns we lose sight of what this is all about to
the point that the vision for Openstack gets clouded.

We have a good thing and no project can accommodate every request so a
decision must be made as to what is 'included' and what is 'supported'.
But with modularity, it really doesn't matter one iota if a program is
blessed in the Openstack integrated release cycle or not.


Couldn't agree with you more, Adam. I believe if OpenStack is to succeed 
in the future, our community and our governance structure needs to 
embrace the tremendous growth in scope that OpenStack's success to-date 
has generated. The last thing we should do, IMO, is reverse course and 
act like a single-vendor product in order to tame the wildlings.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] [neutron] [tc] Neutron Incubator workflow

2014-08-27 Thread Jay Pipes

On 08/26/2014 07:09 PM, James E. Blair wrote:

Hi,

After reading https://wiki.openstack.org/wiki/Network/Incubator I have
some thoughts about the proposed workflow.

We have quite a bit of experience and some good tools around splitting
code out of projects and into new projects.  But we don't generally do a
lot of importing code into projects.  We've done this once, to my
recollection, in a way that preserved history, and that was with the
switch to keystone-lite.

It wasn't easy; it's major git surgery and would require significant
infra-team involvement any time we wanted to do it.

However, reading the proposal, it occurred to me that it's pretty clear
that we expect these tools to be able to operate outside of the Neutron
project itself, to even be releasable on their own.  Why not just stick
with that?  In other words, the goal of this process should be to create
separate projects with their own development lifecycle that will
continue indefinitely, rather than expecting the code itself to merge
into the neutron repo.

This has advantages in simplifying workflow and making it more
consistent.  Plus it builds on known integration mechanisms like APIs
and python project versions.

But more importantly, it helps scale the neutron project itself.  I
think that a focused neutron core upon which projects like these can
build on in a reliable fashion would be ideal.


Despite replies to you saying that certain branches of Neutron 
development work are special unicorns, I wanted to say I *fully* support 
your above statement.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] refactoring of resize/migrate

2014-08-27 Thread Jay Pipes

On 08/27/2014 06:41 AM, Markus Zoeller wrote:

The review of the spec for blueprint hot-resize has several comments
about the need to refactor the existing code base of resize and
migrate before the blueprint could be considered (see [1]).
I'm interested in the outcome of this blueprint, so I want to offer
my support. How can I participate?

[1] https://review.openstack.org/95054


Are you offering support to refactor resize/migrate, or are you offering 
support to work only on the hot-resize functionality?


I'm very much interested in refactoring the resize/migrate 
functionality, and would appreciate any help and insight you might have. 
Unfortunately, such a refactoring:


a) Must start in Kilo
b) Begins with un-crufting the simply horrible, inconsistent, and 
duplicative REST API and public behaviour of the resize and migrate actions


In any case, I'm happy to start the conversation about this going in 
about a month or so, or whenever Kilo blueprints open up. Until then, 
we're pretty much working on reviews for already-approved blueprints and 
bug fixing.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [zaqar] [marconi] Juno Performance Testing (Round 1)

2014-08-28 Thread Jay Pipes

On 08/26/2014 05:41 PM, Kurt Griffiths wrote:

 * uWSGI + gevent
 * config: http://paste.openstack.org/show/100592/
 * app.py: http://paste.openstack.org/show/100593/


Hi Kurt!

Thanks for posting the benchmark configuration and results. Good stuff :)

I'm curious about what effect removing http-keepalive from the uWSGI 
config would make. AIUI, for systems that need to support lots and lots 
of random reads/writes from lots of tenants, using keepalive sessions 
would cause congestion for incoming new connections, and may not be 
appropriate for such systems.


Totally not a big deal; really, just curious if you'd run one or more of 
the benchmarks with keepalive turned off and what results you saw.


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] [neutron] [tc] Neutron Incubator workflow

2014-08-28 Thread Jay Pipes

On 08/27/2014 04:28 PM, Kevin Benton wrote:

What are you talking about? The only reply was from me clarifying that
one of the purposes of the incubator was for components of neutron that
are experimental but are intended to be merged.


Right. The special unicorns.

 In that case it might

not make sense to have a life cycle of their own in another repo
indefinitely.


The main reasons these experimental components don't make sense to 
live in their own repo indefinitely are:


a) Neutron's design doesn't make it easy or straightforward to 
build/layer other things on top of it, or:


b) The experimental piece of code intends to replace whole-hog a large 
chunk of Neutron's existing codebase, or:


c) The experimental piece of code relies so heavily on inconsistent, 
unversioned internal interfaces and plugin calls that it cannot be 
developed externally due to the fragility of those interfaces


Fixing a) is the solution to these problems. An incubator area where 
experimental components can live will just continue to mask the true 
problem domain, which is that Neutron's design is cumbersome to build on 
top of, and its cross-component interfaces need to be versioned, made 
consistent, and cleaned up to use versioned data structures instead of 
passing random nested dicts of randomly-prefixed string key/values.


Frankly, we're going through a similar problem in Nova right now. There 
is a group of folks who believe that separating the nova-scheduler code 
into the Gantt project will magically make placement decision code and 
solver components *easier* to work on (because the pace of coding can be 
increased if there wasn't that pesky nova-core review process). But this 
is not correct, IMO. Separating out the scheduler into its own project 
before internal interfaces and data structures are cleaned up and 
versioned will just lead to greater technical debt and an increase in 
frustration on the part of Nova developers and scheduler developers alike.


-jay


On Wed, Aug 27, 2014 at 11:52 AM, Jay Pipes jaypi...@gmail.com wrote:

On 08/26/2014 07:09 PM, James E. Blair wrote:

Hi,

After reading
https://wiki.openstack.org/wiki/Network/Incubator I have
some thoughts about the proposed workflow.

We have quite a bit of experience and some good tools around
splitting
code out of projects and into new projects.  But we don't
generally do a
lot of importing code into projects.  We've done this once, to my
recollection, in a way that preserved history, and that was with the
switch to keystone-lite.

It wasn't easy; it's major git surgery and would require significant
infra-team involvement any time we wanted to do it.

However, reading the proposal, it occurred to me that it's
pretty clear
that we expect these tools to be able to operate outside of the
Neutron
project itself, to even be releasable on their own.  Why not
just stick
with that?  In other words, the goal of this process should be
to create
separate projects with their own development lifecycle that will
continue indefinitely, rather than expecting the code itself to
merge
into the neutron repo.

This has advantages in simplifying workflow and making it more
consistent.  Plus it builds on known integration mechanisms like
APIs
and python project versions.

But more importantly, it helps scale the neutron project itself.  I
think that a focused neutron core upon which projects like these can
build on in a reliable fashion would be ideal.


Despite replies to you saying that certain branches of Neutron
development work are special unicorns, I wanted to say I *fully*
support your above statement.

Best,
-jay



_
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--
Kevin Benton


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [neutron] Specs for K release

2014-08-28 Thread Jay Pipes

On 08/28/2014 12:50 PM, Michael Still wrote:

On Thu, Aug 28, 2014 at 6:53 AM, Daniel P. Berrange berra...@redhat.com wrote:

On Thu, Aug 28, 2014 at 11:51:32AM +, Alan Kavanagh wrote:

How to do we handle specs that have slipped through the cracks
and did not make it for Juno?


Rebase the proposal so it is under the 'kilo' directory path
instead of 'juno' and submit it for review again. Make sure
to keep the ChangeId line intact so people see the history
of any review comments in the earlier Juno proposal.


Yes, but...

I think we should talk about tweaking the structure of the juno
directory. Something like having proposed, approved, and implemented
directories. That would provide better signalling to operators about
what we actually did, what we thought we'd do, and what we didn't do.


I think this would be really useful.

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Design Summit reloaded

2014-08-28 Thread Jay Pipes

On 08/27/2014 11:34 AM, Doug Hellmann wrote:


On Aug 27, 2014, at 8:51 AM, Thierry Carrez thie...@openstack.org
wrote:


Hi everyone,

I've been thinking about what changes we can bring to the Design
Summit format to make it more productive. I've heard the feedback
from the mid-cycle meetups and would like to apply some of those
ideas for Paris, within the constraints we have (already booked
space and time). Here is something we could do:

Day 1. Cross-project sessions / incubated projects / other
projects

I think that worked well last time. 3 parallel rooms where we can
address top cross-project questions, discuss the results of the
various experiments we conducted during juno. Don't hesitate to
schedule 2 slots for discussions, so that we have time to come to
the bottom of those issues. Incubated projects (and maybe other
projects, if space allows) occupy the remaining space on day 1, and
could occupy pods on the other days.


If anything, I’d like to have fewer cross-project tracks running
simultaneously. Depending on which are proposed, maybe we can make
that happen. On the other hand, cross-project issues is a big theme
right now so maybe we should consider devoting more than a day to
dealing with them.


I agree with Doug here. I'd almost say having a single cross-project 
room, with serialized content would be better than 3 separate 
cross-project tracks. By nature, the cross-project sessions will attract 
developers that work or are interested in a set of projects that looks 
like a big Venn diagram. By having 3 separate cross-project tracks, we 
would increase the likelihood that developers would once more have to 
choose among simultaneous sessions that they have equal interest in. For 
Infra and QA folks, this likelihood is even greater...


I think I'd prefer a single cross-project track on the first day.


Day 2 and Day 3. Scheduled sessions for various programs

That's our traditional scheduled space. We'll have a 33% less
slots available. So, rather than trying to cover all the scope, the
idea would be to focus those sessions on specific issues which
really require face-to-face discussion (which can't be solved on
the ML or using spec discussion) *or* require a lot of user
feedback. That way, appearing in the general schedule is very
helpful. This will require us to be a lot stricter on what we
accept there and what we don't -- we won't have space for courtesy
sessions anymore, and traditional/unnecessary sessions (like my
traditional release schedule one) should just move to the
mailing-list.


The message I’m getting from this change in available space is that
we need to start thinking about and writing up ideas early, so teams
can figure out which upcoming specs need more discussion and which
don’t.


++

Also, I think as a community we need to get much better about saying 
No to certain things. No to sessions that don't have much specific 
detail to them. No to blueprints that don't add functionality that can 
be widely used or taken advantage of. No to specs that don't have 
a narrow-enough scope, etc.


I also think we need to be better at saying Yes to other things, 
though... but that's a different thread ;)



Day 4. Contributors meetups

On the last day, we could try to split the space so that we can
conduct parallel midcycle-meetup-like contributors gatherings, with
no time boundaries and an open agenda. Large projects could get a
full day, smaller projects would get half a day (but could continue
the discussion in a local bar). Ideally that meetup would end with
some alignment on release goals, but the idea is to make the best
of that time together to solve the issues you have. Friday would
finish with the design summit feedback session, for those who are
still around.


This is a good compromise between needing to allow folks to move
around between tracks (including speaking at the conference) and
having a large block of unstructured time for deep dives.


Agreed.

Best,
-jay


I think this proposal makes the best use of our setup: discuss
clear cross-project issues, address key specific topics which need
face-to-face time and broader attendance, then try to replicate
the success of midcycle meetup-like open unscheduled time to
discuss whatever is hot at this point.

There are still details to work out (is it possible split the
space, should we use the usual design summit CFP website to
organize the scheduled time...), but I would first like to have
your feedback on this format. Also if you have alternative
proposals that would make a better use of our 4 days, let me know.

Cheers,

-- Thierry Carrez (ttx)

___ OpenStack-dev
mailing list OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___ OpenStack-dev mailing
list OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





Re: [openstack-dev] [all] Design Summit reloaded

2014-08-28 Thread Jay Pipes

On 08/28/2014 02:21 PM, Sean Dague wrote:

On 08/28/2014 01:58 PM, Jay Pipes wrote:

On 08/27/2014 11:34 AM, Doug Hellmann wrote:


On Aug 27, 2014, at 8:51 AM, Thierry Carrez thie...@openstack.org
wrote:


Hi everyone,

I've been thinking about what changes we can bring to the Design
Summit format to make it more productive. I've heard the feedback
from the mid-cycle meetups and would like to apply some of those
ideas for Paris, within the constraints we have (already booked
space and time). Here is something we could do:

Day 1. Cross-project sessions / incubated projects / other
projects

I think that worked well last time. 3 parallel rooms where we can
address top cross-project questions, discuss the results of the
various experiments we conducted during juno. Don't hesitate to
schedule 2 slots for discussions, so that we have time to come to
the bottom of those issues. Incubated projects (and maybe other
projects, if space allows) occupy the remaining space on day 1, and
could occupy pods on the other days.


If anything, I’d like to have fewer cross-project tracks running
simultaneously. Depending on which are proposed, maybe we can make
that happen. On the other hand, cross-project issues is a big theme
right now so maybe we should consider devoting more than a day to
dealing with them.


I agree with Doug here. I'd almost say having a single cross-project
room, with serialized content would be better than 3 separate
cross-project tracks. By nature, the cross-project sessions will attract
developers that work or are interested in a set of projects that looks
like a big Venn diagram. By having 3 separate cross-project tracks, we
would increase the likelihood that developers would once more have to
choose among simultaneous sessions that they have equal interest in. For
Infra and QA folks, this likelihood is even greater...

I think I'd prefer a single cross-project track on the first day.


So the fallout of that is there will be 6 or 7 cross-project slots for
the design summit. Maybe that's the right mix if the TC does a good job
picking the top 5 things we want accomplished from a cross project
standpoint during the cycle. But it's going to have to be a pretty
directed pick. I think last time we had 21 slots, and with a couple of
doubling up that gave 19 sessions. (about 30 - 35 proposals for that
slot set).


I'm not sure that would be a bad thing :)

I think one of the reasons the mid-cycles have been successful is that 
they have adequately limited the scope of discussions and I think by 
doing our homework by fully vetting and voting on cross-project sessions 
and being OK with saying "No, not this time", we will be more 
productive than if we had 20+ cross-project sessions.


Just my two cents, though..

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Is the BP approval process broken?

2014-08-28 Thread Jay Pipes

On 08/27/2014 09:04 PM, Dugger, Donald D wrote:

I’ll try and not whine about my pet project but I do think there is a
problem here.  For the Gantt project to split out the scheduler there is
a crucial BP that needs to be implemented (
https://review.openstack.org/#/c/89893/ ) and, unfortunately, the BP has
been rejected and we’ll have to try again for Kilo.  My question is did
we do something wrong or is the process broken?

Note that we originally proposed the BP on 4/23/14, went through 10
iterations to the final version on 7/25/14 and the final version got
three +1s and a +2 by 8/5.  Unfortunately, even after reaching out to
specific people, we didn’t get the second +2, hence the rejection.

I understand that reviews are a burden and very hard but it seems wrong
that a BP with multiple positive reviews and no negative reviews is
dropped because of what looks like indifference.


I would posit that this is not actually indifference. The reason that 
there may not have been a second +2 from a core team member may very well have 
been that the core team members did not feel that the blueprint's 
priority was high enough to put before other work, or that the core team 
members did not have the time to comment on the spec (due to them not 
feeling the blueprint had the priority to justify the time to do a full 
review).


Note that I'm not a core drivers team member.

Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Design Summit reloaded

2014-08-28 Thread Jay Pipes

On 08/28/2014 03:31 PM, Sean Dague wrote:

On 08/28/2014 03:06 PM, Jay Pipes wrote:

On 08/28/2014 02:21 PM, Sean Dague wrote:

On 08/28/2014 01:58 PM, Jay Pipes wrote:

On 08/27/2014 11:34 AM, Doug Hellmann wrote:


On Aug 27, 2014, at 8:51 AM, Thierry Carrez thie...@openstack.org
wrote:


Hi everyone,

I've been thinking about what changes we can bring to the Design
Summit format to make it more productive. I've heard the feedback
from the mid-cycle meetups and would like to apply some of those
ideas for Paris, within the constraints we have (already booked
space and time). Here is something we could do:

Day 1. Cross-project sessions / incubated projects / other
projects

I think that worked well last time. 3 parallel rooms where we can
address top cross-project questions, discuss the results of the
various experiments we conducted during juno. Don't hesitate to
schedule 2 slots for discussions, so that we have time to come to
the bottom of those issues. Incubated projects (and maybe other
projects, if space allows) occupy the remaining space on day 1, and
could occupy pods on the other days.


If anything, I’d like to have fewer cross-project tracks running
simultaneously. Depending on which are proposed, maybe we can make
that happen. On the other hand, cross-project issues is a big theme
right now so maybe we should consider devoting more than a day to
dealing with them.


I agree with Doug here. I'd almost say having a single cross-project
room, with serialized content would be better than 3 separate
cross-project tracks. By nature, the cross-project sessions will attract
developers that work or are interested in a set of projects that looks
like a big Venn diagram. By having 3 separate cross-project tracks, we
would increase the likelihood that developers would once more have to
choose among simultaneous sessions that they have equal interest in. For
Infra and QA folks, this likelihood is even greater...

I think I'd prefer a single cross-project track on the first day.


So the fallout of that is there will be 6 or 7 cross-project slots for
the design summit. Maybe that's the right mix if the TC does a good job
picking the top 5 things we want accomplished from a cross project
standpoint during the cycle. But it's going to have to be a pretty
directed pick. I think last time we had 21 slots, and with a couple of
doubling up that gave 19 sessions. (about 30 - 35 proposals for that
slot set).


I'm not sure that would be a bad thing :)

I think one of the reasons the mid-cycles have been successful is that
they have adequately limited the scope of discussions and I think by
doing our homework by fully vetting and voting on cross-project sessions
and being OK with saying "No, not this time", we will be more
productive than if we had 20+ cross-project sessions.

Just my two cents, though..


I'm not sure it would be a bad thing either. I just wanted to be
explicit about what we are saying the cross projects sessions are for in
this case: the 5 key cross project activities the TC believes should be
worked on this next cycle.


Yes.


The other question is if we did that what's running in competition to
cross project day? Is it another free form pod day for people not
working on those things?


It could be a pod day, sure. Or just an extended hallway session day... :)

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Is the BP approval process broken?

2014-08-28 Thread Jay Pipes

On 08/28/2014 04:05 PM, Chris Friesen wrote:

On 08/28/2014 01:44 PM, Jay Pipes wrote:

On 08/27/2014 09:04 PM, Dugger, Donald D wrote:



I understand that reviews are a burden and very hard but it seems wrong
that a BP with multiple positive reviews and no negative reviews is
dropped because of what looks like indifference.


I would posit that this is not actually indifference. The reason that
there may not have been a second +2 from a core team member may very well have
been that the core team members did not feel that the blueprint's
priority was high enough to put before other work, or that the core team
members did not have the time to comment on the spec (due to them not
feeling the blueprint had the priority to justify the time to do a full
review).


The overall scheduler-lib Blueprint is marked with a high priority
at http://status.openstack.org/release/.  Hopefully that would apply
to sub-blueprints as well.


a) There are no sub-blueprints to that scheduler-lib blueprint

b) If there were sub-blueprints, that does not mean that they would 
necessarily take the same priority as their parent blueprint


c) There's no reason priorities can't be revisited when necessary

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Is the BP approval process broken?

2014-08-28 Thread Jay Pipes


On 08/28/2014 04:42 PM, Dugger, Donald D wrote:

I would contend that that right there is an indication that there's a
problem with the process.  You submit a BP and then you have no idea
of what is happening and no way of addressing any issues.  If the
priority is wrong I can explain why I think the priority should be
higher, getting stonewalled leaves me with no idea what's wrong and
no way to address any problems.

I think, in general, almost everyone is more than willing to adjust
proposals based upon feedback.  Tell me what you think is wrong and
I'll either explain why the proposal is correct or I'll change it to
address the concerns.


In many of the Gantt IRC meetings as well as the ML, I and others have 
repeatedly raised concerns about the scheduler split being premature and 
not a priority compared to the cleanup of the internal interfaces around 
the resource tracker and scheduler. This feedback was echoed in the 
mid-cycle meetup session as well. Sylvain and I have begun the work of 
cleaning up those interfaces and fixing the bugs around non-versioned 
data structures and inconsistent calling interfaces in the scheduler and 
resource tracker. Progress is being made towards these things.



Trying to deal with silence is really hard and really frustrating.
Especially given that we're not supposed to spam the mailing list, it's
really hard to know what to do.  I don't know the solution but we
need to do something.  More core team members would help, maybe
something like an automatic timeout where BPs/patches with no
negative scores and no activity for a week get flagged for special
handling.


Yes, I think flagging blueprints for special handling would be a good 
thing. Keep in mind, though, that there are an enormous number of 
proposed specifications, with the vast majority of folks only caring 
about their own proposed specs, and very few doing reviews on anything 
other than their own patches or specific area of interest.


Doing reviews on other folks' patches and blueprints would certainly 
help in this regard. If cores only see someone contributing to a small, 
isolated section of the code or only to their own blueprints/patches, 
they generally tend to implicitly down-play that person's reviews in 
favor of patches/blueprints from folks that are reviewing non-related 
patches and contributing to reduce the total review load.


I understand your frustration about the silence, but the silence from 
core team members may actually be a loud statement about where their 
priorities are.


Best,
-jay


I feel we need to change the process somehow.

-- Don Dugger Censeo Toto nos in Kansa esse decisse. - D. Gale Ph:
303/443-3786

-Original Message- From: Jay Pipes
[mailto:jaypi...@gmail.com] Sent: Thursday, August 28, 2014 1:44 PM
To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev]
[nova] Is the BP approval process broken?

On 08/27/2014 09:04 PM, Dugger, Donald D wrote:

I'll try and not whine about my pet project but I do think there is
a problem here.  For the Gantt project to split out the scheduler
there is a crucial BP that needs to be implemented (
https://review.openstack.org/#/c/89893/ ) and, unfortunately, the
BP has been rejected and we'll have to try again for Kilo.  My
question is did we do something wrong or is the process broken?

Note that we originally proposed the BP on 4/23/14, went through
10 iterations to the final version on 7/25/14 and the final version
got three +1s and a +2 by 8/5.  Unfortunately, even after reaching
out to specific people, we didn't get the second +2, hence the
rejection.

I understand that reviews are a burden and very hard but it seems
wrong that a BP with multiple positive reviews and no negative
reviews is dropped because of what looks like indifference.


I would posit that this is not actually indifference. The reason that
there may not have been a second +2 from a core team member may very well
have been that the core team members did not feel that the
blueprint's priority was high enough to put before other work, or
that the core team members did not have the time to comment on the spec
(due to them not feeling the blueprint had the priority to justify
the time to do a full review).

Note that I'm not a core drivers team member.

Best, -jay


___ OpenStack-dev mailing
list OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing
list OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Lack of consistency in returning response from tempest clients

2014-08-29 Thread Jay Pipes

On 08/29/2014 10:19 AM, David Kranz wrote:

While reviewing patches for moving response checking to the clients, I
noticed that there are places where client methods do not return any value.
This is usually, but not always, a delete method. IMO, every REST client
method should return at least the response. Some services return just
the response for delete methods and others return (resp, body). Does
anyone object to cleaning this up by just making all client methods return
resp, body? This is mostly a change to the clients. There were only a
few places where a non-delete method was returning just a body that was
used in test code.


Sounds good to me. :)

-jay
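
[Editorial note: the following is a minimal, hypothetical sketch of the
convention being proposed -- not actual Tempest code; the client class,
service and endpoints are made up for illustration. Every method, including
delete, hands back the same (resp, body) tuple:]

```python
# Illustrative sketch only: a hypothetical Tempest-style REST client where
# every method, including delete, returns (resp, body).
import json

import httplib2


class WidgetClient(object):
    """Hypothetical service client following the (resp, body) convention."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.http = httplib2.Http()

    def _request(self, method, url, body=None):
        resp, raw = self.http.request(
            self.base_url + url, method,
            body=json.dumps(body) if body else None,
            headers={'Content-Type': 'application/json'})
        return resp, json.loads(raw) if raw else None

    def create_widget(self, name):
        return self._request('POST', '/widgets', {'widget': {'name': name}})

    def list_widgets(self):
        return self._request('GET', '/widgets')

    def delete_widget(self, widget_id):
        # Previously some clients returned nothing (or only resp) here;
        # the proposal is to always return the same tuple shape.
        return self._request('DELETE', '/widgets/%s' % widget_id)
```

Test code can then unpack `resp, body = client.delete_widget(...)` exactly as
it would for any other call, which is the consistency being asked for.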


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Is the BP approval process broken?

2014-08-29 Thread Jay Pipes

On 08/29/2014 12:25 PM, Zane Bitter wrote:

On 28/08/14 17:02, Jay Pipes wrote:

I understand your frustration about the silence, but the silence from
core team members may actually be a loud statement about where their
priorities are.


I don't know enough about the Nova review situation to say if the
process is broken or not. But I can say that if passive-aggressively
ignoring people is considered a primary communication channel, something
is definitely broken.


Nobody is ignoring anyone. There have been ongoing conversations about the 
scheduler and Gantt, and those conversations haven't resulted in all the 
decisions that Don would like. That is unfortunate, but it's not a sign 
of a broken process.


-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] nova backup not working in stable/icehouse?

2014-08-29 Thread Jay Pipes

On 08/29/2014 02:48 AM, Preston L. Bannister wrote:

Looking to put a proper implementation of instance backup into
OpenStack. Started by writing a simple set of baseline tests and running
against the stable/icehouse branch. They failed!

https://github.com/dreadedhill-work/openstack-backup-scripts

Scripts and configuration are in the above. Simple tests.

At first I assumed there was a configuration error in my Devstack ...
but at this point I believe the errors are in fact in OpenStack. (Also I
have rather more colorful things to say about what is and is not logged.)

Try to back up bootable Cinder volumes attached to instances ... and all
fail. Try to back up instances booted from images, and all-but-one fail
(without logged errors, so far as I see).

Was concerned about preserving existing behaviour (as I am currently
hacking the Nova backup API), but ... if the existing is badly broken,
this may not be a concern. (Makes my job a bit simpler.)

If someone is using nova backup successfully (more than one backup at
a time), I *would* rather like to know!

Anyone with different experience?


IMO, the create_backup API extension should be removed from the Compute 
API. It's completely unnecessary and backups should be the purview of 
external (to Nova) scripts or configuration management modules. This API 
extension is essentially trying to be a Cloud Cron, which is 
inappropriate for the Compute API, IMO.


-jay
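
[Editorial note: as a rough illustration of the "external script" approach,
here is a sketch only -- the metadata tag, naming scheme and credentials are
placeholders, not a recommended implementation -- of something that could be
driven by cron instead of the in-API backup action:]

```python
#!/usr/bin/env python
# Sketch of an external, cron-driven instance backup -- illustrative only.
# Assumes an Icehouse-era python-novaclient and OS_* credentials in the
# environment; rotation/expiry of old snapshots is left to a separate job.
import datetime
import os

from novaclient import client


def backup_tagged_instances():
    nova = client.Client('2',
                         os.environ['OS_USERNAME'],
                         os.environ['OS_PASSWORD'],
                         os.environ['OS_TENANT_NAME'],
                         os.environ['OS_AUTH_URL'])
    stamp = datetime.datetime.utcnow().strftime('%Y%m%d-%H%M%S')
    for server in nova.servers.list():
        # Placeholder policy: only back up instances tagged in their metadata.
        if server.metadata.get('backup') != 'true':
            continue
        # create_image takes a server snapshot (image upload).
        nova.servers.create_image(server,
                                  '%s-backup-%s' % (server.name, stamp))


if __name__ == '__main__':
    backup_tagged_instances()
```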


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests

2014-09-04 Thread Jay Pipes

On 08/29/2014 05:15 PM, Zane Bitter wrote:

On 29/08/14 14:27, Jay Pipes wrote:

On 08/26/2014 10:14 AM, Zane Bitter wrote:

Steve Baker has started the process of moving Heat tests out of the
Tempest repository and into the Heat repository, and we're looking for
some guidance on how they should be packaged in a consistent way.
Apparently there are a few projects already packaging functional tests
in the package projectname.tests.functional (alongside
projectname.tests.unit for the unit tests).

That strikes me as odd in our context, because while the unit tests run
against the code in the package in which they are embedded, the
functional tests run against some entirely different code - whatever
OpenStack cloud you give it the auth URL and credentials for. So these
tests run from the outside, just like their ancestors in Tempest do.

There's all kinds of potential confusion here for users and packagers.
None of it is fatal and all of it can be worked around, but if we
refrain from doing the thing that makes zero conceptual sense then there
will be no problem to work around :)

I suspect from reading the previous thread about In-tree functional
test vision that we may actually be dealing with three categories of
test here rather than two:

* Unit tests that run against the package they are embedded in
* Functional tests that run against the package they are embedded in
* Integration tests that run against a specified cloud

i.e. the tests we are now trying to add to Heat might be qualitatively
different from the projectname.tests.functional suites that already
exist in a few projects. Perhaps someone from Neutron and/or Swift can
confirm?

I'd like to propose that tests of the third type get their own top-level
package with a name of the form projectname-integrationtests (second
choice: projectname-tempest on the principle that they're essentially
plugins for Tempest). How would people feel about standardising that
across OpenStack?


By its nature, Heat is one of the only projects that would have
integration tests of this nature. For Nova, there are some functional
tests in nova/tests/integrated/ (yeah, badly named, I know) that are
tests of the REST API endpoints and running service daemons (the things
that are RPC endpoints), with a bunch of stuff faked out (like RPC
comms, image services, authentication and the hypervisor layer itself).
So, the integrated tests in Nova are really not testing integration
with other projects, but rather integration of the subsystems and
processes inside Nova.

I'd support a policy that true integration tests -- tests that test the
interaction between multiple real OpenStack service endpoints -- be left
entirely to Tempest. Functional tests that test interaction between
daemons and processes internal to a project should go into
/$project/tests/functional/.

For Heat, I believe tests that rely on faked-out other OpenStack
services but stress the interaction between internal Heat
daemons/processes should be in /heat/tests/functional/ and any tests that 
rely on working, real OpenStack service endpoints should be in Tempest.


Well, the problem with that is that last time I checked there was
exactly one Heat scenario test in Tempest because tempest-core doesn't
have the bandwidth to merge all (any?) of the other ones folks submitted.

So we're moving them to openstack/heat for the pure practical reason
that it's the only way to get test coverage at all, rather than concerns
about overloading the gate or theories about the best venue for
cross-project integration testing.


Hmm, speaking of passive aggressivity...

Where can I see a discussion of the Heat integration tests with Tempest 
QA folks? If you give me some background on what efforts have been made 
already and what is remaining to be reviewed/merged/worked on, then I 
can try to get some resources dedicated to helping here.


I would greatly prefer just having a single source of integration 
testing in OpenStack, versus going back to the bad ol' days of everybody 
under the sun rewriting their own.


Note that I'm not talking about functional testing here, just the 
integration testing...


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation

2014-09-04 Thread Jay Pipes

On 09/02/2014 07:15 AM, Duncan Thomas wrote:

On 11 August 2014 19:26, Jay Pipes jaypi...@gmail.com wrote:


The above does not really make sense for MySQL Galera/PXC clusters *if only
Galera nodes are used in the cluster*. Since Galera is synchronously
replicated, there's no real point in segregating writers from readers, IMO.
Better to just spread the write AND read load equally among all Galera
cluster nodes.


Unfortunately it is possible to get bitten by the difference between
'synchronous' and 'virtually synchronous' in practice.


Not in my experience. The thing that has bitten me in practice are 
Galera's lack of support for SELECT FOR UPDATE, which is used 
extensively in some of the OpenStack projects. Instead of taking a 
write-intent lock on one or more record gaps (which is what InnoDB does 
in the case of a SELECT FOR UPDATE on a local node), Galera happily 
replicates DML statements to all other nodes in the cluster. If two of 
those nodes attempt to modify the same row or rows in a table, then the 
write-set replication will fail to certify, which results in a 
certification failure that is then surfaced to the caller as an InnoDB deadlock error.


It's the difference between hanging around waiting on a local node for 
the transaction that called SELECT FOR UPDATE to complete and release 
the write-intent locks on a set of table rows versus hanging around 
waiting for the InnoDB deadlock/lock timeout to bubble up from the 
write-set replication certification (which typically is longer than 
the time taken to lock the rows in a single transaction, and therefore 
causes thundering herd issues with the conductor attempting to retry 
stuff due to the use of the @retry_on_deadlock decorator which is so 
commonly used everywhere).


FWIW, I've cc'd a real expert on the matter. Peter, feel free to 
clarify, contradict, or just ignore me :)


Best,
-jay
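
[Editorial note: to make the contrast concrete, here is a small sketch --
illustrative only, not Nova or oslo.db code; the table and column names are
made up -- of the locking pattern that trips over Galera versus a
compare-and-swap style UPDATE that sidesteps it:]

```python
# Illustrative sketch (not Nova/oslo.db code): why SELECT ... FOR UPDATE
# misbehaves on Galera, and a compare-and-swap UPDATE that avoids it.
import sqlalchemy as sa

metadata = sa.MetaData()
instances = sa.Table(
    'instances', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('vm_state', sa.String(32)),
)


def stop_instance_with_lock(engine, instance_id):
    """Works as intended on a single InnoDB node: the row lock blocks other
    local writers. On Galera the lock is local-only, so a conflicting write
    committed on another node shows up later as a certification failure,
    reported to us as a deadlock error."""
    with engine.begin() as conn:
        row = conn.execute(
            sa.select(instances)
            .where(instances.c.id == instance_id)
            .with_for_update()).first()
        if row is not None and row.vm_state == 'active':
            conn.execute(instances.update()
                         .where(instances.c.id == instance_id)
                         .values(vm_state='stopped'))


def stop_instance_cas(engine, instance_id):
    """Galera-friendlier: fold the state check into the UPDATE's WHERE clause
    and inspect rowcount; the caller retries (or gives up) if another writer
    won the race, instead of relying on row locks."""
    with engine.begin() as conn:
        result = conn.execute(
            instances.update()
            .where(sa.and_(instances.c.id == instance_id,
                           instances.c.vm_state == 'active'))
            .values(vm_state='stopped'))
    return result.rowcount == 1  # False => someone else changed the row first
```

The second form is the sort of statement a cheap application-level retry loop
can wrap, which is the workaround usually suggested for Galera deployments
instead of pessimistic row locking.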

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-04 Thread Jay Pipes

On 09/04/2014 11:32 AM, Vladik Romanovsky wrote:

+1

I very much agree with Dan's proposal.

I am concerned about the difficulties we will face with merging
patches that spread across various areas: manager, conductor, scheduler, 
etc.
However, I think this is a small price to pay for having more focused teams.

IMO, we will still have to pay it the moment the scheduler separates.


There will be more pain the moment the scheduler separates, IMO, 
especially with its current design and interfaces.


-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests

2014-09-04 Thread Jay Pipes

On 09/04/2014 11:32 AM, Steven Hardy wrote:

On Thu, Sep 04, 2014 at 10:45:59AM -0400, Jay Pipes wrote:

On 08/29/2014 05:15 PM, Zane Bitter wrote:

On 29/08/14 14:27, Jay Pipes wrote:

On 08/26/2014 10:14 AM, Zane Bitter wrote:

Steve Baker has started the process of moving Heat tests out of the
Tempest repository and into the Heat repository, and we're looking for
some guidance on how they should be packaged in a consistent way.
Apparently there are a few projects already packaging functional tests
in the package projectname.tests.functional (alongside
projectname.tests.unit for the unit tests).

That strikes me as odd in our context, because while the unit tests run
against the code in the package in which they are embedded, the
functional tests run against some entirely different code - whatever
OpenStack cloud you give it the auth URL and credentials for. So these
tests run from the outside, just like their ancestors in Tempest do.

There's all kinds of potential confusion here for users and packagers.
None of it is fatal and all of it can be worked around, but if we
refrain from doing the thing that makes zero conceptual sense then there
will be no problem to work around :)

I suspect from reading the previous thread about In-tree functional
test vision that we may actually be dealing with three categories of
test here rather than two:

* Unit tests that run against the package they are embedded in
* Functional tests that run against the package they are embedded in
* Integration tests that run against a specified cloud

i.e. the tests we are now trying to add to Heat might be qualitatively
different from the projectname.tests.functional suites that already
exist in a few projects. Perhaps someone from Neutron and/or Swift can
confirm?

I'd like to propose that tests of the third type get their own top-level
package with a name of the form projectname-integrationtests (second
choice: projectname-tempest on the principle that they're essentially
plugins for Tempest). How would people feel about standardising that
across OpenStack?


By its nature, Heat is one of the only projects that would have
integration tests of this nature. For Nova, there are some functional
tests in nova/tests/integrated/ (yeah, badly named, I know) that are
tests of the REST API endpoints and running service daemons (the things
that are RPC endpoints), with a bunch of stuff faked out (like RPC
comms, image services, authentication and the hypervisor layer itself).
So, the integrated tests in Nova are really not testing integration
with other projects, but rather integration of the subsystems and
processes inside Nova.

I'd support a policy that true integration tests -- tests that test the
interaction between multiple real OpenStack service endpoints -- be left
entirely to Tempest. Functional tests that test interaction between
internal daemons and processes to a project should go into
/$project/tests/functional/.

For Heat, I believe tests that rely on faked-out other OpenStack
services but stress the interaction between internal Heat
daemons/processes should be in /heat/tests/functional/ and any tests that
rely on working, real OpenStack service endpoints should be in Tempest.


Well, the problem with that is that last time I checked there was
exactly one Heat scenario test in Tempest because tempest-core doesn't
have the bandwidth to merge all (any?) of the other ones folks submitted.

So we're moving them to openstack/heat for the pure practical reason
that it's the only way to get test coverage at all, rather than concerns
about overloading the gate or theories about the best venue for
cross-project integration testing.


Hmm, speaking of passive aggressivity...

Where can I see a discussion of the Heat integration tests with Tempest QA
folks? If you give me some background on what efforts have been made already
and what is remaining to be reviewed/merged/worked on, then I can try to get
some resources dedicated to helping here.


We received some fairly strong criticism from sdague[1] earlier this year,
at which point we were already actively working on improving test coverage
by writing new tests for tempest.

Since then, several folks, myself included, commited very significant
amounts of additional effort to writing more tests for tempest, with some
success.

Ultimately the review latency and overhead involved in constantly rebasing
changes between infrequent reviews has resulted in slow progress and
significant frustration for those attempting to contribute new test cases.

It's been clear for a while that tempest-core have significant bandwidth
issues, as well as not necessarily always having the specific domain
expertise to thoroughly review some tests related to project-specific
behavior or functionality.

So it was with some relief that we saw the proposal[2] to move the burden
for reviewing project test-cases to the project teams, who will presumably
be more motivated to do the reviews, and have the knowledge

<    1   2   3   4   5   6   7   8   9   10   >