Re: [openstack-dev] [nova] Shared storage support

2015-03-11 Thread Chris Friesen

On 03/02/2015 06:24 PM, Jay Pipes wrote:

On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:

On Wed, Feb 25, 2015 at 02:08:32PM +, Gary Kotton wrote:

I understand that this is a high or critical bug, but I think that
we need to discuss it more and try to have a more robust model.


What I'm not seeing from the bug description is just what part of
the scheduler needs the ability to have total summed disk across
every host in the cloud.


The scheduler does not need to know this information at all.


Actually, there's a valid reason for the scheduler to know this.

If I want to schedule 5 instances, each with a 10 GB disk, and I have 5 compute 
nodes each reporting 30 GB of free space, it really does matter to the scheduler 
whether those compute nodes are all on a single shared storage device or whether 
they each have separate storage.
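
To make the arithmetic concrete, here is a toy sketch using just the numbers 
above (nothing Nova-specific is assumed):

    # Toy numbers from the example above.
    nodes = 5
    reported_free_gb = 30          # what every compute node reports
    requested_gb = 5 * 10          # five instances, 10 GB each

    apparent_free = nodes * reported_free_gb   # 150 GB if storage is separate
    actual_free_if_shared = reported_free_gb   # 30 GB: one backend counted 5 times

    print(requested_gb <= apparent_free)           # True: scheduler says yes
    print(requested_gb <= actual_free_if_shared)   # False: only 3 instances fit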


The scheduler is always operating off stale data (as reported by the compute 
nodes some time ago) plus its own predictions.  If its predictions don't match 
the actual behaviour then its decisions are going to be wrong.


Chris


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Shared storage support

2015-03-02 Thread Rochelle Grober


-----Original Message-----
From: Jay Pipes
Sent: Monday, March 02, 2015 16:24

On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:
 On Wed, Feb 25, 2015 at 02:08:32PM +, Gary Kotton wrote:
 I understand that this is a high or critical bug, but I think that
 we need to discuss it more and try to have a more robust model.

 What I'm not seeing from the bug description is just what part of
 the scheduler needs the ability to have total summed disk across
 every host in the cloud.

The scheduler does not need to know this information at all. One might 
say that a cloud administrator would want to know the total free disk 
space available in their cloud -- or at least get notified once the 
total free space falls below some threshold. IMO, there are better ways 
of accomplishing such a capacity-management task: use an NRPE/monitoring 
check that simply does a `df` or similar command every so often against 
the actual filesystem backend.

IMHO, this isn't something that needs to be fronted by a 
management/admin-only REST API that needs to iterate over a potentially 
huge number of compute nodes just to enable some pretty graphical 
front-end that shows some pie chart of available disk space.
 
[Rockyg] ++  The scheduler doesn't need to know anything about the individual 
compute nodes attached to *the same* shared storage to do placement.  The scheduler 
can't increase or decrease the physical amount of storage available to the set 
of nodes.  The hardware monitor for the shared storage provides the total amount 
of disk on the system, the amount already used, and the amount still unused.  
Wherever the scheduler starts a new VM in this node set, it will see the same 
amount of available disk.

 What is the actual bad functional behaviour that results from this
 bug that means it is a high priority issue to fix ?

The higher priority thing would be to remove the wonky os-hypervisors 
REST API extension and its related cruft. This API extension is fatally 
flawed in a number of ways, including assumptions about things such as 
underlying providers of disk/volume resources and misleading 
relationships between the servicegroup API and the compute nodes table.

[Rockyg] IMO the most important piece of information from the OpenStack software 
for an operator with a set of nodes sharing a storage backend is: what is the 
current total commitment (more likely over-commitment) of the storage capacity 
across the set of attached nodes.  That yields a simple go/no-go for starting 
another VM on the set, or a warning/error that the storage is over-committed 
and more needs to be provisioned.
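
A rough sketch of that go/no-go check (the function name, the single-backend 
assumption, and the overcommit ratio are only illustrative):

    def can_place(requested_gb, allocated_gb, backend_capacity_gb,
                  allowed_overcommit=1.0):
        # allocated_gb: disk already promised to instances across *all*
        # compute nodes attached to the shared backend, counted once.
        committed = sum(allocated_gb) + requested_gb
        return committed <= backend_capacity_gb * allowed_overcommit

    # Three 10 GB instances already placed on a 30 GB shared backend:
    print(can_place(10, [10, 10, 10], 30))        # False -> warn, add storage
    print(can_place(10, [10, 10, 10], 30, 1.5))   # True with 1.5x overcommit allowed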

--Rocky



Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Shared storage support

2015-03-02 Thread Jay Pipes

On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:

On Wed, Feb 25, 2015 at 02:08:32PM +, Gary Kotton wrote:

I understand that this is a high or critical bug, but I think that
we need to discuss it more and try to have a more robust model.


What I'm not seeing from the bug description is just what part of
the scheduler needs the ability to have total summed disk across
every host in the cloud.


The scheduler does not need to know this information at all. One might 
say that a cloud administrator would want to know the total free disk 
space available in their cloud -- or at least get notified once the 
total free space falls below some threshold. IMO, there are better ways 
of accomplishing such a capacity-management task: use an NRPE/monitoring 
check that simply does a `df` or similar command every so often against 
the actual filesystem backend.
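
Something along these lines, run from cron or as a Nagios/NRPE plugin, would 
cover it (the mount point and the thresholds are just placeholders):

    #!/usr/bin/env python
    # Minimal check: alert when the shared-storage mount runs low on space.
    import os
    import sys

    MOUNT = "/var/lib/nova/instances"   # wherever the shared backend is mounted
    WARN_FREE_GB = 200
    CRIT_FREE_GB = 50

    st = os.statvfs(MOUNT)
    free_gb = st.f_bavail * st.f_frsize / (1024.0 ** 3)

    if free_gb < CRIT_FREE_GB:
        print("CRITICAL: %.1f GB free on %s" % (free_gb, MOUNT))
        sys.exit(2)
    elif free_gb < WARN_FREE_GB:
        print("WARNING: %.1f GB free on %s" % (free_gb, MOUNT))
        sys.exit(1)
    print("OK: %.1f GB free on %s" % (free_gb, MOUNT))
    sys.exit(0)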


IMHO, this isn't something that needs to be fronted by a 
management/admin-only REST API that needs to iterate over a potentially 
huge number of compute nodes just to enable some pretty graphical 
front-end that shows some pie chart of available disk space.



What is the actual bad functional behaviour that results from this
bug that means it is a high priority issue to fix ?


The higher priority thing would be to remove the wonky os-hypervisors 
REST API extension and its related cruft. This API extension is fatally 
flawed in a number of ways, including assumptions about things such as 
underlying providers of disk/volume resources and misleading 
relationships between the servicegroup API and the compute nodes table.


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Shared storage support

2015-02-25 Thread Alex Xu

Actually I have a similar idea, and I plan to work on it in L via a nova-spec (is
it worth a spec?).

But this idea did not come from this bug; it comes from other cases:

1. Currently we need to specify 'on_shared_storage' and 'block_migration'
when calling evacuate and live_migration. Once we track shared storage, the
user no longer needs to specify those parameters. The scheduler can also give
priority to hosts that share storage with the previous host (see the sketch
after this list).

2. Currently nova-compute won't release resources for a stopped instance, and
won't reschedule when the stopped instance is started again. Implementing that
requires checking whether the instance is on shared storage, which makes the
code very complex. Once the scheduler tracks shared storage, we can implement
this cleanly. There would be an option controlling whether to reschedule a
stopped instance when it is not on shared storage, because a block migration
would be wasteful.

3. Other intelligent scheduling.
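
For case 1, the scheduler preference could be a simple weigher that compares
the storage IDs described below (a standalone sketch, not Nova's actual
weigher API):

    def shared_storage_weight(candidate_host, source_host, storage_id_of):
        # storage_id_of: host name -> storage ID (the proposed compute_node
        # column); None means the host uses local, non-shared storage.
        src = storage_id_of.get(source_host)
        dst = storage_id_of.get(candidate_host)
        if src is not None and src == dst:
            return 1.0   # same shared backend: no disk copy needed
        return 0.0

    ids = {"node1": "nfs-0a1b2c3d4e5f",
           "node2": "nfs-0a1b2c3d4e5f",
           "node3": None}
    print(shared_storage_weight("node2", "node1", ids))   # 1.0
    print(shared_storage_weight("node3", "node1", ids))   # 0.0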


The basic idea is to add a new column to the compute_node table that stores an
ID identifying the storage. If two compute nodes have the same storage ID, they
are on the same shared storage. The ID would be generated differently for each
type of storage, such as NFS, ceph, lvm...
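
A rough sketch of how such IDs might be generated (the per-backend schemes and
names here are only illustrations, not a settled design):

    import hashlib
    import socket
    import subprocess

    def storage_id_for_nfs(export_host, export_path):
        # Compute nodes mounting the same export compute the same ID.
        key = "%s:%s" % (export_host, export_path)
        return "nfs-" + hashlib.sha1(key.encode()).hexdigest()[:12]

    def storage_id_for_ceph(conf="/etc/ceph/ceph.conf"):
        # The cluster fsid is already a stable, cluster-wide identifier.
        fsid = subprocess.check_output(["ceph", "--conf", conf, "fsid"])
        return "ceph-" + fsid.strip().decode()

    def storage_id_for_lvm(vg_name):
        # Local LVM is not shared, so include the hostname to keep IDs distinct.
        return "lvm-%s-%s" % (socket.gethostname(), vg_name)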

Thanks
Alex
2015-02-25 22:08 GMT+08:00 Gary Kotton gkot...@vmware.com:

  Hi,
 There is an issue with the statistics reported when a nova compute driver
 has shared storage attached; that is, there may be more than one compute
 node reporting on the same shared storage. A patch has been posted -
 https://review.openstack.org/#/c/155184. The direction there was to add an
 extra parameter to the dictionary that the driver returns for resource
 utilization. The DB statistics calculation would take this into account and
 adjust its calculations accordingly.
 I am not really in favor of the approach for a number of reasons:

    1. Over the last few cycles we have been moving toward better-defined
    data structures and models; more specifically, we have been moving to
    object support.
    2. A change in the DB layer may break this support.
    3. We are trying to version the various blobs of data that are passed
    around.

 My thinking is that the resource tracker should be aware that the compute
 node has shared storage, and that the changes should be done there. I do not
 think that the compute node should rely on changes being done in the DB
 layer – that may be on a different host and even run a different version.

  I understand that this is a high or critical bug, but I think that we
 need to discuss it more and try to have a more robust model.
 Thanks
 Gary

 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev