Re: [openstack-dev] [nova] Shared storage support
On 03/02/2015 06:24 PM, Jay Pipes wrote:
> On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:
>> On Wed, Feb 25, 2015 at 02:08:32PM +0000, Gary Kotton wrote:
>>> I understand that this is a high or critical bug, but I think that we
>>> need to discuss it more and try to have a more robust model.
>>
>> What I'm not seeing from the bug description is just what part of the
>> scheduler needs the ability to have total summed disk across every
>> host in the cloud.
>
> The scheduler does not need to know this information at all.

Actually, there's a valid reason for the scheduler to know this. If I want to schedule 5 instances, each with 10 GB of disk, and I have 5 compute nodes each reporting 30 GB of space, it really does matter to the scheduler whether those compute nodes are all on a single shared storage device or whether they've each got separate storage.

The scheduler is always operating off stale data (as reported by the compute nodes some time ago) plus its own predictions. If its predictions don't match the actual behaviour, then its decisions are going to be wrong.

Chris
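To put numbers on Chris's example: summing per-node reports counts a shared backend once per node that mounts it. A minimal sketch, using only his illustrative figures ("storage_id" here is a hypothetical per-backend key, not an existing Nova field):

```python
# Five compute nodes each report 30 GB free. If they all mount the same
# shared backend, naively summing those reports overstates capacity 5x.
nodes = [{"host": "node%d" % i, "free_gb": 30, "storage_id": "nfs-backend-1"}
         for i in range(5)]

naive_total = sum(n["free_gb"] for n in nodes)  # 150 GB -- wrong
# Counting each distinct backend once gives the real figure.
by_backend = {n["storage_id"]: n["free_gb"] for n in nodes}
actual_total = sum(by_backend.values())         # 30 GB

requested = 5 * 10  # five instances, 10 GB disk each
print("naive: %d GB, actual: %d GB, requested: %d GB"
      % (naive_total, actual_total, requested))
# The naive view says the request fits easily; in reality it would
# over-commit the shared device by 20 GB.
```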
Re: [openstack-dev] [nova] Shared storage support
-----Original Message-----
From: Jay Pipes
Sent: Monday, March 02, 2015 16:24

> On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:
>> On Wed, Feb 25, 2015 at 02:08:32PM +0000, Gary Kotton wrote:
>>> I understand that this is a high or critical bug, but I think that we
>>> need to discuss it more and try to have a more robust model.
>>
>> What I'm not seeing from the bug description is just what part of the
>> scheduler needs the ability to have total summed disk across every
>> host in the cloud.
>
> The scheduler does not need to know this information at all.
>
> One might say that a cloud administrator would want to know the total
> free disk space available in their cloud -- or at least get notified
> once the total free space falls below some threshold. IMO, there are
> better ways of accomplishing such a capacity-management task: use an
> NRPE/monitoring check that simply does a `df` or similar command every
> so often against the actual filesystem backend. IMHO, this isn't
> something that needs to be fronted by a management/admin-only REST API
> that needs to iterate over a potentially huge number of compute nodes
> just to enable some pretty graphical front-end that shows some pie
> chart of available disk space.

[Rockyg] ++ The scheduler doesn't need to know anything about the individual compute nodes attached to *the same* shared storage to do placement. The scheduler can't increase or decrease the physical amount of storage available to the set of nodes. The hardware monitor for the shared storage provides the total amount of disk on the system, the amount already used, and the amount still unused. Wherever the scheduler starts a new VM in this node set, the same amount of disk will (or won't) be available.

>> What is the actual bad functional behaviour that results from this bug
>> that means it is a high priority issue to fix?
>
> The higher priority thing would be to remove the wonky os-hypervisors
> REST API extension and its related cruft. This API extension is fatally
> flawed in a number of ways, including assumptions about things such as
> underlying providers of disk/volume resources and misleading
> relationships between the servicegroup API and the compute nodes table.

[Rockyg] IMO the most important piece of information OpenStack software can give an operator with a set of nodes sharing a storage backend is: what is the current total commitment (more likely, over-commitment) of the storage capacity on the attached node set? That yields a simple go/no-go for starting another VM on the set, or a warning/error that the storage is over-committed and more must be acquired.

--Rocky

> Best,
>
> -jay
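A minimal sketch of the go/no-go decision Rocky describes, assuming the monitor can read the backend's real capacity from the storage system itself and the disk already committed to VMs on the node set (the function and its parameters are hypothetical, not Nova code):

```python
def can_place_vm(backend_capacity_gb, committed_gb, requested_gb,
                 overcommit_ratio=1.0):
    """Go/no-go decision for starting another VM on a set of compute
    nodes that all share one storage backend.

    backend_capacity_gb: real size of the shared backend, taken from the
        storage system's own monitor (not by summing per-node reports).
    committed_gb: disk already promised to existing VMs on the node set.
    requested_gb: disk the new VM asks for.
    overcommit_ratio: optional allowance for thin provisioning.
    """
    limit = backend_capacity_gb * overcommit_ratio
    if committed_gb + requested_gb <= limit:
        return True, "ok"
    return False, ("storage over-committed: %.1f GB committed + %.1f GB "
                   "requested exceeds the %.1f GB limit"
                   % (committed_gb, requested_gb, limit))


# Example: a 30 GB shared backend with 25 GB already committed cannot
# take another 10 GB VM without over-committing.
ok, reason = can_place_vm(backend_capacity_gb=30, committed_gb=25,
                          requested_gb=10)
print(ok, reason)
```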
Re: [openstack-dev] [nova] Shared storage support
On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:
> On Wed, Feb 25, 2015 at 02:08:32PM +0000, Gary Kotton wrote:
>> I understand that this is a high or critical bug, but I think that we
>> need to discuss it more and try to have a more robust model.
>
> What I'm not seeing from the bug description is just what part of the
> scheduler needs the ability to have total summed disk across every
> host in the cloud.

The scheduler does not need to know this information at all.

One might say that a cloud administrator would want to know the total free disk space available in their cloud -- or at least get notified once the total free space falls below some threshold. IMO, there are better ways of accomplishing such a capacity-management task: use an NRPE/monitoring check that simply does a `df` or similar command every so often against the actual filesystem backend. IMHO, this isn't something that needs to be fronted by a management/admin-only REST API that needs to iterate over a potentially huge number of compute nodes just to enable some pretty graphical front-end that shows some pie chart of available disk space.

> What is the actual bad functional behaviour that results from this bug
> that means it is a high priority issue to fix?

The higher priority thing would be to remove the wonky os-hypervisors REST API extension and its related cruft. This API extension is fatally flawed in a number of ways, including assumptions about things such as underlying providers of disk/volume resources and misleading relationships between the servicegroup API and the compute nodes table.

Best,
-jay
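For illustration, a minimal sketch of the kind of NRPE/monitoring check Jay suggests, assuming the shared backend is mounted at /var/lib/nova/instances (a common convention, but an assumption here); it follows the Nagios plugin exit-code convention rather than any real Nova tooling:

```python
#!/usr/bin/env python
# NRPE-style check: warn when free space on the shared storage mount
# drops below a threshold. Exit codes follow the Nagios plugin
# convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
import os
import sys

MOUNT = "/var/lib/nova/instances"   # assumed mount of the shared backend
WARN_PCT, CRIT_PCT = 20, 10         # free-space thresholds, in percent

st = os.statvfs(MOUNT)
total = st.f_blocks * st.f_frsize
free = st.f_bavail * st.f_frsize    # space available to non-root users
free_pct = 100.0 * free / total

msg = "%.1f%% free (%.1f GB of %.1f GB) on %s" % (
    free_pct, free / 1e9, total / 1e9, MOUNT)

if free_pct < CRIT_PCT:
    print("CRITICAL - " + msg)
    sys.exit(2)
elif free_pct < WARN_PCT:
    print("WARNING - " + msg)
    sys.exit(1)
print("OK - " + msg)
sys.exit(0)
```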
Re: [openstack-dev] [nova] Shared storage support
Actually, I have a similar idea and plan to work on it in the L (Liberty) cycle via a nova-spec (is it worth a spec?). This idea does not come from this bug, though; it comes from other cases:

1. Currently the user must specify 'on_shared_storage' and 'block_migration' for evacuate and live_migration. Once we track shared storage, the user no longer needs to specify those parameters. The scheduler can also give priority to hosts that share storage with the previous host.

2. Currently nova-compute won't release resources for a stopped instance, and won't reschedule when the stopped instance is started again. Implementing that requires checking whether the instance is on shared storage, which makes the code very complex. Once the scheduler tracks shared storage, we can implement this cleanly. There could be an option controlling whether to reschedule a stopped instance when it isn't on shared storage, since block migration is wasteful.

3. Other intelligent scheduling.

The basic idea is to add a new column to the compute_node table that stores an ID identifying the storage. If two compute nodes have the same storage ID, the two nodes are on the same shared storage. The ID would be generated differently for each storage type, e.g. NFS, Ceph, LVM; a sketch of one possible scheme follows below.

Thanks
Alex

2015-02-25 22:08 GMT+08:00 Gary Kotton gkot...@vmware.com:

> Hi,
> There is an issue with the statistics reported when a nova compute
> driver has shared storage attached. That is, there may be more than one
> compute node reporting on the shared storage. A patch has been posted -
> https://review.openstack.org/#/c/155184. The direction here was to add
> an extra parameter to the dictionary that the driver returns for the
> resource utilization. The DB statistics calculation would take this
> into account and then do the calculations accordingly. I am not really
> in favor of the approach, for a number of reasons:
>
> 1. Over the last few cycles we have been moving toward better-defined
> data structures and models; more specifically, we have been moving to
> object support.
> 2. A change in the DB layer may break this support.
> 3. We are trying to version the various blobs of data that are passed
> around.
>
> My thinking is that the resource tracker should be aware that the
> compute node has shared storage, and that the changes should be done
> there. I do not think that the compute node should rely on changes
> being done in the DB layer -- that may be on a different host and may
> even run a different version. I understand that this is a high or
> critical bug, but I think that we need to discuss it more and try to
> have a more robust model.
> Thanks
> Gary
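A minimal sketch of how such a storage ID might be derived for the backend types Alex mentions; the per-type inputs are illustrative assumptions, not from any spec:

```python
import hashlib


def storage_id(backend_type, **info):
    """Derive a stable identifier for a storage backend so that two
    compute nodes reporting the same ID are known to share storage.
    The per-type inputs below are illustrative assumptions."""
    if backend_type == "nfs":
        # Same NFS server + export path => same backend.
        key = "nfs:%s:%s" % (info["server"], info["export"])
    elif backend_type == "ceph":
        # Ceph exposes a cluster fsid; same fsid + pool => same backend.
        key = "ceph:%s:%s" % (info["fsid"], info["pool"])
    elif backend_type == "lvm":
        # LVM volume groups are host-local: include the hostname so the
        # ID is never shared across nodes.
        key = "lvm:%s:%s" % (info["hostname"], info["vg_uuid"])
    else:
        raise ValueError("unknown backend type: %s" % backend_type)
    return hashlib.sha1(key.encode()).hexdigest()


# Two nodes mounting the same NFS export get the same ID ...
a = storage_id("nfs", server="10.0.0.5", export="/srv/nova")
b = storage_id("nfs", server="10.0.0.5", export="/srv/nova")
assert a == b
# ... while a local LVM volume group never matches another host's.
c = storage_id("lvm", hostname="node1", vg_uuid="Abc123")
d = storage_id("lvm", hostname="node2", vg_uuid="Abc123")
assert c != d
```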