Hi Sergey,

AFAIU, the problem is that when Nova was initially designed, it had no notion of shared storage (e.g. Ceph), so all resources were considered local to compute nodes. In that model each total value is simply the sum of the per-node values. As we see now, that doesn't work well with Ceph, where the storage is actually shared and doesn't belong to any particular node.
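To make the double-counting concrete, here is a toy illustration (hypothetical numbers, not taken from the bug report): with three compute nodes all backed by the same 100 GB Ceph pool, summing the per-node figures reports 300 GB of "total" storage.

    # Toy illustration of the double-counting (hypothetical numbers).
    # Each compute node reports the size of the *shared* Ceph pool as if
    # it were local disk, so the naive per-node sum triple-counts it.
    ceph_pool_gb = 100  # one shared RBD pool backing all ephemeral disks

    nodes = {
        "compute-1": ceph_pool_gb,
        "compute-2": ceph_pool_gb,
        "compute-3": ceph_pool_gb,
    }

    naive_total = sum(nodes.values())  # what gets reported today: 300 GB
    actual_total = ceph_pool_gb        # what the cluster really has: 100 GB
    print("reported: %d GB, actual: %d GB" % (naive_total, actual_total))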
It seems we've got two different, but related problems here:

1) resource tracking is incorrect, as nodes shouldn't report info about storage when shared storage is used (fixing this by reporting e.g. 0 values would require changes to nova-scheduler)

2) total storage is calculated incorrectly, as we just sum the values reported by each node

From my point of view, in order to fix both, it might make sense for nova-api/nova-scheduler to actually know whether shared storage is used and to access Ceph directly (otherwise it's not clear which compute node we should ask for this data, or what exactly we should ask for, as we don't actually know whether the storage is shared in the context of the nova-api/nova-scheduler processes).

Thanks,
Roman

On Mon, Nov 24, 2014 at 3:45 PM, Sergey Nikitin <[email protected]> wrote:
> Hi,
> As you know, we can use Ceph as ephemeral storage in Nova, but we have some
> problems with its integration. First of all, the total storage of compute
> nodes is calculated incorrectly (more details here:
> https://bugs.launchpad.net/nova/+bug/1387812). I want to fix this problem.
> Currently the total storage is just the sum of the storage of all compute
> nodes, and the information about total storage is taken directly from the DB
> (https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L663-L691).
> To fix the problem we should check the type of storage in use: if the
> storage type is RBD, we should get the total storage information directly
> from the Ceph cluster.
> I proposed a patch (https://review.openstack.org/#/c/132084/) which should
> fix this problem, but I got the fair comment that we shouldn't check the
> type of storage at the API layer.
>
> The other problem is that the reported size of each compute node is
> incorrect too: the size of each node is currently equal to the size of the
> whole Ceph cluster.
>
> On one hand it is good not to check the type of storage at the API layer;
> on the other hand there are some reasons to check it there:
> 1. It would be useful for live migration, because now a user has to send
> information about storage with the API request.
> 2. It helps to fix the problem with total storage.
> 3. It helps to fix the problem with the size of compute nodes.
>
> So I want to ask you: is it a good idea to get information about the type
> of storage at the API layer? If not, are there any ideas on how to get
> correct information about the Ceph storage?
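For what it's worth, the "access Ceph directly" idea discussed above could look roughly like the sketch below. This is only an illustration using the python-rados bindings, not the code from the proposed patch; the conffile path is an assumption (in Nova it would normally come from the [libvirt]/images_rbd_ceph_conf option).

    # Rough sketch: query Ceph for cluster-wide capacity instead of
    # summing per-node values. Uses the python-rados bindings; the
    # default conffile path is an assumption for illustration only.
    import rados

    def get_ceph_capacity_gb(conffile='/etc/ceph/ceph.conf'):
        cluster = rados.Rados(conffile=conffile)
        cluster.connect()
        try:
            stats = cluster.get_cluster_stats()  # values are in kB
            total_gb = stats['kb'] // (1024 * 1024)
            used_gb = stats['kb_used'] // (1024 * 1024)
            return total_gb, used_gb
        finally:
            cluster.shutdown()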
