Re: [openstack-dev] Nova quota statistics counting issue
On 4/14/2016 3:07 PM, Andrew Laski wrote:
> On Wed, Apr 13, 2016, at 12:27 PM, Dmitry Stepanenko wrote:
>> Hi Team,
>>
>> I worked on the nova quota statistics issue
>> (https://bugs.launchpad.net/nova/+bug/1284424) that happens when nova-*
>> processes are restarted while instances are being removed, and I was
>> able to reproduce it. For the repro I used devstack and started nova-api
>> and nova-compute in separate screen windows. To kill them I used ctrl+c.
>> I found that the issue occurs if nova-* processes are killed after the
>> instance was deleted but right before the quota commit procedure
>> finishes.
>>
>> We discussed these results with Markus Zoeller and decided that even
>> though killing nova processes is a somewhat exotic event, this should
>> still be fixed, because quota counting affects billing and is very
>> important for us.
>
> +1. This is very important to get right. And while killing Nova
> processes is exotic during normal operation, it could happen during
> upgrades, and that should not cause quota issues.
>
>> So, we need to introduce some mechanism that will prevent us from
>> reaching inconsistent states in terms of quotas. In other words, this
>> mechanism should ensure that the instance create/remove operation and
>> the quota usage recount either both happen or both do not happen.
>
> There's been some discussion around this, and there are other ML
> threads somewhat discussing it in the context of moving quota
> enforcement into a centralized service/library. There are a couple of
> approaches that could be taken for tackling quotas, but a larger issue
> is that we have no good way of knowing whether a given change helps the
> situation. What we need before making any changes is a functional test
> that reproduces the issue. Once that is in place I would love to see the
> removal of the quota_usages table and reservations, and have quota be
> based on actual usage as represented in the instances table. But there
> are a lot of other viewpoints, and I think work in this area is going to
> have to start with small incremental improvements.
>
>> Any ideas how to do that properly?
>>
>> Kind regards,
>> Dmitry

I've tried to start on that functional test here [1], but it needs work.
I have a messier local version too that was (I think) reproducing a
failure, but because it's a weird race condition mess, it's hard to test
and to know when to assert the condition and stop the test.

Maybe I'll just push up the latest WIP of what I have locally, and then
someone else can take it over if they want.

[1] https://review.openstack.org/#/c/293800/

--
Thanks,

Matt Riedemann

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
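For anyone picking this up, the race described in the bug can be modelled outside of Nova. What follows is a minimal sketch and not Nova code: the two-table layout, `delete_instance`, `usage_drift` and the `crash_before_quota_commit` flag are all made up for illustration. It removes an instance row and updates a usage counter in two separate commits, and simulates the process being killed between them.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stands in for the process being killed (e.g. with ctrl+c)."""


def make_db():
    # Hypothetical layout; Nova's real schema is much richer than this.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, project TEXT)")
    db.execute(
        "CREATE TABLE quota_usages (project TEXT PRIMARY KEY, in_use INTEGER)"
    )
    db.executemany(
        "INSERT INTO instances (project) VALUES (?)", [("demo",)] * 3
    )
    db.execute("INSERT INTO quota_usages VALUES ('demo', 3)")
    db.commit()
    return db


def delete_instance(db, instance_id, project, crash_before_quota_commit=False):
    # Step 1: the instance row is removed and committed on its own.
    db.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
    db.commit()
    if crash_before_quota_commit:
        raise SimulatedCrash()
    # Step 2: the usage counter is updated in a separate commit.
    db.execute(
        "UPDATE quota_usages SET in_use = in_use - 1 WHERE project = ?",
        (project,),
    )
    db.commit()


def usage_drift(db, project):
    """Recorded usage minus the number of instance rows actually present."""
    (actual,) = db.execute(
        "SELECT COUNT(*) FROM instances WHERE project = ?", (project,)
    ).fetchone()
    (recorded,) = db.execute(
        "SELECT in_use FROM quota_usages WHERE project = ?", (project,)
    ).fetchone()
    return recorded - actual
```

A test built on this would assert that `usage_drift` is zero after a clean delete and non-zero after a delete interrupted between the two commits. The hard part mentioned above, knowing when to assert in a real multi-process setup, is not addressed here.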
Re: [openstack-dev] Nova quota statistics counting issue
On Wed, Apr 13, 2016, at 12:27 PM, Dmitry Stepanenko wrote:
> Hi Team,
>
> I worked on the nova quota statistics issue
> (https://bugs.launchpad.net/nova/+bug/1284424) that happens when nova-*
> processes are restarted while instances are being removed, and I was
> able to reproduce it. For the repro I used devstack and started nova-api
> and nova-compute in separate screen windows. To kill them I used ctrl+c.
> I found that the issue occurs if nova-* processes are killed after the
> instance was deleted but right before the quota commit procedure
> finishes.
>
> We discussed these results with Markus Zoeller and decided that even
> though killing nova processes is a somewhat exotic event, this should
> still be fixed, because quota counting affects billing and is very
> important for us.

+1. This is very important to get right. And while killing Nova
processes is exotic during normal operation, it could happen during
upgrades, and that should not cause quota issues.

> So, we need to introduce some mechanism that will prevent us from
> reaching inconsistent states in terms of quotas. In other words, this
> mechanism should ensure that the instance create/remove operation and
> the quota usage recount either both happen or both do not happen.

There's been some discussion around this, and there are other ML threads
somewhat discussing it in the context of moving quota enforcement into a
centralized service/library. There are a couple of approaches that could
be taken for tackling quotas, but a larger issue is that we have no good
way of knowing whether a given change helps the situation. What we need
before making any changes is a functional test that reproduces the
issue. Once that is in place I would love to see the removal of the
quota_usages table and reservations, and have quota be based on actual
usage as represented in the instances table. But there are a lot of
other viewpoints, and I think work in this area is going to have to
start with small incremental improvements.

> Any ideas how to do that properly?
>
> Kind regards,
> Dmitry
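To make the "quota based on actual usage in the instances table" idea concrete, here is a rough sketch under stated assumptions. It is not how Nova works today: the single-table schema and the helpers `count_usage`, `create_instance` and `delete_instance` are hypothetical. Usage is never stored separately. It is counted from the rows, and the quota check runs in the same transaction as the insert, so there is no second counter that can be left stale by a killed process.

```python
import sqlite3


class OverQuota(Exception):
    pass


def make_db():
    # Hypothetical single-table layout; there is no separate usage counter.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, project TEXT)")
    return db


def count_usage(db, project):
    """Usage is simply the number of instance rows for the project."""
    row = db.execute(
        "SELECT COUNT(*) FROM instances WHERE project = ?", (project,)
    ).fetchone()
    return row[0]


def create_instance(db, project, limit):
    # "with db" commits on success and rolls back if an exception escapes,
    # so the insert and the quota check succeed or fail together.
    with db:
        db.execute("INSERT INTO instances (project) VALUES (?)", (project,))
        if count_usage(db, project) > limit:
            raise OverQuota(project)


def delete_instance(db, instance_id):
    # Nothing else to update: usage drops as soon as the row is gone.
    with db:
        db.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
```

The sketch ignores concurrent writers across multiple API workers and any cost of counting rows on every request, both of which a real design would have to weigh.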
Re: [openstack-dev] Nova quota statistics counting issue
For what it's worth, neutron employs "resource trackers", which
conceptually do something similar to nova quota usage statistics.

Before starting any transaction that can potentially change usage for a
given resource, the quota enforcement mechanism checks for a "dirty"
marker on the resource tracker. If that marker is present, usage data
for that resource are recalculated from the DB table for the resource.
If not, the current usage is employed for quota enforcement and the
"dirty" flag is set.

This means that if the process dies in the middle of a transaction, the
next transaction will rebuild the correct usage count from the DB.

Salvatore

On 14 April 2016 at 14:08, Timofei Durakov wrote:
> Hi,
>
> I think it would be ok to store persistently quota details on compute
> side, as was discussed during mitaka mid-cycle[1] for migrations[2]. So if
> compute service fails we could restore state and update quota after compute
> restart.
>
> Timofey
>
> [1] - https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking
> [2] - https://review.openstack.org/#/c/291161/5/nova/compute/background.py
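The "dirty" marker scheme described above can be sketched roughly as follows. This is an illustration of the idea only and not neutron's actual code: the `usage_tracker` table and the functions `usage_for_enforcement`, `finish` and `create_instance` are invented names, and clearing the flag once a transaction completes is an assumption of this sketch rather than something stated in the message.

```python
import sqlite3


def make_db():
    # Hypothetical tables: the tracked resource and its cached usage.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, project TEXT)")
    db.execute(
        "CREATE TABLE usage_tracker "
        "(project TEXT PRIMARY KEY, in_use INTEGER, dirty INTEGER)"
    )
    db.execute("INSERT INTO usage_tracker VALUES ('demo', 0, 0)")
    db.commit()
    return db


def usage_for_enforcement(db, project):
    """Return the usage to enforce against and leave the tracker dirty."""
    dirty, in_use = db.execute(
        "SELECT dirty, in_use FROM usage_tracker WHERE project = ?", (project,)
    ).fetchone()
    if dirty:
        # An earlier transaction never finished cleanly: recount from the table.
        (in_use,) = db.execute(
            "SELECT COUNT(*) FROM instances WHERE project = ?", (project,)
        ).fetchone()
    with db:
        db.execute(
            "UPDATE usage_tracker SET in_use = ?, dirty = 1 WHERE project = ?",
            (in_use, project),
        )
    return in_use


def finish(db, project, delta):
    # Assumption of this sketch: a completed transaction applies its delta
    # to the cached usage and clears the marker.
    with db:
        db.execute(
            "UPDATE usage_tracker SET in_use = in_use + ?, dirty = 0 "
            "WHERE project = ?",
            (delta, project),
        )


def create_instance(db, project, limit, die_before_finish=False):
    if usage_for_enforcement(db, project) + 1 > limit:
        finish(db, project, 0)
        raise RuntimeError("over quota")
    with db:
        db.execute("INSERT INTO instances (project) VALUES (?)", (project,))
    if die_before_finish:
        return  # simulated process death: the dirty marker stays set
    finish(db, project, 1)
```

If `create_instance` "dies" after the insert, the cached `in_use` is stale but the marker is still set, so the next call to `usage_for_enforcement` recounts from the `instances` table instead of trusting the cache.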
Re: [openstack-dev] Nova quota statistics counting issue
Hi,

I think it would be OK to persistently store quota details on the
compute side, as was discussed during the mitaka mid-cycle [1] for
migrations [2]. Then, if the compute service fails, we could restore the
state and update the quota after the compute service restarts.

Timofey

[1] - https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking
[2] - https://review.openstack.org/#/c/291161/5/nova/compute/background.py
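One possible reading of "store quota details persistently and update quota after restart" is a small on-disk journal of the pending quota change. The sketch below is only that, a guess at the shape of such a mechanism: `PendingQuotaJournal`, `delete_instance`, `recover_on_restart` and the JSON file format are all hypothetical and are not taken from the linked review.

```python
import json
import os
import tempfile


class PendingQuotaJournal:
    """Keeps at most one pending quota change on disk (hypothetical format)."""

    def __init__(self, path):
        self.path = path

    def record(self, entry):
        # Write to a temp file and rename it into place, so a crash
        # never leaves a half-written journal behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(entry, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def pending(self):
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return None

    def clear(self):
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass


def delete_instance(journal, usage, project, die_before_commit=False):
    journal.record({"project": project, "delta": -1})
    # ... the instance itself would be destroyed here ...
    if die_before_commit:
        return  # simulated process death before the quota commit
    usage[project] -= 1
    journal.clear()


def recover_on_restart(journal, usage):
    # On startup, apply whatever quota change was left unfinished.
    entry = journal.pending()
    if entry is not None:
        usage[entry["project"]] += entry["delta"]
        journal.clear()
```

In this sketch `usage` is a plain dict standing in for the real quota store. A real version would also have to confirm that the instance is actually gone before replaying the entry, and would have to handle a crash between the usage update and `journal.clear()`, which this sketch would apply twice.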