Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
Punishing benign users as a defense against (potentially) malicious users sounds like a bad strategy. This should not be a zero-sum game. On 10/28/2013 02:49 PM, Joshua Harlow wrote: Sure, a convergence model is great and likely how it has to be done. It's just a question of what that convergence model is :) I agree that it's bad customer service to say 'yes, you tried to delete it but I am charging you anyway', but I think the difference is that the user actually still has access to those resources while their deletion has not completed (due to, say, a network partition). So this makes it a nice feature for malicious users to take advantage of: freeing their quota while still having access to the resources that previously existed under that quota. I'd sure like that if I were a malicious user (free stuff!). Quotas are, as you said, 'records of intention', but they also permit/deny access to further resources, and it's the further resources that are the problem, not the record of intention (which at its simplest is just a write-ahead log). What is stopping that write-ahead log from being used in the billing system to remove charges for deletes that have not completed (if this is how a deployer wants to operate)? IMHO, this all goes back to having a well defined state machine in nova (and elsewhere), where that state machine can be altered to have states that prefer consistency vs. user happiness. On 10/28/13 9:29 AM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Joshua Harlow's message of 2013-10-28 09:01:44 -0700: Except I think the CAP theorem would say that you can't accurately give back their quota under things like network partitions. If nova-compute and the message queue have a network partition then you can release their quota but can't actually delete their VMs. I would actually prefer not to release their quota, but this should be a deployer decision and not a one-size-fits-all decision (IMHO).
CAP encourages convergence models to satisfy problems with consistency. Quotas and records of allocated resources are records of intention, and we can converge the physical resources with the expressed intentions later. The speed with which you do that is part of the cost of network partition failures and should be considered when assessing and mitigating risk. It is really bad customer service to tell somebody "Yes, I know you've asked me to stop charging you, but my equipment has failed so I MUST keep charging you." Reminds me of that gym membership I tried to cancel... _TRIED_. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
On 25 October 2013 23:23, Chris Behrens cbehr...@codestud.com wrote: On Oct 25, 2013, at 3:46 AM, Day, Phil philip@hp.com wrote: Hi Folks, We're very occasionally seeing problems where a thread processing a create hangs (we've seen this when talking to Cinder and Glance). Whilst those issues need to be hunted down in their own right, they do show up what seems to me to be a weakness in the processing of delete requests that I'd like to get some feedback on. Delete is the one operation that is allowed regardless of the instance state (since it's a one-way operation, and users should always be able to free up their quota). However, when we get a create thread hung in one of these states, the delete requests will also block when they hit the manager, as they are synchronized on the uuid. Because the user making the delete request doesn't see anything happen, they tend to submit more delete requests. The service is still up, so these go to the compute manager as well, and eventually all of the threads will be waiting for the lock, and the compute manager will stop consuming new messages. The problem isn't limited to deletes - although in most cases the change of state in the API means that you have to keep making different calls to get past the state-checker logic to do it with an instance stuck in another state. Users also seem to be more impatient with deletes, as they are trying to free up quota for other things. So while I know that we should never get a thread into a hung state in the first place, I was wondering about one of the following approaches to address just the delete case: i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well. Agree.
I've argued for a long time that our code should be able to handle the instance disappearing. We do have a number of places where we catch InstanceNotFound to handle this already. +1, we need to get better at that. ii) Record in the API server that a delete has been started (maybe enough to use the task state being set to DELETING in the API, if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests. We already set to DELETING in the API (unless I'm mistaken -- but I looked at this recently). However, instead of dropping duplicate deletes, I say they should still be sent/handled. Any delete code should be able to handle another delete occurring at the same time, IMO… much like how you say other methods should be able to handle an instance disappearing from underneath them. If a compute goes down while 'deleting', a 2nd delete later should still be able to function locally. Same thing if the message to compute happens to be lost. +1, the periodic sync task, if the compute comes back after crashing having lost the delete message, should help spot the inconsistency and possibly resolve it. We probably need to make those inconsistency log messages into notifications so it's a bit easier to find them. iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and let deletes be processed in the API server, but won't of course help with commands for other instances on the same host. This seems kinda hacky to me. I hope we don't need this.
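Option ii) above can be sketched in a few lines. This is a minimal, hypothetical model of the idea -- `Instance`, `reap_stuck_deletes`, and the timeout value are illustrative names, not nova's actual code:

```python
DELETE_TIMEOUT = 600  # seconds an instance may sit in DELETING before we retry

class Instance:
    def __init__(self, uuid, task_state, updated_at):
        self.uuid = uuid
        self.task_state = task_state
        self.updated_at = updated_at

def reap_stuck_deletes(instances, now, do_delete, timeout=DELETE_TIMEOUT):
    """Periodic task: re-issue the delete for anything stuck in DELETING.

    do_delete must itself tolerate the instance already being gone,
    so re-running this task is always safe.
    """
    reaped = []
    for inst in instances:
        if inst.task_state == 'DELETING' and now - inst.updated_at > timeout:
            do_delete(inst)
            reaped.append(inst.uuid)
    return reaped
```

Because the task only looks at recorded state, it also covers the crash case discussed above: a compute that lost the original delete message would pick the instance up on its next periodic pass.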
iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for a specific instance. I don't like this one when thinking about scale: 1 million instances == 1 million more queues. +1 I'm tending towards ii) as a simple and pragmatic solution in the near term, although I like both iii) and iv) as being generally good enhancements - but iv) in particular feels like a pretty seismic change. I vote for both i) and ii) at minimum. +1 I also have another idea, so we can better track the user intent, idea (v): * changing the API to be more task based (see the summit session) * We would then know what API requests the user has made, and in roughly what order * If the user has already called delete, we can
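Option i) -- a delete path that deliberately skips the per-uuid lock and treats an already-gone instance as success -- might look roughly like this toy model. The class and method names here are invented for illustration; they are not nova's real manager code:

```python
import threading

class ComputeManager:
    """Toy model of option i): normal operations serialize on a per-uuid
    lock, but delete does not wait for it."""

    def __init__(self):
        self._locks = {}
        self._instances = {'uuid-1': 'ACTIVE'}

    def _lock_for(self, uuid):
        return self._locks.setdefault(uuid, threading.Lock())

    def delete_instance(self, uuid):
        # Deliberately NOT taking self._lock_for(uuid): a hung create
        # holding the lock must not block the delete.
        try:
            del self._instances[uuid]
        except KeyError:
            pass  # already gone -- a concurrent delete won; that's fine
        return True  # delete is idempotent: report success either way
```

The key property is the one argued for above: repeated deletes, and deletes racing other operations, all converge on the same outcome without consuming a worker thread waiting on the lock.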
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
I’d disagree with that – from a user perspective they should always be able to delete an instance regardless of its state, and the delete should always work (or at least always appear to work to the user, so that it no longer counts against their quota and they are no longer charged for it). From: Abhishek Lahiri [mailto:aviost...@gmail.com] Sent: 26 October 2013 17:10 To: OpenStack Development Mailing List Cc: OpenStack Development Mailing List Subject: Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem Deletes should only be allowed when the VM is in a powered-off state. This will allow consistent state transitions. Thanks Al On Oct 26, 2013, at 8:55 AM, Joshua Harlow harlo...@yahoo-inc.com wrote: I think I will try to have an unconference at the HK summit about ideas the cinder developers (and the taskflow developers, since it's not a concept that is unique/applicable to just cinder) are having about said state machine (and its potential usage). So look out for that; it would be interesting to have some nova folks involved there also :-) Sent from my really tiny device... On Oct 26, 2013, at 3:14 AM, Alex Glikson glik...@il.ibm.com wrote: +1 Regards, Alex Joshua Harlow harlo...@yahoo-inc.com wrote on 26/10/2013 09:29:03 AM: An idea that others and I are having for a similar use case in cinder (or it appears to be similar). If there were a well defined state machine (or machines) in nova, with well defined and managed transitions between states, then it seems like this state machine could resume on failure as well as be interrupted when a dueling or preemptable operation arrives (a delete while being created, for example). This way not only would the set of states and transitions be very clear, but it would also be clear how preemption occurs (and under what cases).
Right now in nova there is a distributed and ad-hoc state machine which, if it were more formalized, could inherit some of the described useful capabilities. It would also be much more resilient to these types of locking problems that you described. IMHO that's the only way these types of problems will be fully fixed: not by more queues or more periodic tasks, but by solidifying/formalizing the state machines that compose the work nova does. Sent from my really tiny device...
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
I wish everything were so simple in distributed systems (like OpenStack), but there are real boundaries and limits to doing something like a kill -9 correctly while retaining the consistency of the resources in your cloud (any inconsistency costs someone $$$). Sent from my really tiny device... On Oct 28, 2013, at 8:11 AM, Chris Friesen chris.frie...@windriver.com wrote: Yes, exactly this. If my compute node crashes and is unavailable, I should still be able to delete the instance. Heck, I should be able to create an instance and delete it while it's still in the building stage. It's like a kill -9 in POSIX... the underlying system should clean up underneath it. And yes, just like the process kill, there may be side effects like corrupt file systems on cinder volumes. Chris
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
On 10/28/2013 10:30 AM, Joshua Harlow wrote: I wish everything were so simple in distributed systems (like OpenStack), but there are real boundaries and limits to doing something like a kill -9 correctly while retaining the consistency of the resources in your cloud (any inconsistency costs someone $$$). Arguably that cost needs to be factored in by the cloud provider as a cost of doing business. As Clint said, once an end-user says "I want to stop using these resources", they shouldn't be charged for them anymore. If the provider's system is currently broken and can't properly process the request right away, that's their problem and not the end user's. So if there are technological limitations as to what can be done, you register that the user wants to clean everything up, you stop charging them, and then you clean things up as fast as you can behind the scenes. Chris
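The 'register the intent, stop charging, clean up later' flow Chris describes can be sketched as a small intent log. Everything here (the class and method names, the numbers) is a hypothetical illustration of the idea, not any real billing code:

```python
class DeleteIntentLog:
    """Record the user's intent to delete the moment they ask.

    Billing reads the intent log; the physical cleanup converges with
    the recorded intent later, however long the equipment takes to
    recover."""

    def __init__(self):
        self._intents = {}  # resource id -> time the delete was requested

    def request_delete(self, resource_id, now):
        # Duplicate requests are no-ops: the first intent wins.
        self._intents.setdefault(resource_id, now)

    def billable_seconds(self, resource_id, started_at, now):
        # Charging stops at the delete request, even if cleanup is lagging.
        end = min(self._intents.get(resource_id, now), now)
        return max(0, end - started_at)
```

This is also the write-ahead-log reading of quotas mentioned elsewhere in the thread: the log records intention, and billing and reclamation each converge on it at their own pace.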
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
But there is a difference here that I think needs to be clear. Releasing the resources from nova (in the current way it's done) means another individual can take those resources, and that causes inconsistencies (bad for the deployer). I think we talked about how we can make this better by putting the resources into a 'not-yet-deleted' state, where they cannot be taken. But this has side effects in itself that need to be thought out carefully, as those resources are potentially still 'active', so a malicious user will now have access to more resources than their quota allows (+1 for the malicious user). And if the malicious user is especially malicious, they can take advantage of the fact that all the deletes are going into the 'not-yet-deleted' state, and they can then DoS your resources (in a way). That's why I prefer consistency and just denying the delete, as I believe it is simpler, although as you said, the end-user won't be as 'happy'.
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
If the cloud operator's equipment is in a good state, the time spent in the zombie state should be minimal. The issues only occur when there are problems on the hosting side, and hopefully that doesn't happen very often. And maybe we get fancy and put a limit on how much of a user's equipment can be in the zombie state at any given time. If they start trying to game the system then they get throttled. Chris
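Chris's throttling idea -- cap how much of a user's footprint may sit in the zombie (delete-requested, not-yet-reclaimed) state -- could be as simple as a ratio check. The function name and the 25% limit below are purely illustrative assumptions:

```python
def may_request_delete(user_resources, zombie_limit=0.25):
    """Refuse further deletes once too much of a user's footprint is
    'zombie': delete requested but resources not yet reclaimed.

    user_resources maps resource id -> state string."""
    total = len(user_resources)
    if total == 0:
        return True  # nothing allocated, nothing to throttle
    zombies = sum(1 for state in user_resources.values() if state == 'zombie')
    return zombies / total < zombie_limit
```

A benign user rarely trips the limit (zombies drain quickly when the operator's equipment is healthy), while a user parking their whole footprint in the zombie state to dodge quota gets denied.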
Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem
An idea that others and I are having for a similar use case in cinder (or it appears to be similar). If there were a well defined state machine (or machines) in nova, with well defined and managed transitions between states, then it seems like this state machine could resume on failure as well as be interrupted when a dueling or preemptable operation arrives (a delete while being created, for example). This way not only would the set of states and transitions be very clear, but it would also be clear how preemption occurs (and under what cases). Right now in nova there is a distributed and ad-hoc state machine which, if it were more formalized, could inherit some of the described useful capabilities. It would also be much more resilient to these types of locking problems that you described. IMHO that's the only way these types of problems will be fully fixed: not by more queues or more periodic tasks, but by solidifying/formalizing the state machines that compose the work nova does. Sent from my really tiny device... On Oct 25, 2013, at 3:52 AM, Day, Phil philip@hp.com wrote: Hi Folks, We're very occasionally seeing problems where a thread processing a create hangs (we've seen this when talking to Cinder and Glance). Whilst those issues need to be hunted down in their own right, they do show up what seems to me to be a weakness in the processing of delete requests that I'd like to get some feedback on. Delete is the one operation that is allowed regardless of the instance state (since it's a one-way operation, and users should always be able to free up their quota). However, when we get a create thread hung in one of these states, the delete requests will also block when they hit the manager, as they are synchronized on the uuid. Because the user making the delete request doesn't see anything happen, they tend to submit more delete requests. The service is still up, so these go to the compute manager as well, and eventually all of the threads will be waiting for the lock, and the compute manager will stop consuming new messages. The problem isn't limited to deletes - although in most cases the change of state in the API means that you have to keep making different calls to get past the state-checker logic to do it with an instance stuck in another state. Users also seem to be more impatient with deletes, as they are trying to free up quota for other things. So while I know that we should never get a thread into a hung state in the first place, I was wondering about one of the following approaches to address just the delete case: i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well. ii) Record in the API server that a delete has been started (maybe enough to use the task state being set to DELETING in the API, if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests. iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and let deletes be processed in the API server, but won't of course help with commands for other instances on the same host. iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for a specific instance. I'm tending towards ii) as a simple and pragmatic solution in the near term, although I like both iii) and iv) as being generally good enhancements - but iv) in particular feels like a pretty seismic change. Thoughts please, Phil
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
+1

Regards,
Alex

Joshua Harlow harlo...@yahoo-inc.com wrote on 26/10/2013 09:29:03 AM:

> An idea that others and I are having for a similar use case in cinder (or it appears to be similar). If there was a well defined state machine/s in nova with well defined and managed transitions between states then it seems like this state machine could resume on failure as well as be interrupted when a dueling or preemptable operation arrives (a delete while being created for example). [...]
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
Deletes should only be allowed when the VM is in a powered-off state. This will allow consistent state transitions.

Thanks,
Al

On Oct 26, 2013, at 8:55 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

> I think I will try to have an unconference at the HK summit about ideas the cinder developers (and the taskflow developers, since it's not a concept that is unique/applicable to just cinder) are having about said state machine (and its potential usage). So look out for that; it would be interesting to have some nova folks involved there also :-)
>
> Sent from my really tiny device...
>
> On Oct 26, 2013, at 3:14 AM, Alex Glikson glik...@il.ibm.com wrote: +1 [...]
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
Potentially, although I think the lack of formalization of, and visibility into, the state machine (and the ability to easily change its transitions) is at this point causing part of this pain. If the state machine were well defined (and adjustable - to a degree...) then you could imagine one deployer only allowing delete transitions from the powered-off state, while another deployer may prefer a more experimental "allow delete transitions at any time" (with associated side effects). Without the underlying state machine being in a more formalized state, it's hard to do either in a manageable (and recoverable/resumable...) manner, IMHO.

Sent from my really tiny device...

On Oct 26, 2013, at 9:12 AM, Abhishek Lahiri aviost...@gmail.com wrote:

> Deletes should only be allowed when the vm is in a power off state. This will allow consistent state transition. [...]
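[Editor's note: the "deployer-adjustable transitions" idea above can be made concrete in a few lines. All names below are hypothetical; the point is only that the strict and permissive policies differ by a data table, not by code.]

```python
# A tiny sketch of the "adjustable transitions" idea: one table records
# which states may move to DELETING, and a conservative vs. permissive
# deployment simply ships a different table.  All names are hypothetical.
STATES = {"BUILDING", "ACTIVE", "POWERED_OFF", "ERROR", "DELETING"}

# Conservative policy: delete only from a quiesced state.
STRICT_DELETE_FROM = {"POWERED_OFF", "ERROR"}

# Permissive policy: delete preempts anything (with the side effects
# discussed in the thread).
PERMISSIVE_DELETE_FROM = STATES - {"DELETING"}

def can_delete(current_state, allowed=frozenset(STRICT_DELETE_FROM)):
    """Return True if a delete transition is legal from current_state."""
    return current_state in allowed

print(can_delete("ACTIVE"))                                  # False under the strict policy
print(can_delete("ACTIVE", allowed=PERMISSIVE_DELETE_FROM))  # True under the permissive one
```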
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
This is a bit off-topic, but in general it seems to me that the state transitions, as you said, are not clearly defined for many OpenStack components. Is there any effort underway to define these? As the software gets bigger and bigger this will help both developers and operators.

Thanks

Regards,
Abhishek Lahiri

On Oct 26, 2013, at 9:52 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

> Potentially, Although I think the lack of formalization and visibility (and the ability to easily change its transitions) into the state machine is at this point causing part of this pain. [...]
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
There is at least one such effort being discussed in cinder; for other projects I can not say. I am hoping to gain more traction there, as taskflow [1] I think can provide (or help provide) a foundation to help here. Taskflow itself has a well-defined state machine [2]. But it's in the end up to the projects themselves to see the issue and have a desire to move to a more formalized and managed model...

1. https://wiki.openstack.org/wiki/TaskFlow
2. https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow

Sent from my really tiny device...

On Oct 26, 2013, at 12:22 PM, Abhishek Lahiri aviost...@gmail.com wrote:

> This is a bit off topic, but in general it seems to me that the state transitions as you said are not clearly defined for many openstack components. [...]
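[Editor's note: the "resumable" property the taskflow discussion keeps returning to can be illustrated without any real taskflow API. The sketch below uses hypothetical step names: a delete decomposed into ordered idempotent steps, with completed steps recorded durably so an interrupted delete resumes rather than restarts.]

```python
# A rough sketch (hypothetical step names, no real taskflow API) of a
# resumable delete: ordered idempotent steps, each recorded in a durable
# journal before moving on, so a crashed worker can pick up mid-flow.
STEPS = ["detach_volumes", "release_network", "destroy_guest", "free_quota"]

def run_delete(journal, do_step):
    """journal: list of already-completed step names (the durable record)."""
    for step in STEPS:
        if step in journal:      # finished by an earlier, interrupted attempt
            continue
        do_step(step)
        journal.append(step)     # record progress before moving on

# A fresh delete runs everything:
done = []
run_delete([], done.append)
print(done)     # all four steps, in order

# A delete that crashed after two steps resumes with only the remainder:
resumed = []
run_delete(["detach_volumes", "release_network"], resumed.append)
print(resumed)  # ['destroy_guest', 'free_quota']
```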
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
On 25 October 2013 23:46, Day, Phil philip@hp.com wrote:

> Hi Folks,
>
> We're very occasionally seeing problems where a thread processing a create hangs (and we've seen it when talking to Cinder and Glance). Whilst those issues need to be hunted down in their own right, they do show up what seems to me to be a weakness in the processing of delete requests that I'd like to get some feedback on.
>
> Delete is the one operation that is allowed regardless of the instance state (since it's a one-way operation, and users should always be able to free up their quota). However, when we get a create thread hung in one of these states, the delete requests will also block when they hit the manager, as they are synchronized on the uuid. Because the user making the delete request doesn't see anything happen, they tend to submit more delete requests. The service is still up, so these go to the compute manager as well, and eventually all of the threads will be waiting for the lock, and the compute manager will stop consuming new messages.
>
> The problem isn't limited to deletes - although in most cases the change of state in the API means that you have to keep making different calls to get past the state-checker logic to do it with an instance stuck in another state. Users also seem to be more impatient with deletes, as they are trying to free up quota for other things.
>
> So while I know that we should never get a thread into a hung state in the first place, I was wondering about one of the following approaches to address just the delete case:
>
> i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well.

I like this.

> ii) Record in the API server that a delete has been started (maybe enough to use the task state being set to DELETING in the API if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests.

There may be multiple API servers; global state in an API server seems fraught with issues.

> iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and make deletes be processed in the API server, but won't of course help with commands for other instances on the same host.

This seems a little kludgy to me.

> iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for a specific instance.

That seems to suggest instance # topics? Aieee. I don't think that solves the problem anyway, because either a) you end up with a tonne of threads, or b) you have a multiplexing thread with the same potential issue. You could more simply just have a dedicated thread pool for deletes, and have no thread limit on the pool. Of course, this will fail when you OOM :). You could do a dict with instance -> thread for deletes instead, without creating lots of queues.

> I'm tending towards ii) as a simple and pragmatic solution in the near term, although I like both iii) and iv) as being generally good enhancements - but iv) in particular feels like a pretty seismic change.

My inclination would be (i) - make deletes nonblocking, idempotent, with lazy cleanup if resources take a while to tear down.

-Rob

--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud
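[Editor's note: Rob's "dict with instance -> thread" aside can be sketched directly. All names below are hypothetical: deletes bypass the shared worker pool entirely, with at most one live delete thread per instance, so a hung create can no longer starve them and repeated user deletes collapse onto a single worker.]

```python
# A sketch of the "dict with instance -> thread" suggestion above
# (names hypothetical): one delete thread per instance, duplicates no-op.
import threading

_deletes = {}                    # instance uuid -> running delete thread
_deletes_guard = threading.Lock()

def submit_delete(uuid, delete_fn):
    with _deletes_guard:
        existing = _deletes.get(uuid)
        if existing is not None and existing.is_alive():
            return existing      # duplicate request: reuse the running thread
        t = threading.Thread(target=delete_fn, args=(uuid,))
        _deletes[uuid] = t
        t.start()
        return t

gate = threading.Event()
calls = []

def slow_delete(uuid):
    gate.wait()                  # stands in for slow teardown work
    calls.append(uuid)

t1 = submit_delete("uuid-1", slow_delete)
t2 = submit_delete("uuid-1", slow_delete)  # while t1 is still running
print("collapsed onto one thread:", t1 is t2)   # True
gate.set()
t1.join()
print("teardown ran once:", calls)              # ['uuid-1']
```

As Rob notes, the cost is an unbounded thread count in the worst case; the dict at least bounds it at one thread per instance rather than one per request.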
Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem
There may be multiple API servers; global state in an API server seems fraught with issues. No, the state would be in the DB (it would either be a task_state of Deleteing or some new delete_stated_at timestamp I agree that i) is nice and simple - it just has the minor risks that the delete itself could hang, and/or that we might find some other issues with bits of the code that can't cope at the moment with the instance being deleted from underneath them -Original Message- From: Robert Collins [mailto:robe...@robertcollins.net] Sent: 25 October 2013 12:21 To: OpenStack Development Mailing List Subject: Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem On 25 October 2013 23:46, Day, Phil philip@hp.com wrote: Hi Folks, We're very occasionally seeing problems where a thread processing a create hangs (and we've seen when taking to Cinder and Glance). Whilst those issues need to be hunted down in their own rights, they do show up what seems to me to be a weakness in the processing of delete requests that I'd like to get some feedback on. Delete is the one operation that is allowed regardless of the Instance state (since it's a one-way operation, and users should always be able to free up their quota). However when we get a create thread hung in one of these states, the delete requests when they hit the manager will also block as they are synchronized on the uuid. Because the user making the delete request doesn't see anything happen they tend to submit more delete requests. The Service is still up, so these go to the computer manager as well, and eventually all of the threads will be waiting for the lock, and the compute manager will stop consuming new messages. 
The problem isn't limited to deletes - although in most cases the change of state in the API means that you have to keep making different calls to get past the state checker logic to do it with an instance stuck in another state. Users also seem to be more impatient with deletes, as they are trying to free up quota for other things. So while I know that we should never get a thread into a hung state into the first place, I was wondering about one of the following approaches to address just the delete case: i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well. I like this. ii) Record in the API server that a delete has been started (maybe enough to use the task state being set to DELETEING in the API if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that are in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processes eventually can just no-op any further delete requests. There may be multiple API servers; global state in an API server seems fraught with issues. iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (ie run some no-op task) - so that of there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and make deleted be processed in the API server but won't of course help with commands for other instances on the same host. This seems a little kludgy to me. 
iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for that specific instance.

That seems to suggest instance # topics? Aieee. I don't think that solves the problem anyway, because either a) you end up with a tonne of threads, or b) you have a multiplexing thread with the same potential issue. You could more simply just have a dedicated thread pool for deletes, with no thread limit on the pool. Of course, this will fail when you OOM :). You could do a dict with instance -> thread for deletes instead, without creating lots of queues.

I'm tending towards ii) as a simple and pragmatic solution in the near term, although I like both iii) and iv) as being generally good enhancements - but iv) in particular feels like a pretty seismic change.

My inclination would be (i) - make deletes non-blocking and idempotent, with lazy cleanup if resources take a while to tear down.

-Rob

--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud
Excerpts from Day, Phil's message of 2013-10-25 03:46:01 -0700:

[snip]

i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well.

Almost anything unexpected that isn't part of starting the creation results in just marking the instance as ERROR, right?
So this approach is actually pretty straightforward to implement. You don't really have to make other operations any more intelligent than they already should be about cleaning up half-done operations when they encounter an error. It might be helpful to suppress or de-prioritize logging of these errors when it is obvious that this result was intended.

ii) Record in the API server that a delete has been started (maybe it's enough to use the task state being set to DELETING in the API, if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests.

s/API server/database/ right? I like the coalescing approach, where you no longer take up more resources for repeated requests. I don't like the garbage collection aspect of this plan, though. Garbage collection is a trade-off of user experience for resources. If your GC thread gets too far behind, your resources will be exhausted. If you make it too active, it wastes resources doing the actual GC. Add in that you have a timeout before things can be garbage collected, and I think this becomes a very tricky thing to tune - and it may not be obvious it needs tuning until you have a user who does a lot of rapid create/delete cycles.

iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and let deletes be processed in the API server, but won't of course help with commands for other instances on the same host.

I'm not sure I understand this one.
iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for that specific instance.

A topic per record will get out of hand rapidly. If you think of the instance record in the DB as the topic, though, then (i) and (iv) are actually quite similar.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
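[For what it's worth, the periodic task in option ii) is small; the tuning concern raised above lives almost entirely in the timeout. A minimal sketch, assuming DB rows are represented here as plain dicts and that `delete_started_at`, `DELETE_TIMEOUT`, and `reap_stuck_deletes` are invented names, not nova's:]

```python
# Hypothetical tunable: how long an instance may sit in DELETING
# before the periodic task retries the delete (seconds).
DELETE_TIMEOUT = 600

def reap_stuck_deletes(instances, now, delete_fn, timeout=DELETE_TIMEOUT):
    """Periodic-task sketch for option ii).

    `instances` stands in for a DB query; the real implementation
    would select rows with task_state == 'DELETING'.  Any instance
    stuck in DELETING longer than `timeout` gets its delete re-driven
    via `delete_fn`.  Returns the uuids that were reaped.
    """
    reaped = []
    for inst in instances:
        if inst.get("task_state") != "DELETING":
            continue
        if now - inst["delete_started_at"] > timeout:
            delete_fn(inst["uuid"])
            reaped.append(inst["uuid"])
    return reaped
```

Note how the GC-tuning trade-off surfaces directly: a large `timeout` delays cleanup of lost deletes, a small one risks re-driving deletes that are merely slow.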
-----Original Message-----
From: Clint Byrum [mailto:cl...@fewbar.com]
Sent: 25 October 2013 17:05
To: openstack-dev
Subject: Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

Excerpts from Day, Phil's message of 2013-10-25 03:46:01 -0700:

[snip]

i) Change the delete call on the manager so it doesn't wait for the uuid lock.
Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well.

Almost anything unexpected that isn't part of starting the creation results in just marking the instance as ERROR, right? So this approach is actually pretty straightforward to implement. You don't really have to make other operations any more intelligent than they already should be about cleaning up half-done operations when they encounter an error. It might be helpful to suppress or de-prioritize logging of these errors when it is obvious that this result was intended.

ii) Record in the API server that a delete has been started (maybe it's enough to use the task state being set to DELETING in the API, if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests.

s/API server/database/ right? I like the coalescing approach, where you no longer take up more resources for repeated requests.

Yep, the state is saved in the DB, but it's set by the API server - that's what I meant. So it's not dependent on the manager getting the delete.

I don't like the garbage collection aspect of this plan, though. Garbage collection is a trade-off of user experience for resources. If your GC thread gets too far behind, your resources will be exhausted. If you make it too active, it wastes resources doing the actual GC. Add in that you have a timeout before things can be garbage collected, and I think this becomes a very tricky thing to tune - and it may not be obvious it needs tuning until you have a user who does a lot of rapid create/delete cycles.
The GC is just a backstop here - you always let the first delete message through, so normally things work as they do now. It's only if the delete message doesn't get processed for some reason that the GC would kick in. There are already examples of this kind of clean-up in other periodic tasks.

iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and let deletes be processed in the API server, but won't of course help with commands for other instances on the same host.

I'm not sure I understand this one.

At the moment the liveness of a service is determined by a separate thread in the ServiceGroup class - all it really shows is that something in the manager is still running. What I was thinking of is extending that so that it shows that the manager is still capable of doing something useful. Doing some
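[The probe described above - submit a no-op to the worker pool and call the service down if nothing picks it up - can be sketched with `concurrent.futures`. This is illustrative only: the real ServiceGroup API and nova's eventlet-based pool look quite different, and `pool_is_live` is an invented name.]

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def pool_is_live(pool, probe_timeout=1.0):
    """Liveness-probe sketch for option iii).

    Submit a no-op to the worker pool; if no thread runs it within
    probe_timeout, report the service as down so the scheduler stops
    routing new requests to this host.
    """
    probe = pool.submit(lambda: True)
    try:
        return probe.result(timeout=probe_timeout)
    except Exception:  # futures.TimeoutError: every worker is wedged
        probe.cancel()
        return False

# Demo: wedge the pool's only worker and watch the probe fail.
pool = ThreadPoolExecutor(max_workers=1)
healthy = pool_is_live(pool)       # True: a free worker ran the no-op
wedge = threading.Event()
pool.submit(wedge.wait)            # simulate a hung create thread
wedged = pool_is_live(pool, 0.2)   # False: no free thread picked it up
wedge.set()
```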
On Oct 25, 2013, at 3:46 AM, Day, Phil philip@hp.com wrote:

[snip]

i) Change the delete call on the manager so it doesn't wait for the uuid lock. Deletes should be coded so that they work regardless of the state of the VM, and other actions should be able to cope with a delete being performed from under them. There is of course no guarantee that the delete itself won't block as well.

Agree. I've argued for a long time that our code should be able to handle the instance disappearing.
We do have a number of places where we catch InstanceNotFound to handle this already.

ii) Record in the API server that a delete has been started (maybe it's enough to use the task state being set to DELETING in the API, if we're sure this doesn't get cleared), and add a periodic task in the compute manager to check for and delete instances that have been in a DELETING state for more than some timeout. Then the API, knowing that the delete will be processed eventually, can just no-op any further delete requests.

We already set to DELETING in the API (unless I'm mistaken - but I looked at this recently). However, instead of dropping duplicate deletes, I say they should still be sent/handled. Any delete code should be able to handle another delete occurring at the same time, IMO… much like how you say other methods should be able to handle an instance disappearing from underneath them. If a compute goes down while 'deleting', a 2nd delete later should still be able to function locally. Same thing if the message to compute happens to be lost.

iii) Add some hook into the ServiceGroup API so that the timer could depend on getting a free thread from the compute manager pool (i.e. run some no-op task) - so that if there are no free threads then the service becomes down. That would (eventually) stop the scheduler from sending new requests to it, and let deletes be processed in the API server, but won't of course help with commands for other instances on the same host.

This seems kinda hacky to me.

iv) Move away from having a general topic and thread pool for all requests, and start a listener on an instance-specific topic for each running instance on a host (leaving the general topic and pool just for creates and other non-instance calls like the hypervisor API). Then a blocked task would only affect requests for that specific instance.

I don't like this one when thinking about scale. 1 million instances == 1 million more queues.
I'm tending towards ii) as a simple and pragmatic solution in the near term, although I like both iii) and iv) as being generally good enhancements - but iv) in particular feels like a pretty seismic change.

I vote for both i) and ii) at minimum.

- Chris
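[The duplicate-delete tolerance argued for above comes down to treating InstanceNotFound as success rather than as an error. A hedged sketch with a stand-in DB - `FakeDB`, `terminate_instance`, and the exception class here are illustrative stand-ins, not nova's real code:]

```python
class InstanceNotFound(Exception):
    """Stand-in for nova's InstanceNotFound exception (sketch only)."""

class FakeDB:
    """Toy database holding one instance record."""
    def __init__(self):
        self.instances = {"i-1": {"uuid": "i-1"}}

    def destroy(self, uuid):
        if uuid not in self.instances:
            raise InstanceNotFound(uuid)
        del self.instances[uuid]

def terminate_instance(db, uuid):
    """Duplicate-tolerant delete: a second delete racing with (or
    arriving after) the first simply observes the instance is gone
    and treats that as success."""
    try:
        db.destroy(uuid)
        return "deleted"
    except InstanceNotFound:
        return "already gone"
```

With this shape, a second delete sent after a compute failure, or after a lost message, functions locally instead of erroring out.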