Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On 16/03/14 06:04, Clint Byrum wrote:
> I think you can achieve this level of protection simply by denying interactive users the rights to delete individual things directly, and using stop instead of delete. Then have something else (cron?) clean up stopped instances after a safety period has been reached.

I would be very interested in the approach to determining the optimal value for that safety period you are mentioning. Or is this going to be left as an exercise for the reader? (That is, set in the configuration, so that the users have to somehow perform this impossible task.)

--
Radomir Dopieralski

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Glance provides a very nice set up for this:

- Default is no delayed deletion
- Length of time before scrubbing is configurable
- The clean up process is automated using the glance scrubber which can be run as a standalone job or as a daemon

Tim

-----Original Message-----
From: Radomir Dopieralski [mailto:openst...@sheep.art.pl]
Sent: 17 March 2014 10:33
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

On 16/03/14 06:04, Clint Byrum wrote:
> I think you can achieve this level of protection simply by denying interactive users the rights to delete individual things directly, and using stop instead of delete. Then have something else (cron?) clean up stopped instances after a safety period has been reached.

I would be very interested in the approach to determining the optimal value for that safety period you are mentioning. Or is this going to be left as an exercise for the reader? (That is, set in the configuration, so that the users have to somehow perform this impossible task.)

--
Radomir Dopieralski
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Excerpts from Tim Bell's message of 2014-03-14 13:54:32 -0700:
> I think we need to split the scenarios and focus on the end user experience with the cloud. A few come to my mind from the CERN experience (but this may not be all):
>
> 1. Accidental deletion of an object (including meta data)
> 2. Multi-level consistency (such as between Cell API and child instances)
> 3. Auditing
>
> CERN has the scenario 1 at a reasonable frequency. Ultimately, it is due to error by:
>
> A - the openstack administrators themselves
> B - the delegated project administrators
> C - users with a non-optimised scope for administrative action
> D - users who make mistakes
>
> It seems that we should handle these as different cases:
>
> 3 - make sure there is a log entry (ideally off the box) for all operations
> 2 - up to the component implementers but with the aim to expire deleted entries as soon as reasonable consistency is achieved
> 1[A-D] - how can we recover from operator/project admin/user error?
>
> I understand that there are differing perspectives from cloud to server consolidation but my cloud users expect that if they create a one-off virtual desktop running Windows for software testing and install a set of software, I don't tell them it was accidentally deleted due to operator error (1A or 1B), you need to re-create it.

Totally agree with all of your points.

I think you can achieve this level of protection simply by denying interactive users the rights to delete individual things directly, and using stop instead of delete. Then have something else (cron?) clean up stopped instances after a safety period has been reached.

This is an interesting counter to the opposite more fluid tactic which is to delete instances that have been up for too long, assuming that long lived == wrong and costly.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Hello, I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either. Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally. You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug. How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries? The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places. OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened. You simply won't be able to schedule another operation like that with the same data, because it has been soft-deleted and won't pass the validation at the beginning (or even won't appear in the UI or CLI). This solves a lot of race conditions, error handling, additional checks to make sure the record still exists, etc. Without soft delete, you need to write custom code every time to handle the case of a record being deleted mid-operation, including all the possible combinations of which record and when. Or you need to copy all the relevant data in advance over to whatever is executing that operation. 
This cannot be abstracted away entirely (although tools like TaskFlow help), as this is specific to the case you are handling. And it's not easy to find all the places where you can have a race condition like that -- especially when you are modifying existing code that has been relying on soft delete before. You can have bugs undetected for years, that only appear in production, on very large deployments, and are impossible to reproduce reliably.

There are more similar cases like that, including cascading deletes and more advanced stuff, but I think this single case already shows that the advantages of soft delete outweigh its disadvantages.

On 13/03/14 19:52, Boris Pavlovic wrote:
> Hi all,
>
> I would like to fix the direction of this thread, because it is going in the wrong direction.
>
> To assume:
> 1) Yes, restoring already deleted resources could be useful.
> 2) The current approach with soft deletion is broken by design and we should get rid of it.
>
> More about why I think that it is broken:
> 1) When you are restoring some resource you should restore N records from N tables (e.g. VM)
> 2) Restoring sometimes means not only restoring DB records.
> 3) Not all resources should be restorable (e.g. why would I need to restore fixed_ip? or key-pairs?)
>
> So what we should think about is:
> 1) How to implement restoring functionality in a common way (e.g. a framework that will be in oslo)
> 2) Splitting the work of getting rid of soft deletion into steps (that I already mentioned):
>    a) remove soft deletion from places where we are not using it
>    b) replace internal code where we are using soft deletion with that framework
>    c) replace API stuff using ceilometer (for logs) or this framework (for restorable stuff)
>
> To put it in a nutshell: Restoring deleted resources / Delayed Deletion != Soft deletion.
>
> Best regards,
> Boris Pavlovic
>
> On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson geekinu...@gmail.com wrote:
>> For some guests we use the LVM imagebackend and there are times when the guest is deleted by accident. Humans, being what they are, don't back up their files and don't take care of important data, so it is not uncommon to use lvrestore and undelete an instance so that people can get their data. Of course, this is not always possible if the data has been subsequently overwritten. But it is common enough that I imagine most of our operators are familiar with how to do it. So I guess my saying that we do it on a regular basis is not quite accurate. Probably would be better to say that it is not uncommon to do this, but definitely not a daily task or something of that ilk.
>>
>> I have personally undeleted an instance a few times after accidental deletion also. I can't remember the specifics, but I do remember doing it :-).
>>
>> -Mike
>>
>> On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt johan...@erdfelt.com wrote:
>>> On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com
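To make the soft-delete behaviour argued for in this message concrete, here is a minimal Python sketch of the two read paths it implies: validation for a new operation ignores soft-deleted rows, while an operation already in flight can still load them. The `volumes` table, the `deleted` flag column and all function names are hypothetical, chosen only for illustration; this is not code from any OpenStack project.

```python
import sqlite3


def make_db():
    # Hypothetical table with a soft-delete flag instead of real row removal.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE volumes ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def soft_delete(conn, volume_id):
    """Mark the row as deleted; the data itself stays in the table."""
    conn.execute("UPDATE volumes SET deleted = 1 WHERE id = ?", (volume_id,))
    conn.commit()


def validate_for_new_operation(conn, volume_id):
    """New operations only accept live (not soft-deleted) records."""
    row = conn.execute(
        "SELECT id FROM volumes WHERE id = ? AND deleted = 0", (volume_id,)
    ).fetchone()
    return row is not None


def load_for_running_operation(conn, volume_id):
    """An operation already in flight may still read soft-deleted rows."""
    return conn.execute(
        "SELECT id, name, deleted FROM volumes WHERE id = ?", (volume_id,)
    ).fetchone()
```

The extra `deleted = 0` predicate in every "live" query is also the cost the first post in the thread complains about: each such query needs the additional condition and, typically, an index that covers it.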
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On 03/14/2014 09:37 AM, Radomir Dopieralski wrote:
> Hello,
>
> I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either.
>
> Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally.
>
> You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug.
>
> How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries?
>
> The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places.
>
> OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened.
>
> You simply won't be able to schedule another operation like that with the same data, because it has been soft-deleted and won't pass the validation at the beginning (or even won't appear in the UI or CLI). This solves a lot of race conditions, error handling, additional checks to make sure the record still exists, etc.

1) Operations in SQL work in transactions, so deleted records will be visible to other clients until the transaction commits.
2) If someone inside the same transaction tries to use a record that is already deleted, it's definitely an error in our code and should be fixed.

I don't think that such a use case can be used as an argument to keep soft deleted records.

> Without soft delete, you need to write custom code every time to handle the case of a record being deleted mid-operation, including all the possible combinations of which record and when.
>
> Or you need to copy all the relevant data in advance over to whatever is executing that operation. This cannot be abstracted away entirely (although tools like TaskFlow help), as this is specific to the case you are handling. And it's not easy to find all the places where you can have a race condition like that -- especially when you are modifying existing code that has been relying on soft delete before. You can have bugs undetected for years, that only appear in production, on very large deployments, and are impossible to reproduce reliably.
>
> There are more similar cases like that, including cascading deletes and more advanced stuff, but I think this single case already shows that the advantages of soft delete outweigh its disadvantages.
>
> On 13/03/14 19:52, Boris Pavlovic wrote:
>> Hi all,
>>
>> I would like to fix the direction of this thread, because it is going in the wrong direction.
>>
>> To assume:
>> 1) Yes, restoring already deleted resources could be useful.
>> 2) The current approach with soft deletion is broken by design and we should get rid of it.
>>
>> More about why I think that it is broken:
>> 1) When you are restoring some resource you should restore N records from N tables (e.g. VM)
>> 2) Restoring sometimes means not only restoring DB records.
>> 3) Not all resources should be restorable (e.g. why would I need to restore fixed_ip? or key-pairs?)
>>
>> So what we should think about is:
>> 1) How to implement restoring functionality in a common way (e.g. a framework that will be in oslo)
>> 2) Splitting the work of getting rid of soft deletion into steps (that I already mentioned):
>>    a) remove soft deletion from places where we are not using it
>>    b) replace internal code where we are using soft deletion with that framework
>>    c) replace API stuff using ceilometer (for logs) or this framework (for restorable stuff)
>>
>> To put it in a nutshell: Restoring deleted resources / Delayed Deletion != Soft deletion.
>>
>> Best regards,
>> Boris Pavlovic
>>
>> On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson geekinu...@gmail.com wrote:
>>> For some guests we use the LVM imagebackend and there are times when the guest is deleted by accident. Humans, being what they are, don't back up their files and don't take care of important data, so it is not uncommon to use lvrestore and undelete an instance so that people can get their data. Of course, this is not always possible if the data has been subsequently overwritten. But it is common enough that I imagine most of our operators are familiar with how to do it. So I guess my saying that we do it on a regular basis is not quite accurate. Probably would be better to say that it is not uncommon to do this, but definitely not a daily task or something of that ilk.
>>>
>>> I have
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On 14/03/14 11:08, Alexei Kornienko wrote:
> On 03/14/2014 09:37 AM, Radomir Dopieralski wrote:

[snip]

>> OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened.
>>
>> You simply won't be able to schedule another operation like that with the same data, because it has been soft-deleted and won't pass the validation at the beginning (or even won't appear in the UI or CLI). This solves a lot of race conditions, error handling, additional checks to make sure the record still exists, etc.
>
> 1) Operations in SQL work in transactions, so deleted records will be visible to other clients until the transaction commits.
> 2) If someone inside the same transaction tries to use a record that is already deleted, it's definitely an error in our code and should be fixed.
>
> I don't think that such a use case can be used as an argument to keep soft deleted records.

Yes, that's why it works just fine when you have a single database in one place. You can have locks, transactions, cascading operations and all this stuff, and you have a guarantee that you are always in a consistent state, unless there is a horrible bug.

OpenStack, however, is not a single database. There is no system-wide solution for locks, transactions or rollbacks. Every time you reference anything across databases, you are going to run into this problem.

--
Radomir Dopieralski
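The cross-database point can be sketched in a few lines of Python. Two independent SQLite databases stand in for two services' databases; nothing ties their transactions together, so a committed hard delete in one leaves a dangling reference in the other. The table names, the `image_id` reference and the helper `dangling_image_refs` are hypothetical and exist only for this illustration.

```python
import sqlite3


def make_image_db():
    # Stand-in for one service's database (hypothetical schema).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO images VALUES (1, 'base-image')")
    db.commit()
    return db


def make_compute_db():
    # Stand-in for a second, completely separate database that refers to
    # image ids but cannot enforce a foreign key across databases.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE instances (id INTEGER PRIMARY KEY, image_id INTEGER)"
    )
    db.execute("INSERT INTO instances VALUES (10, 1)")
    db.commit()
    return db


def dangling_image_refs(compute_db, image_db):
    """Return ids of instances whose image no longer exists in image_db."""
    dangling = []
    rows = compute_db.execute(
        "SELECT id, image_id FROM instances ORDER BY id"
    ).fetchall()
    for instance_id, image_id in rows:
        found = image_db.execute(
            "SELECT 1 FROM images WHERE id = ?", (image_id,)
        ).fetchone()
        if found is None:
            dangling.append(instance_id)
    return dangling
```

A transaction on `image_db` protects only readers of `image_db`. Any consistency between the two databases has to come from application logic, which is the gap that soft delete, delete safeguards, or explicit NotFound handling each try to cover.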
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
> Hello,
>
> I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either.
>
> Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally.
>
> You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug.
>
> How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries?

I feel it isn't an optimization if:

* It slows down the code base
* Makes the code harder to read and understand
* Deliberately obscures the actions of removing and restoring resources
* Encourages the idea that everything in the system is undoable, like the cloud is a Word doc.

> The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places.

Fewer queries do not always make faster code, nor do they lead to inherently race-free code.

> OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened.

I believe a better solution would be to use Boris' solution and implement safeguards around the delete operation. For instance, not being able to delete an instance that has tasks still running against it. Either that, or implement true task abortion logic that can notify distributed components about the need to stop a running task because either the user wants to delete a resource or simply cancel the operation they began.

> You simply won't be able to schedule another operation like that with the same data, because it has been soft-deleted and won't pass the validation at the beginning (or even won't appear in the UI or CLI). This solves a lot of race conditions, error handling, additional checks to make sure the record still exists, etc.

Sorry, I disagree here. Components that rely on the soft-delete behavior to get the resource data from the database should instead respond to a NotFound that gets raised by aborting their running task.

> Without soft delete, you need to write custom code every time to handle the case of a record being deleted mid-operation, including all the possible combinations of which record and when.

Not custom code. Explicit code paths for explicit actions.

> Or you need to copy all the relevant data in advance over to whatever is executing that operation.

This is already happening.

> This cannot be abstracted away entirely (although tools like TaskFlow help), as this is specific to the case you are handling. And it's not easy to find all the places where you can have a race condition like that -- especially when you are modifying existing code that has been relying on soft delete before. You can have bugs undetected for years, that only appear in production, on very large deployments, and are impossible to reproduce reliably.
>
> There are more similar cases like that, including cascading deletes and more advanced stuff, but I think this single case already shows that the advantages of soft delete outweigh its disadvantages.

I respectfully disagree :) I think the benefits of explicit code paths and increased performance of the database outweigh the costs of changing existing code.

Best,
-jay

> On 13/03/14 19:52, Boris Pavlovic wrote:
>> Hi all,
>>
>> I would like to fix the direction of this thread, because it is going in the wrong direction.
>>
>> To assume:
>> 1) Yes, restoring already deleted resources could be useful.
>> 2) The current approach with soft deletion is broken by design and we should get rid of it.
>>
>> More about why I think that it is broken:
>> 1) When you are restoring some resource you should restore N records from N tables (e.g. VM)
>> 2) Restoring sometimes means not only restoring DB records.
>> 3) Not all resources should be restorable (e.g. why would I need to restore fixed_ip? or key-pairs?)
>>
>> So what we should think about is:
>> 1) How to implement restoring functionality in a common way (e.g. a framework that will be in oslo)
>> 2) Splitting the work of getting rid of soft deletion into steps (that I already mentioned):
>>    a) remove soft deletion from places where we are not using it
>>    b) replace internal code where we are using soft deletion with that framework
>>    c) replace API stuff using
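The two alternatives proposed in this reply (a safeguard that refuses to delete a resource with running tasks, and tasks that abort on NotFound) can be sketched as follows. This is a hedged, in-memory Python illustration only: `InstanceStore`, `ResourceBusy`, `force_delete` and `run_step` are hypothetical names invented for the example, and a real deployment would need the task bookkeeping to live somewhere all components can see.

```python
class NotFound(Exception):
    """Raised when a resource no longer exists (hard-deleted)."""


class ResourceBusy(Exception):
    """Raised when a delete is refused because tasks are still running."""


class InstanceStore:
    """Hypothetical in-memory store with hard deletes and a delete guard."""

    def __init__(self):
        self._instances = {}
        self._running_tasks = {}

    def create(self, instance_id, data):
        self._instances[instance_id] = data
        self._running_tasks[instance_id] = 0

    def get(self, instance_id):
        try:
            return self._instances[instance_id]
        except KeyError:
            raise NotFound(instance_id) from None

    def start_task(self, instance_id):
        self.get(instance_id)  # validates that the instance still exists
        self._running_tasks[instance_id] += 1

    def finish_task(self, instance_id):
        if self._running_tasks.get(instance_id, 0) > 0:
            self._running_tasks[instance_id] -= 1

    def delete(self, instance_id):
        """Safeguarded delete: refuse while tasks are running."""
        self.get(instance_id)
        if self._running_tasks[instance_id] > 0:
            raise ResourceBusy(instance_id)
        self.force_delete(instance_id)

    def force_delete(self, instance_id):
        """Unconditional hard delete; running tasks must cope with NotFound."""
        self._instances.pop(instance_id, None)
        self._running_tasks.pop(instance_id, None)


def run_step(store, instance_id):
    """One step of a long-running task: abort explicitly if the data is gone."""
    try:
        store.get(instance_id)
    except NotFound:
        return "aborted"
    return "completed"
```

The design trade-off is visible even at this size: the guard makes deletes fail loudly instead of racing, while the NotFound path requires every long-running step to handle disappearance explicitly.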
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
+1 to what Jay says here. This hidden behavior mostly just causes problems and allows hacking hidden ways to restore things.

-Mike

On Fri, Mar 14, 2014 at 9:55 AM, Jay Pipes jaypi...@gmail.com wrote:
> On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
>> Hello,
>>
>> I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either.
>>
>> Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally.
>>
>> You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug.
>>
>> How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries?
>
> I feel it isn't an optimization if:
>
> * It slows down the code base
> * Makes the code harder to read and understand
> * Deliberately obscures the actions of removing and restoring resources
> * Encourages the idea that everything in the system is undoable, like the cloud is a Word doc.
>
>> The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places.
>
> Fewer queries do not always make faster code, nor do they lead to inherently race-free code.
>
>> OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Off topic, but I'd like to see a Word doc written out with the history of the cloud, that'd be pretty sweet. Especially if it's something like Google Docs where you can watch the changes happen in realtime.

+2

From: Jay Pipes jaypi...@gmail.com
Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: Friday, March 14, 2014 at 7:55 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

> On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
>> Hello,
>>
>> I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either.
>>
>> Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally.
>>
>> You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug.
>>
>> How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries?
>
> I feel it isn't an optimization if:
>
> * It slows down the code base
> * Makes the code harder to read and understand
> * Deliberately obscures the actions of removing and restoring resources
> * Encourages the idea that everything in the system is undoable, like the cloud is a Word doc.
>
>> The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places.
>
> Fewer queries do not always make faster code, nor do they lead to inherently race-free code.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
I think we need to split the scenarios and focus on the end user experience with the cloud. A few come to my mind from the CERN experience (but this may not be all):

1. Accidental deletion of an object (including meta data)
2. Multi-level consistency (such as between Cell API and child instances)
3. Auditing

CERN has the scenario 1 at a reasonable frequency. Ultimately, it is due to error by:

A - the openstack administrators themselves
B - the delegated project administrators
C - users with a non-optimised scope for administrative action
D - users who make mistakes

It seems that we should handle these as different cases:

3 - make sure there is a log entry (ideally off the box) for all operations
2 - up to the component implementers but with the aim to expire deleted entries as soon as reasonable consistency is achieved
1[A-D] - how can we recover from operator/project admin/user error?

I understand that there are differing perspectives from cloud to server consolidation but my cloud users expect that if they create a one-off virtual desktop running Windows for software testing and install a set of software, I don't tell them it was accidentally deleted due to operator error (1A or 1B), you need to re-create it.

Tim

-----Original Message-----
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: 14 March 2014 16:55
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
> Hello,
>
> I also think that this thread is going in the wrong direction, but I don't think the direction Boris wants is the correct one either.
>
> Frankly I'm a little surprised that nobody mentioned another advantage that soft delete gives us, the one that I think it was actually used for originally.
>
> You see, soft delete is an optimization. It's there to make the system work faster as a whole, have less code and be simpler to maintain and debug.
How does it do it, when, as clearly shown in the first post in this thread, it makes the queries slower, requires additional indices in the database and more logic in the queries?

I feel it isn't an optimization if:

* It slows down the code base
* Makes the code harder to read and understand
* Deliberately obscures the actions of removing and restoring resources
* Encourages the idea that everything in the system is undoable, like the cloud is a Word doc.

The answer is, by doing more with those queries, by making you write less code, execute fewer queries to the databases and avoid duplicating the same data in multiple places.

Fewer queries does not always make faster code, nor does it lead to inherently race-free code.

OpenStack is a big, distributed system of multiple databases that sometimes rely on each other and cross-reference their records. It's not uncommon to have some long-running operation started, that uses some data, and then, in the middle of its execution, have that data deleted. With soft delete, that's not a problem -- the operation can continue safely and proceed as scheduled, with the data it was started with in the first place -- it still has access to the deleted records as if nothing happened.

I believe a better solution would be to use Boris' solution and implement safeguards around the delete operation. For instance, not being able to delete an instance that has tasks still running against it. Either that, or implement true task abortion logic that can notify distributed components about the need to stop a running task because either the user wants to delete a resource or simply cancel the operation they began.

You simply won't be able to schedule another operation like that with the same data, because it has been soft-deleted and won't pass the validation at the beginning (or even won't appear in the UI or CLI). This solves a lot of race conditions, error handling, additional checks to make sure the record still exists, etc.
Sorry, I disagree here. Components that rely on the soft-delete behavior to get the resource data from the database should instead respond to a NotFound that gets raised by aborting their running task. Without soft delete, you need to write custom code every time to handle the case of a record being deleted mid-operation, including all the possible combinations of which record and when. Not custom code. Explicit code paths for explicit actions. Or you need to copy all the relevant data in advance over to whatever is executing that operation. This is already happening. This cannot be abstracted away entirely (although tools like TaskFlow help), as this is specific to the case you are handling. And it's not easy to find all the places where you can have a race condition like that -- especially when you are modifying existing code that has been
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On 12 March 2014 17:35, Tim Bell tim.b...@cern.ch wrote: And if the same mistake is done for a cinder volume or a trove database ? Deferred deletion for cinder has been proposed, and there have been few objections to it... nobody has put forward code yet, but anybody is welcome to do so.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Hi all,

I would like to fix the direction of this thread, because it is going in the wrong direction.

To summarize:

1) Yes, restoring already deleted resources could be useful.
2) The current approach with soft deletion is broken by design and we should get rid of it.

More about why I think that it is broken:

1) When you are restoring some resource you have to restore N records from N tables (e.g. for a VM)
2) Restoring sometimes means more than restoring DB records.
3) Not all resources should be restorable (e.g. why would I need to restore a fixed_ip? or key-pairs?)

So what we should think about is:

1) How to implement restoring functionality in a common way (e.g. a framework that will be in oslo)
2) Splitting the work of getting rid of soft deletion into steps (that I already mentioned):
   a) remove soft deletion from places where we are not using it
   b) replace internal code where we are using soft deletion with that framework
   c) replace API stuff using ceilometer (for logs) or this framework (for restorable stuff)

In a nutshell: Restoring deleted resources / Delayed deletion != Soft deletion.

Best regards,
Boris Pavlovic

On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson geekinu...@gmail.com wrote:

For some guests we use the LVM imagebackend and there are times when the guest is deleted on accident. Humans, being what they are, don't back up their files and don't take care of important data, so it is not uncommon to use lvrestore and undelete an instance so that people can get their data. Of course, this is not always possible if the data has been subsequently overwritten. But it is common enough that I imagine most of our operators are familiar with how to do it. So I guess my saying that we do it on a regular basis is not quite accurate. Probably would be better to say that it is not uncommon to do this, but definitely not a daily task or something of that ilk. I have personally undeleted an instance a few times after accidental deletion also. I can't remember the specifics, but I do remember doing it :-).

-Mike

On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt johan...@erdfelt.com wrote:

On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:

Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.

I'm curious, what are you undeleting and why?

JE
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Excerpts from Jay Pipes's message of 2014-03-12 10:58:36 -0700: On Wed, 2014-03-12 at 17:35 +, Tim Bell wrote: And if the same mistake is done for a cinder volume or a trove database ? Snapshots and backups? and bears, oh my! +1, whether it is large data on a volume or saved state in the RAM of a compute node, it isn't safe unless it is duplicated.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Excerpts from Tim Bell's message of 2014-03-12 11:02:25 -0700:

If you want to archive images per se, on deletion just export it to a 'backup tape' (for example) and store enough of the metadata on that 'tape' to re-insert it if this is really desired and then delete it from the database (or do the export... asynchronously). The same could be said with VMs, although likely not all resources, aka networks/.../ make sense to do this. So instead of deleted = 1, wait for cleaner, just save the resource (if possible) + enough metadata on some other system ('backup tape', alternate storage location, hdfs, ceph...) and leave it there unless it's really needed. Making the database more complex (and all associated code) to achieve this same goal seems like a hack that just needs to be addressed with a better way to do archiving. In a cloudy world of course people would be able to recreate everything they need on-demand so who needs undelete anyway ;-)

I have no problem if there was an existing process integrated into all of the OpenStack components which would produce me an archive trail with meta data and a command to recover the object from that data. Currently, my understanding is that there is no such function and thus the proposal to remove the deleted column is premature.

That seems like an unreasonable request of low level tools like Nova. End user applications and infrastructure management should be responsible for these things and will do a much better job of it, as you can work your own business needs for reliability and recovery speed into an application aware solution. If Nova does it, your cloud just has to provide everybody with the same un-delete, which is probably overkill for _many_ applications.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
The restore use case is for sure inconsistently implemented and used. I think I agree with Boris that we treat it as separate and just move on with cleaning up soft delete. I imagine most deployments don't like having most of the rows in their table be useless and make db access slow? That being said, I am a little sad my hacky restore method will need to be reworked :-). -Mike
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On 03/11/2014 11:10 PM, Jay Pipes wrote:

On Wed, 2014-03-12 at 01:47 +, Joshua Harlow wrote:

The question that I don't understand is why does this process have to involve the database to begin with? If you want to archive images per se, on deletion just export it to a 'backup tape' (for example) and store enough of the metadata on that 'tape' to re-insert it if this is really desired and then delete it from the database (or do the export... asynchronously). The same could be said with VMs, although likely not all resources, aka networks/.../ make sense to do this. So instead of deleted = 1, wait for cleaner, just save the resource (if possible) + enough metadata on some other system ('backup tape', alternate storage location, hdfs, ceph...) and leave it there unless it's really needed. Making the database more complex (and all associated code) to achieve this same goal seems like a hack that just needs to be addressed with a better way to do archiving. In a cloudy world of course people would be able to recreate everything they need on-demand so who needs undelete anyway ;-)

Good points. Another way to ask the question: does Amazon provide an undelete functionality?

Man, if that was our threshold for doing things, we could delete a ton of OpenStack code. :P Honestly, I think it's important to realize that a very large OpenStack deploy has found undelete *really useful*. Perhaps the current way they are doing it is hacky based on something that wasn't intended, so we should have a more explicit undelete method, however I don't think it's immediately invalid because AWS doesn't do it.

Back to the original question, my limited understanding is that soft delete is there for eventual consistency. It means that everyone doesn't need a current view of the database all the time, and that a bunch of operations don't need to sit under transactions. Like the ability to do a list_images while an image is being deleted, especially given that we denormalize a lot of these things over multiple tables. That, however, may be an out of date concept here. I haven't stayed up on everything in that piece of the stack.

-Sean

--
Sean Dague
Samsung Research America
s...@dague.net / sean.da...@samsung.com
http://dague.net
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
I have personally witnessed someone (honestly, not me) select Terminate Instance when they meant Reboot Instance and that mistake is way too easy. I'm not sure if it was a brain mistake or mere slip of the mouse, but it's enough to make people really nervous in a production environment. If there's one thing you can count on about human beings, it's that they'll make mistakes sooner or later. Any system that assumes infallible human beings as a design criterion is making an invalid assumption.

--
Paul Carver
VO: 732-545-7377
Cell: 908-803-1656
E: pcar...@att.com

-Original Message-
From: Tim Bell [mailto:tim.b...@cern.ch]
Sent: Tuesday, March 11, 2014 15:43
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

Typical cases are user error where someone accidentally deletes an item from a tenant. The image guys have a good structure where images become unavailable and are recoverable for a certain period of time. A regular periodic task cleans up deleted items after a configurable number of seconds to avoid constant database growth. My preference would be to follow this model universally (an archive table is a nice way to do it without disturbing production).

Tim

On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:

Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.

I'm curious, what are you undeleting and why?

JE
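The model Tim describes in the quoted message above (items become unavailable, stay recoverable for a while, and a periodic task purges them after a configurable number of seconds) could be sketched roughly as follows. This is only an illustration under assumptions: the `images` table, the `deleted_at` column, the `SCRUB_TIME` value and all function names are hypothetical and are not taken from Glance or any other OpenStack project.

```python
import sqlite3

# Hypothetical retention period in seconds; a real deployment would read
# something like this from its configuration.
SCRUB_TIME = 86400


def make_db():
    """Create an in-memory table with a nullable deletion timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " deleted_at REAL)"  # NULL means the image is live
    )
    return conn


def add_image(conn, name):
    return conn.execute(
        "INSERT INTO images (name) VALUES (?)", (name,)
    ).lastrowid


def delayed_delete(conn, image_id, now):
    """Mark the image as pending deletion instead of removing it."""
    conn.execute(
        "UPDATE images SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (now, image_id),
    )


def undelete(conn, image_id):
    """Recover an image that has not been scrubbed yet; True on success."""
    cur = conn.execute(
        "UPDATE images SET deleted_at = NULL"
        " WHERE id = ? AND deleted_at IS NOT NULL",
        (image_id,),
    )
    return cur.rowcount == 1


def scrub(conn, now, scrub_time=SCRUB_TIME):
    """Periodic task: really delete rows whose safety period has passed."""
    cur = conn.execute(
        "DELETE FROM images"
        " WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (now - scrub_time,),
    )
    return cur.rowcount
```

Passing `now` in explicitly keeps the sketch deterministic; a cron job or daemon would pass `time.time()`. Because `scrub` removes rows for good, the table does not grow without bound, which is the property Tim asks for.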
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Understandable, humans will be humans after all. To me, if openstack is a cloud platform then along with it should come the best practices that come with the usage of a cloud platform (treat your instances as ephemeral, use configuration management, save your stuff in source control...). I have been preaching similar stuff at y! and getting people into the right mindset around the cloud is IMHO more important than making openstack fit people's non-cloudy mindset. Because once you teach a person to use the cloud right you don't need to have openstack compensate for them using it incorrectly.

Sent from my really tiny device...
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Wed, 2014-03-12 at 11:37 +, CARVER, PAUL wrote:

I have personally witnessed someone (honestly, not me) select Terminate Instance when they meant Reboot Instance and that mistake is way too easy. I'm not sure if it was a brain mistake or mere slip of the mouse, but it's enough to make people really nervous in a production environment. If there's one thing you can count on about human beings, it's that they'll make mistakes sooner or later. Any system that assumes infallible human beings as a design criterion is making an invalid assumption.

That's why GUIs should have a dialog box that says "Are you sure you want to terminate this server?". There's prevention of common mistakes, and then there's going out of your way to ensure that the cloud acts like a text editor with an unlimited undo buffer.

Best,
-jay
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Wed, Mar 12, 2014, CARVER, PAUL pc2...@att.com wrote:

I have personally witnessed someone (honestly, not me) select Terminate Instance when they meant Reboot Instance and that mistake is way too easy. I'm not sure if it was a brain mistake or mere slip of the mouse, but it's enough to make people really nervous in a production environment. If there's one thing you can count on about human beings, it's that they'll make mistakes sooner or later. Any system that assumes infallible human beings as a design criterion is making an invalid assumption.

I think there might be some confusion about what soft-delete we're talking about. Nova has two orthogonal soft-delete features:

1) Database rows are never deleted from the database. They are just marked as deleted via a column. This is unexposed to users and is an implementation detail in the current code.

2) Instance deletion can be deferred until a later time. This is called deferred-delete and soft-delete in the code. If the feature is enabled and the instance hasn't been reclaimed yet, it can be restored with the 'nova restore' command.

This thread is about the database soft-delete and not the instance soft-delete.

JE
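To make the distinction between the two features concrete, here is a toy model, not Nova code. It assumes a hypothetical `Instance` record with a user-visible `vm_state` (feature 2, deferred delete) and a separate `deleted` row marker (feature 1, database soft-delete); `RECLAIM_INTERVAL` and every function name are made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grace period (seconds) before a deferred delete is final.
RECLAIM_INTERVAL = 3600


@dataclass
class Instance:
    id: int
    vm_state: str = "active"                 # feature 2: visible to users
    soft_deleted_at: Optional[float] = None  # feature 2: when delete was asked
    deleted: int = 0                         # feature 1: DB row marker


def delete(inst: Instance, now: float) -> None:
    """Deferred delete: the instance is stopped but can still be restored."""
    inst.vm_state = "soft-delete"
    inst.soft_deleted_at = now


def restore(inst: Instance) -> None:
    """Undo a deferred delete, only possible before reclaim."""
    if inst.vm_state != "soft-delete":
        raise ValueError("instance is not in a restorable state")
    inst.vm_state = "active"
    inst.soft_deleted_at = None


def reclaim(inst: Instance, now: float) -> bool:
    """Periodic task: finalize the delete once the grace period is over.

    Even now the row is not removed; it is only marked (deleted = id),
    which is the database soft-delete this thread is about.
    """
    if inst.vm_state != "soft-delete" or inst.soft_deleted_at is None:
        return False
    if now - inst.soft_deleted_at < RECLAIM_INTERVAL:
        return False
    inst.vm_state = "deleted"
    inst.deleted = inst.id
    return True
```

In this model, dropping the `deleted` marker and really removing the row in `reclaim` would not change `delete` or `restore` at all, which is one way to read the claim that the two features are orthogonal.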
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
And if the same mistake is done for a cinder volume or a trove database ? Tim
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Wed, 2014-03-12 at 17:35 +, Tim Bell wrote: And if the same mistake is done for a cinder volume or a trove database ? Snapshots and backups? Best, -jay
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
If you want to archive images per se, on deletion just export it to a 'backup tape' (for example) and store enough of the metadata on that 'tape' to re-insert it if this is really desired and then delete it from the database (or do the export... asynchronously). The same could be said with VMs, although likely not all resources, aka networks/.../ make sense to do this. So instead of deleted = 1, wait for cleaner, just save the resource (if possible) + enough metadata on some other system ('backup tape', alternate storage location, hdfs, ceph...) and leave it there unless it's really needed. Making the database more complex (and all associated code) to achieve this same goal seems like a hack that just needs to be addressed with a better way to do archiving. In a cloudy world of course people would be able to recreate everything they need on-demand so who needs undelete anyway ;-)

I have no problem if there was an existing process integrated into all of the OpenStack components which would produce me an archive trail with meta data and a command to recover the object from that data. Currently, my understanding is that there is no such function and thus the proposal to remove the deleted column is premature.

Tim
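The "export, then really delete" idea quoted above, together with the "command to recover the object" that Tim asks for, might look something like the sketch below. It is an assumption-laden illustration: the `images` table and its columns are invented, and a plain dict stands in for the 'backup tape' or alternate storage location. No such functions exist in OpenStack as far as this thread says.

```python
import json
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets us turn rows into dicts
    conn.execute(
        "CREATE TABLE images ("
        " id INTEGER PRIMARY KEY, name TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    return conn


def archive_and_delete(conn, archive, image_id):
    """Export the row's metadata to the archive, then hard-delete the row."""
    row = conn.execute(
        "SELECT * FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"image {image_id} not found")
    # Archive first, so a failure here leaves the database untouched.
    archive[image_id] = json.dumps(dict(row))
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM images WHERE id = ?", (image_id,))


def recover(conn, archive, image_id):
    """Re-insert an archived row; the archive entry is dropped afterwards."""
    meta = json.loads(archive[image_id])
    with conn:
        conn.execute(
            "INSERT INTO images (id, name, size)"
            " VALUES (:id, :name, :size)",
            meta,
        )
    del archive[image_id]
```

With this shape the production table carries no `deleted` column and no extra index column; the cost moves to the archive step, which Joshua suggests could also be done asynchronously.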
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Roman Podoliaka said on Mon, Mar 10, 2014 at 03:04:06PM -0700:

So we have homework to do: find out what projects use soft-deletes for. I assume that soft-deletes are only used internally and aren't exposed to API users, but let's check that. At the same time all new projects should avoid using soft-deletes from the start.

On that homework, deleted records can be interesting when aggregating over time. For example, "nodes where over 100 instances went to ERROR this month" or "nodes that hosted flavor FLAVOR this month". Operators might have written plugins to test these business concerns, so although Ceilometer might be a better place to get that information, the transition should be considered.

Alexis
--
Nova Engineer, HP Cloud. AKA lealexis, lxsli.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
If the deleted column is removed, how would the 'undelete' functionality be provided ? This saves operators when user accidents occur since restoring the whole database to a point in time affects the other tenants also. Tim Hi all, I've never understood why we treat the DB as a LOG (keeping deleted == 0 records around) when we should just use a LOG (or similar system) to begin with instead. I can't agree more with you! Storing deleted records in tables is hardly usable, bad for performance (as it makes tables and indexes larger) and it probably covers a very limited set of use cases (if any) of OpenStack users.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light, I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.

-Mike

On Mon, Mar 10, 2014 at 3:44 PM, Joshua Harlow harlo...@yahoo-inc.com wrote:

> Sounds like a good idea to me.
>
> I've never understood why we treat the DB as a LOG (keeping deleted == 1 records around) when we should just use a LOG (or similar system) to begin with instead.
>
> Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has this worked out for you?
>
> Seems like some of the feedback on https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests that this has been an operational pain-point for folks ("Tool to delete things properly" suggestions and such...).
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Tue, Mar 11, 2014 at 10:24 AM, Mike Wilson geekinu...@gmail.com wrote:

> Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light, I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.

Although we want to *support* hard deletion, we still want to support the current behavior as well (soft deletion, where the operator can prune deleted rows periodically).
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:

> Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light, I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.

I'm curious, what are you undeleting and why?

JE
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Typical cases are user error, where someone accidentally deletes an item from a tenant. The image guys have a good structure where images become unavailable and are recoverable for a certain period of time. A regular periodic task cleans up deleted items after a configurable number of seconds to avoid constant database growth.

My preference would be to follow this model universally (an archive table is a nice way to do it without disturbing production).

Tim

> On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:
>
>> Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light, I'm not sure that it would be appropriate just to log the deletion and get rid of the row. I would like to see it go to an archival table where it is easily restored.
>
> I'm curious, what are you undeleting and why?
>
> JE
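The delayed-deletion model Tim describes (items become unavailable, stay recoverable for a while, and a periodic task purges them after a configurable number of seconds) could look roughly like the sketch below. This is not Glance's code: the images table, its columns, and the function names purge_expired and undelete are assumptions made up for illustration, using Python's built-in sqlite3 module.

```python
import sqlite3
import time


def make_db():
    """Create a hypothetical images table with soft-delete columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE images (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            deleted INTEGER NOT NULL DEFAULT 0,  -- 0 = live, non-zero = soft-deleted
            deleted_at REAL                      -- epoch seconds, NULL while live
        )
        """
    )
    return conn


def purge_expired(conn, scrub_time_seconds, now=None):
    """Hard-delete rows that were soft-deleted more than scrub_time_seconds ago."""
    now = time.time() if now is None else now
    cutoff = now - scrub_time_seconds
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM images WHERE deleted != 0 AND deleted_at <= ?",
            (cutoff,),
        )
    return cur.rowcount


def undelete(conn, image_id):
    """Restore a soft-deleted row; only possible until the purge removes it."""
    with conn:
        cur = conn.execute(
            "UPDATE images SET deleted = 0, deleted_at = NULL "
            "WHERE id = ? AND deleted != 0",
            (image_id,),
        )
    return cur.rowcount == 1
```

The safety period is just the scrub_time_seconds argument here, which is exactly the value operators would have to choose in configuration.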
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Tue, Mar 11, 2014 at 12:43 PM, Tim Bell tim.b...@cern.ch wrote:

> Typical cases are user error, where someone accidentally deletes an item from a tenant. The image guys have a good structure where images become unavailable and are recoverable for a certain period of time. A regular periodic task cleans up deleted items after a configurable number of seconds to avoid constant database growth.
>
> My preference would be to follow this model universally (an archive table is a nice way to do it without disturbing production).

That was the goal of the shadow table; if it doesn't support that now, then it's a bug.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Can we therefore make it a rule that no removal of the deleted column is permitted unless shadow tables are implemented?

Tim

From: Joe Gordon [mailto:joe.gord...@gmail.com]
Sent: 11 March 2014 20:57
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

> On Tue, Mar 11, 2014 at 12:43 PM, Tim Bell tim.b...@cern.ch wrote:
>
>> Typical cases are user error, where someone accidentally deletes an item from a tenant. The image guys have a good structure where images become unavailable and are recoverable for a certain period of time. A regular periodic task cleans up deleted items after a configurable number of seconds to avoid constant database growth.
>>
>> My preference would be to follow this model universally (an archive table is a nice way to do it without disturbing production).
>
> That was the goal of the shadow table; if it doesn't support that now, then it's a bug.
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
The question that I don't understand is why this process has to involve the database to begin with.

If you want to archive images per se, on deletion just export the image to a 'backup tape' (for example) and store enough of the metadata on that 'tape' to re-insert it if this is really desired, and then delete it from the database (or do the export asynchronously). The same could be said of VMs, although likely not all resources (networks and so on) make sense to handle this way.

So instead of setting deleted = 1 and waiting for a cleaner, just save the resource (if possible) plus enough metadata on some other system ('backup tape', alternate storage location, hdfs, ceph...) and leave it there unless it's really needed. Making the database (and all associated code) more complex to achieve this same goal seems like a hack that just needs to be addressed with a better way to do archiving.

In a cloudy world, of course, people would be able to recreate everything they need on demand, so who needs undelete anyway ;-)

My 0.02 cents.

-----Original Message-----
From: Tim Bell tim.b...@cern.ch
Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: Tuesday, March 11, 2014 at 11:43 AM
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

> Typical cases are user error, where someone accidentally deletes an item from a tenant. The image guys have a good structure where images become unavailable and are recoverable for a certain period of time. A regular periodic task cleans up deleted items after a configurable number of seconds to avoid constant database growth.
>
> My preference would be to follow this model universally (an archive table is a nice way to do it without disturbing production).
>
> Tim
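As a rough illustration of the export-then-hard-delete idea argued for above, the sketch below serializes a row's metadata to JSON before deleting it, and can re-insert it later on request. Everything here is hypothetical: the images table, the record layout, and the use of a plain Python list as a stand-in for the 'backup tape' (in practice this would be a file, object store, or similar external system).

```python
import json
import sqlite3


def make_db():
    """Create a hypothetical images table with no soft-delete columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT NOT NULL, owner TEXT NOT NULL)"
    )
    return conn


def delete_with_export(conn, archive, image_id):
    """Write the row's metadata to the archive, then hard-delete the row."""
    row = conn.execute(
        "SELECT id, name, owner FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return False
    record = {"type": "image", "id": row[0], "name": row[1], "owner": row[2]}
    archive.append(json.dumps(record))  # export first, so a failed delete loses nothing
    with conn:
        conn.execute("DELETE FROM images WHERE id = ?", (image_id,))
    return True


def restore_from_export(conn, archive, image_id):
    """Re-insert a previously exported row, if the archive still holds it."""
    for line in archive:
        record = json.loads(line)
        if record["type"] == "image" and record["id"] == image_id:
            with conn:
                conn.execute(
                    "INSERT INTO images (id, name, owner) VALUES (?, ?, ?)",
                    (record["id"], record["name"], record["owner"]),
                )
            return True
    return False
```

With this shape, the live table needs no deleted column and no extra filter on reads; the cost moves to keeping the external archive available and consistent.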
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Wed, 2014-03-12 at 01:47 +, Joshua Harlow wrote:

> The question that I don't understand is why this process has to involve the database to begin with.
>
> If you want to archive images per se, on deletion just export the image to a 'backup tape' (for example) and store enough of the metadata on that 'tape' to re-insert it if this is really desired, and then delete it from the database (or do the export asynchronously). The same could be said of VMs, although likely not all resources (networks and so on) make sense to handle this way.
>
> So instead of setting deleted = 1 and waiting for a cleaner, just save the resource (if possible) plus enough metadata on some other system ('backup tape', alternate storage location, hdfs, ceph...) and leave it there unless it's really needed. Making the database (and all associated code) more complex to achieve this same goal seems like a hack that just needs to be addressed with a better way to do archiving.
>
> In a cloudy world, of course, people would be able to recreate everything they need on demand, so who needs undelete anyway ;-)

Good points. Another way to ask the question: does Amazon provide an undelete functionality?

Best,
-jay
[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Hi stackers,

(This is a proposal for Juno.)

Intro:

Soft deletion means that records are not actually deleted from the DB; they are just marked as deleted. To mark a record as deleted, we put the record's ID value into a special deleted column of the table.

Issue 1: Indexes and queries

We have to add AND deleted == 0 to every query to get non-deleted records. This causes a performance issue, because we have to add one extra column to every index. It also adds complexity to DB migrations and to building queries.

Issue 2: Unique constraints

Why do we store the ID in deleted and not True/False? The reason is that we would like to be able to create real DB unique constraints and avoid race conditions on insert.

Sample: we have a table (id, name, password, deleted) and we would like the name column to hold only unique values.

Approach without UC:

    if count(`select where name = name`) == 0:
        insert(...)

(This is racy, because another record can be added between the check and the insert.)

Approach with UC:

    try:
        insert(...)
    except Duplicate:
        ...

So to add a UC, we have to put it on (name, deleted), to be able to do insert/delete/insert with the same name. This also causes performance issues, because we have to use complex unique constraints on 2 or more columns, plus extra code complexity in DB migrations.

Issue 3: Garbage collector

It is really hard to make a garbage collector that performs well and is generic enough to work in every case for every project. Without a garbage collector, DevOps have to clean up records by hand (with the risk of breaking something). If they don't clean up the DB, they will hit performance issues very soon.

To put the most important issues in a nutshell:

1) Extra complexity in each select query and an extra column in each index
2) Extra column in each unique constraint (worse performance)
3) 2 extra columns in each table: (deleted, deleted_at)
4) A common garbage collector is required

To resolve all these issues we should just remove soft deletion.

One approach that I see is removing the deleted column from every table step by step, probably with some code refactoring. Actually we have 3 different cases:

1) We don't use soft deleted records:
1.1) Do .delete() instead of .soft_delete()
1.2) Change queries to avoid adding the extra deleted == 0 to each query
1.3) Drop the deleted and deleted_at columns

2) We use soft deleted records for internal stuff, e.g. periodic tasks:
2.1) Refactor the code somehow, e.g. store all data required by the periodic task in some special table that has (id, type, json_data) columns
2.2) On delete, add a record to this table
2.3-5) Similar to 1.1, 1.2, 1.3

3) We use soft deleted records in the API:
3.1) Deprecate the API call if possible
3.2) Make a proxy call to ceilometer from the API
3.3) On .delete(), store info about the records in ceilometer (or somewhere else)
3.4-6) Similar to 1.1, 1.2, 1.3

This is not a ready roadmap, just basic thoughts to start a constructive discussion on the mailing list, so %stacker%, your opinion is very important!

Best regards,
Boris Pavlovic
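To make Issue 2 of the proposal concrete, here is a minimal, runnable sketch of the (name, deleted) unique constraint, where deleted holds 0 for live rows and the row's own id once it is soft-deleted. It uses Python's built-in sqlite3 module rather than the projects' actual DB layer, and the users table and the helper names (create_user, soft_delete_user, live_user_names) are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        password TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0,  -- 0 = live, otherwise the row's own id
        UNIQUE (name, deleted)               -- the two-column UC the proposal describes
    )
    """
)


def create_user(name, password):
    """Insert and let the constraint decide; there is no check-then-insert race."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO users (name, password) VALUES (?, ?)",
                (name, password),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None  # a live row with this name already exists


def soft_delete_user(user_id):
    """Mark the row as deleted by copying its id into the deleted column."""
    with conn:
        conn.execute(
            "UPDATE users SET deleted = id WHERE id = ? AND deleted = 0",
            (user_id,),
        )


def live_user_names():
    """Every read has to carry the extra deleted = 0 filter (Issue 1)."""
    rows = conn.execute("SELECT name FROM users WHERE deleted = 0 ORDER BY name")
    return [name for (name,) in rows]
```

Because each soft-deleted row gets a distinct deleted value, insert/delete/insert with the same name keeps working, at the price of the wider constraint and of the filter on every query. A boolean deleted flag would allow only one deleted row per name.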
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Sounds like a good idea to me.

I've never understood why we treat the DB as a LOG (keeping deleted == 1 records around) when we should just use a LOG (or similar system) to begin with instead.

Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has this worked out for you?

Seems like some of the feedback on https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests that this has been an operational pain-point for folks ("Tool to delete things properly" suggestions and such...).
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Tue, 2014-03-11 at 01:29 +0400, Boris Pavlovic wrote:

<snip>

> To put the most important issues in a nutshell:
>
> 1) Extra complexity in each select query and an extra column in each index
> 2) Extra column in each unique constraint (worse performance)
> 3) 2 extra columns in each table: (deleted, deleted_at)
> 4) A common garbage collector is required

Nice summary of the problems related to soft deletion, Boris.

> To resolve all these issues we should just remove soft deletion.
>
> One approach that I see is removing the deleted column from every table step by step, probably with some code refactoring. Actually we have 3 different cases:
>
> 1) We don't use soft deleted records:
> 1.1) Do .delete() instead of .soft_delete()
> 1.2) Change queries to avoid adding the extra deleted == 0 to each query
> 1.3) Drop the deleted and deleted_at columns
>
> 2) We use soft deleted records for internal stuff, e.g. periodic tasks:
> 2.1) Refactor the code somehow, e.g. store all data required by the periodic task in some special table that has (id, type, json_data) columns
> 2.2) On delete, add a record to this table
> 2.3-5) Similar to 1.1, 1.2, 1.3
>
> 3) We use soft deleted records in the API:
> 3.1) Deprecate the API call if possible
> 3.2) Make a proxy call to ceilometer from the API
> 3.3) On .delete(), store info about the records in ceilometer (or somewhere else)
> 3.4-6) Similar to 1.1, 1.2, 1.3

I would actually prefer this solution, at least for server instances:

1. Remove any contractual obligation in the API to allow servers with the same name to exist, as long as only one of those servers is not deleted. As I've mentioned before, I think it is exceedingly silly to slow down the operation of Nova just to allow a user to create a server, delete it, and immediately create a server with the same name.

2. Make the unique constraint for the server name be on (project_id, name) and be done with it.

3. Remove deleted and deleted_at from the instances table.

4. Don't allow any delete() operation at all on the nova.objects.instance object.

5. Hard delete records from the instances table on a periodic basis using an external archiver that either just deletes the records in instances that are in ERROR or TERMINATED vm_state (as is possible if ceilometer is providing your bookkeeping) or moves those records into an archival table (as would be necessary if you are not running Ceilometer and need some history of these things).

For other objects in the system, I think your solution #1 would work fine.

Best,
-jay
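The five steps above can be sketched as a small periodic job. This is only an illustration under stated assumptions, not Nova code: the table layouts, the instances_archive table name, and the archive_finished_instances function are hypothetical, and sqlite3 stands in for the real database.

```python
import sqlite3

# States after which a row is eligible for removal, per the message above.
ARCHIVABLE_STATES = ("ERROR", "TERMINATED")


def make_db():
    """Hypothetical schema: no deleted/deleted_at columns, UC on (project_id, name)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE instances (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- ids are never reused
            project_id TEXT NOT NULL,
            name TEXT NOT NULL,
            vm_state TEXT NOT NULL,
            UNIQUE (project_id, name)
        );
        CREATE TABLE instances_archive (
            id INTEGER PRIMARY KEY,
            project_id TEXT NOT NULL,
            name TEXT NOT NULL,
            vm_state TEXT NOT NULL
        );
        """
    )
    return conn


def archive_finished_instances(conn, keep_history=True):
    """Hard-delete finished instances, optionally copying them to the archive first."""
    placeholders = ", ".join("?" for _ in ARCHIVABLE_STATES)
    with conn:  # copy and delete happen in one transaction
        if keep_history:
            conn.execute(
                "INSERT INTO instances_archive (id, project_id, name, vm_state) "
                "SELECT id, project_id, name, vm_state FROM instances "
                f"WHERE vm_state IN ({placeholders})",
                ARCHIVABLE_STATES,
            )
        cur = conn.execute(
            f"DELETE FROM instances WHERE vm_state IN ({placeholders})",
            ARCHIVABLE_STATES,
        )
    return cur.rowcount
```

Passing keep_history=False corresponds to the variant where Ceilometer (or another system) already keeps the history, so the archiver only deletes. Note that until the job runs, a terminated instance still occupies its (project_id, name) slot, which is the trade-off accepted in step 1.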
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Hi all, I've never understood why we treat the DB as a LOG (keeping deleted == 0 records around) when we should just use a LOG (or similar system) to begin with instead. I can't agree more with you! Storing deleted records in tables is hardly usable, bad for performance (as it makes tables and indexes larger) and it probably covers a very limited set of use cases (if any) of OpenStack users. One of approaches that I see is in step by step removing deleted column from every table with probably code refactoring. So we have a homework to do: find out what for projects use soft-deletes. I assume that soft-deletes are only used internally and aren't exposed to API users, but let's check that. At the same time all new projects should avoid using of soft-deletes from the start. Thanks, Roman On Mon, Mar 10, 2014 at 2:44 PM, Joshua Harlow harlo...@yahoo-inc.com wrote: Sounds like a good idea to me. I've never understood why we treat the DB as a LOG (keeping deleted == 0 records around) when we should just use a LOG (or similar system) to begin with instead. Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has this worked out for u? Seems like some of the feedback on https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests that this has been a operational pain-point for folks (Tool to delete things properly suggestions and such…). From: Boris Pavlovic bpavlo...@mirantis.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Date: Monday, March 10, 2014 at 1:29 PM To: OpenStack Development Mailing List openstack-dev@lists.openstack.org, Victor Sergeyev vserge...@mirantis.com Subject: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step) Hi stackers, (It's proposal for Juno.) Intro: Soft deletion means that records from DB are not actually deleted, they are just marked as a deleted. 
To mark record as a deleted we put in special table's column deleted record's ID value. Issue 1: Indexes Queries We have to add in every query AND deleted == 0 to get non-deleted records. It produce performance issue, cause we should add it in any index one extra column. As well it produce extra complexity in db migrations and building queries. Issue 2: Unique constraints Why we store ID in deleted and not True/False? The reason is that we would like to be able to create real DB unique constraints and avoid race conditions on insert operation. Sample: we Have table (id, name, password, deleted) we would like to put in column name only unique value. Approach without UC: if count(`select where name = name`) == 0: insert(...) (race cause we are able to add new record between ) Approach with UC: try: insert(...) except Duplicate: ... So to add UC we have to add them on (name, deleted). (to be able to make insert/delete/insert with same name) As well it produce performance issues, because we have to use Complex unique constraints on 2 or more columns. + extra code complexity in db migrations. Issue 3: Garbage collector It is really hard to make garbage collector that will have good performance and be enough common to work in any case for any project. Without garbage collector DevOps have to cleanup records by hand, (risk to break something). If they don't cleanup DB they will get very soon performance issue. To put in a nutshell most important issues: 1) Extra complexity to each select query extra column in each index 2) Extra column in each Unique Constraint (worse performance) 3) 2 Extra column in each table: (deleted, deleted_at) 4) Common garbage collector is required To resolve all these issues we should just remove soft deletion. One of approaches that I see is in step by step removing deleted column from every table with probably code refactoring. 
Actually we have 3 different cases:

1) We don't use soft-deleted records:
1.1) Do .delete() instead of .soft_delete()
1.2) Change queries to stop adding the extra deleted == 0 condition
1.3) Drop the deleted and deleted_at columns

2) We use soft-deleted records for internal stuff, e.g. periodic tasks:
2.1) Refactor the code somehow, e.g. store all data required by the periodic task in a special table with (id, type, json_data) columns
2.2) On delete, add a record to this table
2.3-5) Similar to 1.1, 1.2, 1.3

3) We use soft-deleted records in the API:
3.1) Deprecate the API call if possible
3.2) Make a proxy call to ceilometer from the API
3.3) On .delete(), store info about the records in ceilometer (or somewhere else)
3.4-6) Similar to 1.1, 1.2, 1.3

This is not a finished roadmap, just some basic thoughts to start a constructive discussion on the mailing list, so %stacker%, your opinion is very important!

Best regards,
Boris Pavlovic
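Case 2 (steps 2.1 and 2.2) could look roughly like the sketch below, again using Python's standard sqlite3 module rather than real project code. The instances and deleted_records table names and the delete_instance helper are assumptions made for illustration; only the (id, type, json_data) column layout comes from the proposal.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us turn a row into a dict by column name
conn.executescript(
    """
    CREATE TABLE instances (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE  -- plain single-column UC, no deleted column
    );
    CREATE TABLE deleted_records (  -- hypothetical name for the special table
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        json_data TEXT NOT NULL
    );
    """
)


def delete_instance(instance_id):
    """Archive the row as JSON, then really delete it, in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT * FROM instances WHERE id = ?", (instance_id,)
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO deleted_records (type, json_data) VALUES (?, ?)",
            ("instance", json.dumps(dict(row))),
        )
        conn.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
    return True


with conn:
    conn.execute("INSERT INTO instances (name) VALUES ('vm-1')")

deleted = delete_instance(1)
remaining = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
archived = [
    json.loads(row["json_data"])
    for row in conn.execute(
        "SELECT json_data FROM deleted_records WHERE type = 'instance'"
    )
]
```

A periodic task would then read deleted_records instead of filtering the main table on deleted != 0, so normal queries and indexes on instances no longer need the extra column. Whether one shared archive table is enough for every project is exactly the kind of question the cases above leave open.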
Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
I see many examples in nova of where we still read rows with read_deleted=yes. I think we need to see a plan for how to remove all of those before we can progress this.

Michael

On Tue, Mar 11, 2014 at 9:06 AM, Johannes Erdfelt johan...@erdfelt.com wrote:

On Mon, Mar 10, 2014, Joshua Harlow harlo...@yahoo-inc.com wrote:

Sounds like a good idea to me.

I generally think this is a good idea too.

I've never understood why we treat the DB as a LOG (keeping deleted == 0 records around) when we should just use a LOG (or similar system) to begin with instead. Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has this worked out for you?

This isn't the only potential use. It's possible that code depends on being able to still access deleted records. For instance, in the past we could delete an instance_type, but if an instance was still referencing it, code would sometimes still try to fetch it from the database.

This particular example probably isn't an issue anymore, since I think all of that has been moved to instance metadata specifically to avoid problems like this.

That said, I think it's well worth the effort to simplify the code and make operators' lives easier.

JE

--
Rackspace Australia