vishesh92 commented on PR #8737: URL: https://github.com/apache/cloudstack/pull/8737#issuecomment-1978101928
> Hey, @vishesh92 > > Adding the `vm_stats` -> `timestamp` index to the table might not produce the effect that you want. When deleting very specific rows, adding an index to the table will generally make the deletion much faster. However, when a table has indexes, those must also be updated on deletes (and inserts too); therefore, if you are deleting large parts of a table, such as when deleting all the entries older then x time, your delete query will probably be slower. > > I've actually ran into a situation where a user was retaining the last 30 days of VM statistics, but decided that they wanted to start retaining only 7 days. When changing the retain time from 30 days to 7 days, ACS had to delete 23 days worth of data. Because of the number of VMs and MGMT servers on his enviroment, they had about 1 billion entries to delete. ACS had trouble to perform this operation as the delete was being done in a single query that always timed out; thus the table started to grow endlessly. I've tried to add this exact index you are proposing; but, when I tested deleting 10 million rows of `vm_stats`, the index actually made the delete take double the amount of time. > > I've found another solution to the `vm_stats` delete problem. You can check it out on #8740 @JoaoJandre I understand what you mean. But for day to day operations, the index on `timestamp` will make the operation much faster. The scenario you mentioned will only occur when the operator changes the stats retention interval which IMO doesn't happen too frequently. And with the changes you are suggesting in #8740, the query won't timeout since we are deleting only a small part of the table. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org