vishesh92 commented on PR #8737: URL: https://github.com/apache/cloudstack/pull/8737#issuecomment-1991036038
> > @JoaoJandre We can't run explain analyze with delete commands. Adding the index also helps reduce the load on database while cleaning up the entries. Otherwise the DB will have to go through each record to filter out all the entries which can cause spike in I/O as well. I understand that if the administrator reduces the retention period, it can cause issues but having the index will speed up deletion which runs every minute. > > @vishesh92 you are proposing to add an index to improve DELETE operations; however, you are presenting results for SELECT operations. Indeed, indexes can improve SELECT operations, which only list the data. However, the DELETE operation works different: it removes data, recalculates statistics, reorganizes indexes and more. Please, present some tests and results for DELETE operations; otherwise, it is not feasible to sustain the reason for adding those indexes. @JoaoJandre Here are the results for a delete operation where the filter doesn't matches any rows. ```sql MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; Query OK, 0 rows affected Time: 0.001s MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; Query OK, 0 rows affected Time: 0.001s ``` query plan ```sql +------------------------------------------------------------------------------------------------+ | EXPLAIN | |------------------------------------------------------------------------------------------------| | { | | "query_block": { | | "select_id": 1, | | "table": { | | "delete": true, | | "table_name": "vm_stats", | | "access_type": "range", | | "possible_keys": [ | | "temp_idx" | | ], | | "key": "temp_idx", | | "used_key_parts": [ | | "timestamp" | | ], | | "key_length": "5", | | "ref": [ | | "const" | | ], | | "rows_examined_per_scan": 1, | | "filtered": "100.00", | | "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" | | } | | } | | } | +------------------------------------------------------------------------------------------------+ ``` After dropping the index ```sql MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; Query OK, 0 rows affected Time: 0.364s MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; Query OK, 0 rows affected Time: 0.362s ``` query plan ```sql +------------------------------------------------------------------------------------------------+ | EXPLAIN | |------------------------------------------------------------------------------------------------| | { | | "query_block": { | | "select_id": 1, | | "table": { | | "delete": true, | | "table_name": "vm_stats", | | "access_type": "ALL", | | "rows_examined_per_scan": 660340, | | "filtered": "100.00", | | "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" | | } | | } | | } | +------------------------------------------------------------------------------------------------+ ``` As you can see, without the index it takes around 0.36 seconds because it has to go through each row in the database. It takes 0.001 seconds with the index because it doesn't need to go through all the rows in the table. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org