#30533: Delete cascade can break postgresql database: too large 'DELETE FROM a WHERE a.b_id IN (...)' -------------------------------------+------------------------------------- Reporter: Thomas | Owner: nobody Riccardi | Type: Bug | Status: new Component: Database | Version: 2.0 layer (models, ORM) | Keywords: queryset delete Severity: Normal | cascade batch bulk_batch_size Triage Stage: | postgresql crash Unreviewed | Has patch: 0 Needs documentation: 0 | Needs tests: 0 Patch needs improvement: 0 | Easy pickings: 0 UI/UX: 0 | -------------------------------------+------------------------------------- == Scenario - Model A has a foreign key to model B with on_delete=CASCADE - many B objects - execute `B.objects.delete()` Expected result: - deletion works Actual result: - when there are many B objects to delete (in our case it was 14M rows), the database process gets killed before completion: {{{ 2019-05-28 19:17:42.237 CEST [1]: [2-1] db=,user= LOG: server process (PID 12911) was terminated by signal 9: Killed 2019-05-28 19:17:42.238 CEST [1]: [3-1] db=,user= DETAIL: Failed process was running: DELETE FROM "a" WHERE "a"."b_id" IN (17271, 17272, 17273, 17274, 17275, <truncated enormous list> }}} - with a smaller database it worked (2M rows)
== Analysis It seems the related objects A get deleted in one query with an unbound `IN (...)` list of B objects. In fact this pattern already lead to an issue with the sqlite backend ([#link0]: sqlite supports only 999 parameters per query) This was fixed in django 1.8 by adding batch query [#link1], [#link2] with a size specified per backend: - sqlite: 999 [#link3] - oracle: 64k [#link4] - all others (base): not limited [#link5] == Workaround As a temporary workaround we monkey patched the `connection` instance own `bulk_batch_size` and limit to 64k. {{{#!python import types def monkey_patch_connection_bulk_batch_size(connection): def limited_bulk_batch_size(self, fields, objs): """ PostgreSQL can crash with too many parameters in a query e.g. 'DELETE FROM x WHERE x.y IN (...large list...)' => limit to 64k """ return 2**16 connection.ops.bulk_batch_size = types.MethodType(limited_bulk_batch_size, connection.ops) }}} It worked great in our case: we used it in a migration. workaround limitations: - no idea where to monkey patch for global usage - no idea how to choose the bulk size value - didn't handle the more complex computation using `fields` and `objs` parameters (Related remark: this is for deleting the related objects, then for the main objects deletion django already uses batch deletion, but with much smaller batch size: `GET_ITERATOR_CHUNK_SIZE = 100`; with some latency to the database it's a really small value (0.2ms per query)) == Context - postgresql 9.6 (GCP Cloud SQL) - django 2.0.13 (but related code has not changed in current master it seems) --- link 0 [=#link0] https://code.djangoproject.com/ticket/16426 link 1 [=#link1] https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/models/deletion.py#L224-L225 link 2 [=#link2] https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/models/deletion.py#L167 link 3 [=#link3] https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/sqlite3/operations.py#L27-L40 link 4 [=#link4] https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/oracle/operations.py#L592-L596 link 5 [=#link5] https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/base/operations.py#L65-L71 -- Ticket URL: <https://code.djangoproject.com/ticket/30533> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To post to this group, send email to django-updates@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/058.ad44325afe87c8e8d1fc6425228fa695%40djangoproject.com. For more options, visit https://groups.google.com/d/optout.