Re: [PR] feat(core): SoftDeleteMixin and restore infrastructure [superset]

via GitHub Wed, 20 May 2026 11:36:03 -0700


mikebridge commented on PR #39977:
URL: https://github.com/apache/superset/pull/39977#issuecomment-4501477281


   @aminghadersohi thanks for the careful pass. Most items addressed in 
`06a64f02c1`; two deferred with reasoning below.
   
   ### HIGH
   
   **H1 — uncached subclass walk → cached registry via `__init_subclass__`. 
Fixed.**
   
   You're right that retrofitting under live traffic is harder than getting it 
right now. `SoftDeleteMixin` now maintains a sorted `_registered_subclasses: 
ClassVar[list[type[SoftDeleteMixin]]]` updated by `__init_subclass__` at 
class-definition time; `_all_soft_delete_subclasses()` is now a `return 
SoftDeleteMixin._registered_subclasses` — no walk, no `sorted()` allocation per 
call. Sort happens once per registration (rare event at app init), not per 
query.
   
   I chose the `__init_subclass__` registry over `functools.lru_cache` because 
the cache would freeze at first call — tests that define synthetic subclasses 
(we have ~6 of them) would never be picked up by the listener, breaking the 
test suite. The registry hook handles class-definition events naturally and 
works for both production and test contexts.
   
   **H2 — non-restorable / GDPR hook. Deferred.**
   
   The concern is valid but I'd push back on landing the interface now. Two 
reasons:
   
   1. **No concrete use case yet**: SIP-208 V1 doesn't include compliance 
flows. Designing a `deletion_policy` or `locked_at` interface without a real 
GDPR/audit/admin-purge workflow risks getting the API wrong — and once entity 
migrations land, the migration to a richer interface becomes the expensive 
change you're trying to avoid.
   2. **The natural retrofit isn't catastrophic**: if compliance becomes V2 
scope, the cheap path is a separate `LockedDeletion` table (UUID-keyed, points 
at the soft-deleted row) checked in `BaseRestoreCommand.validate()`. That's a 
one-migration add — no column on the mixin, no change to the existing entity 
migrations. The expense Amin is anticipating ("five entity rollouts already 
landed") doesn't actually accrue, because the lock lives outside the entity 
table.
   
   If GDPR work gets scheduled, I'll do the design then with the actual 
constraints in hand. Tracking as a follow-up rather than a speculative 
interface here.
   
   ### MEDIUM
   
   **M3 — `SoftDeleteMixin` inline import in `BaseDAO.delete`. Fixed.** Moved 
to top-level (no cycle); `SKIP_VISIBILITY_FILTER_CLASSES` from the same module 
was already top-level so adding `SoftDeleteMixin` to the existing import line 
was the right cleanup.
   
   **M4 — `raise_for_ownership` pylint-disable. Documented (kept inline).** 
Tried moving to top-level — it does cause a real circular: `security.manager` → 
`models.helpers` → `models.core` → lazy `superset.feature_flag_manager`. Kept 
the inline import but added a comment naming the specific cycle so a future 
maintainer doesn't try to "fix" it and re-trip the issue:
   
   ```python
   # Inline import: ``superset.models.helpers`` transitively imports
   # ``superset.models.core``, which depends on lazily-initialised
   # ``superset.feature_flag_manager``. A top-level import here would
   # create a circular dependency (security ↔ models.core ↔ superset).
   ```
   
   **M5 — `BaseDeletedStateFilter.model: Any` → typed. Fixed.**
   
   ```python
   # Subclasses bind ``model`` to a concrete ``SoftDeleteMixin``
   # subclass. Typed as ``type[SoftDeleteMixin]`` so a subclass that
   # accidentally binds to a non-soft-deletable entity fails mypy
   # rather than crashing at runtime on ``.deleted_at``.
   model: ClassVar[type[SoftDeleteMixin]]
   ```
   
   **M6 — listener placement. Documented.** Investigated and confirmed the 
placement is intentional — the `do_orm_execute` hook attaches to the `Session` 
*class* (process-wide), not a Session instance, so it has no Flask-app-context 
dependency. `setup_db()` runs earlier in `init_app` so the `Session` import is 
already initialised. Added a six-line comment in `init_app` explaining the 
placement so the rationale doesn't have to be re-derived from review history.
   
   **M7 — no tests for `BaseDeletedStateFilter` / `SoftDeleteApiMixin`. 
Fixed.** New `tests/unit_tests/views/test_soft_delete_filter.py` with 10 tests:
   
   - Filter: value-absent / include / only / case-insensitivity / unknown-value 
— including assertions that the session bypass set and the `g` augmentation 
flag are both correctly populated.
   - Mixin: no-op-without-flag / inject-with-flag / **consume-flag-on-call** 
(the read-and-clear semantics you specifically called out as subtle enough to 
warrant a dedicated test) / `_serialize_deleted_at` handling of `None` and 
`datetime`.
   
   **M8 — `deleted_at` schema mutation undocumented. Acknowledged.** This is 
intentional — `deleted_state=include` augments rows with `deleted_at`, and 
that's the point of the filter. I'll add a "ADDITIONAL INFORMATION" callout to 
each entity rollout PR body (#40128 / #40129 / #40130) noting the response 
schema additions, and call out the OpenAPI response-schema update as a 
per-entity-PR responsibility so it's explicit on the checklist. Not changing 
the infrastructure PR for this — the augmentation logic itself is correct.
   
   ### Nits
   
   - **Explicit `orig_resource is None` guard in `raise_for_ownership`. 
Fixed.** The fallthrough was correct in effect (`hasattr(None, "owners") → 
False → owners=[] → MISSING_OWNERSHIP_ERROR`) but the error reads "you don't 
own this" when the actual cause is "this row was hard-deleted between the 
caller's load and the re-query." Now raises the precise cause:
   
     ```python
     if orig_resource is None:
         raise SupersetSecurityException(...
             message=_("Resource was removed before ownership could be 
verified"),
         ...)
     ```
   
   - **Unit tests for `find_existing_for_import` / 
`clear_soft_deleted_for_import`. Fixed.** New 
`tests/unit_tests/commands/importers/v1/test_find_existing_for_import.py` (5 
tests) — pure-lookup behaviour for live + soft-deleted + missing rows, `clear` 
hard-deletes via the ORM (fires `after_delete` listeners), and the composed 
find-then-clear contract that's the documented caller sequence.
   
   - **Feature-flag kill switch (`ENABLE_SOFT_DELETE`)**: deliberately not 
adding. SIP-208 explicitly discusses this: a global flag risks confusing 
partial states (some entities filtered, others not, depending on which adopters 
have the mixin at the time the flag flips). The rollback path is "revert the 
entity rollout PR" rather than a runtime kill — the infrastructure is a no-op 
without an adopter, so a problem in production means a specific entity rollout 
PR misbehaves, and reverting that single PR is the right escape hatch.
   
   - **`Query.get()` → `session.get(Model, pk)` (SQLAlchemy 2.0 idiom)**: 
skipping. The pattern is pre-existing in `raise_for_ownership` and unrelated to 
soft-delete; doing it here would mix scopes. Worth a separate cleanup PR for 
the codebase once the SQLAlchemy 1.4 → 2.x migration is on the roadmap (the 
`Mapped[]` adoption thread already touches this area).
   
   - **`register_soft_delete_listener()` public function**: skipping. The 
inline `from superset.models.helpers import _add_soft_delete_filter` in 
`setup_soft_delete_listener` is one line in one place; the indirection doesn't 
pay for itself.
   
   ### Architecture notes
   
   - **Cascade decision per entity**: agreed, documenting as a per-entity 
checklist item. Updating the three entity rollout PR bodies (#40128 / #40129 / 
#40130) to call out the cascade-neutral choice for each (charts: dashboards 
keep their reference, render a placeholder; dashboards: child slices stay live; 
datasets: dependent charts render an error at chart-load).
   
   - **`owners` M2M constraint**: noted. Logging as a documented constraint on 
future soft-deletable additions — if `User` or `Role` were ever to become 
soft-deletable, `raise_for_ownership`'s `.owners` lookup would silently return 
empty for legitimate restores. Not a current risk (neither model is on the 
SIP-208 roadmap), but worth pinning in the mixin docstring as a future-proofing 
note. Will add in a follow-up tidy.
   
   ### Re: the praise
   
   Thanks. The dual-bypass mechanism in particular went through ~5 design 
iterations — per-request `g` flag → moved into `raise_for_ownership` → 
per-query `execution_options` → discovered FAB's outer/inner reconstruction 
strips them → session-scoped `session.info` → per-class scoping. The "design 
doc woven into the code" effect is partly because each iteration's reasoning 
*had* to land in docstrings to keep the next reviewer (and future-me) from 
re-deriving the same trade-offs.
   
   ---
   
   Infrastructure tip: `06a64f02c1`. Entity branches rebased: `sc-106189` → 
`a3ac63be0b`, `sc-106190` → `8bff0145b5`, `sc-106191` → `0a56e67842`. CI 
re-running across all four PRs.
   
   40 unit tests pass (was 26; added 14 covering filter, mixin, importer 
helpers). Let me know if anything else surfaces.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(core): SoftDeleteMixin and restore infrastructure [superset]

Reply via email to