#35941: Add composite GenericForeignKey support
-------------------------------------+-------------------------------------
     Reporter:  Csirmaz Bendegúz     |                    Owner:  Csirmaz
                                     |  Bendegúz
         Type:  New feature          |                   Status:  assigned
    Component:  Database layer       |                  Version:  dev
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:  Accepted
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Csirmaz Bendegúz:

Old description:

> This is a follow up to #373 (''CompositePrimaryKey'').
>
> **Proposal:**
>
> My proposal is to implement ''GenericForeignKey'' support with **JSON**.
>
> 1. `object_id` is a `CharField` (or `TextField`)
> 2. ''CompositePrimaryKey'' is stored as a ''JSON array'' in `object_id`
> e.g. `object_id='[1, "abc"]'`
> 3. JOINs can be achieved with JSON functions (varies per db backend)
>
> **Joins:**
>
> After some experimentation with JSON functions, I believe the simplest
> solution is to construct JOINs like this:
>
> {{{
> JOIN ON ((object_id::jsonb)->>0)::integer = id
>     AND ((object_id::jsonb)->>1)::uuid = uuid
> }}}
>
> Casting is a pain point, especially when joining on `DateTimeField`s, as
> we need to make sure the two columns are in the same format.
>
> **Risks:**
>
> What if someone is using a JSON array as the primary key (but it's not a
> composite primary key)?
> ''Before deserializing the JSON array, we need to check if the content
> type has a composite primary key or not.''
>
> What if the db backend doesn't support JSON functions?
> ''All supported databases support JSON functions.''
>
> JOIN on JSON functions is not efficient
> ''When storing a composite primary key in a `CharField` / `TextField`,
> JSON is the best option we have because it's widely supported by database
> backends. To achieve better performance, the composite pk shouldn't be
> stored in a `CharField` / `TextField` in the first place (see:
> alternatives).''
>
> **Notes:**
>
> 1. JOINs must work with Unicode characters
> 2. int, date, datetime, uuid, text fields must be supported
> 3. Django admin's `LogEntry` has its own implementation of "generic
> foreign keys". The approach we take with `GenericForeignKey` should also
> apply to `LogEntry`.
>
> **Alternatives:**
>
> It's possible to implement other strategies to deal with composite
> generic foreign keys.
>
> 1. ''JSONField'' - instead of `CharField` / `TextField`, we could make
> "object_id" a `JSONField`. This achieves the same thing but saves the
> `::jsonb` cast, so it's slightly faster.
> 2. ''GenericForeignKey with multiple `object_id`s'' - less flexible,
> can't store both regular primary keys and composite primary keys. Fastest
> JOINs.
>
> There's no reason we couldn't implement other strategies. I believe my
> proposal provides the most flexibility at the cost of performance. It
> should be implemented for backwards-compatibility, so systems using a
> `CharField` / `TextField` ''object_id'' (e.g. [https://github.com/django-
> guardian/django-
> guardian/blob/55beb9893310b243cbd6f578f9665c3e7c76bf96/guardian/models/models.py#L39
> django-guardian]) can easily integrate with composite primary keys. Other
> strategies can then be implemented to improve performance.
>
> Any feedback is appreciated!

New description:

 This is a follow up to #373 (''CompositePrimaryKey'').

 **Proposal:**

 My proposal is to implement ''GenericForeignKey'' support with **JSON**.

 1. `object_id` is a `CharField` (or `TextField`)
 2. ''CompositePrimaryKey'' is stored as a ''JSON array'' in `object_id`
 e.g. `object_id='[1, "abc"]'`
 3. JOINs can be achieved with JSON functions (varies per db backend)

 **Joins:**

 After some experimentation with JSON functions, I believe the simplest
 solution is to construct JOINs like this:

 {{{
 JOIN ON ((object_id::jsonb)->>0)::integer = id
     AND ((object_id::jsonb)->>1)::uuid = uuid
 }}}

 Casting is a pain point, especially when joining on `DateTimeField`s, as
 we need to make sure the two columns are in the same format.

 **Risks:**

 What if someone is using a JSON array as the primary key (but it's not a
 composite primary key)?
 ''Before deserializing the JSON array, we need to check if the content
 type has a composite primary key or not.''

 What if the db backend doesn't support JSON functions?
 ''All supported databases support JSON functions. MariaDB has some
 limitations handling surrogate pairs.''

 JOIN on JSON functions is not efficient
 ''When storing a composite primary key in a `CharField` / `TextField`,
 JSON is the best option we have because it's widely supported by database
 backends. To achieve better performance, the composite pk shouldn't be
 stored in a `CharField` / `TextField` in the first place (see:
 alternatives).''

 **Notes:**

 1. JOINs must work with Unicode characters
 2. int, date, datetime, uuid, text fields must be supported
 3. Django admin's `LogEntry` has its own implementation of "generic
 foreign keys". The approach we take with `GenericForeignKey` should also
 apply to `LogEntry`.

 **Alternatives:**

 It's possible to implement other strategies to deal with composite generic
 foreign keys.

 1. ''JSONField'' - instead of `CharField` / `TextField`, we could make
 "object_id" a `JSONField`. This achieves the same thing but saves the
 `::jsonb` cast, so it's slightly faster.
 2. ''GenericForeignKey with multiple `object_id`s'' - less flexible, can't
 store both regular primary keys and composite primary keys. Fastest JOINs.

 There's no reason we couldn't implement other strategies. I believe my
 proposal provides the most flexibility at the cost of performance. It
 should be implemented for backwards-compatibility, so systems using a
 `CharField` / `TextField` ''object_id'' (e.g. [https://github.com/django-
 guardian/django-
 
guardian/blob/55beb9893310b243cbd6f578f9665c3e7c76bf96/guardian/models/models.py#L39
 django-guardian]) can easily integrate with composite primary keys. Other
 strategies can then be implemented to improve performance.

 Any feedback is appreciated!

--
-- 
Ticket URL: <https://code.djangoproject.com/ticket/35941#comment:13>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/django-updates/01070193b15de1c1-5fce4b8e-a1d7-4dcb-b7ba-32c5c3eb4ada-000000%40eu-central-1.amazonses.com.

Reply via email to