#35941: Add composite GenericForeignKey support
-------------------------------------+-------------------------------------
Reporter: Csirmaz Bendegúz | Owner: Csirmaz
| Bendegúz
Type: New feature | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Csirmaz Bendegúz:
Old description:
> This is a follow up to #373 (''CompositePrimaryKey'').
>
> **Proposal:**
>
> My proposal is to implement ''GenericForeignKey'' support with **JSON**.
>
> 1. `object_id` is a `CharField` (or `TextField`)
> 2. ''CompositePrimaryKey'' is stored as a ''JSON array'' in `object_id`
> e.g. `object_id='[1, "abc"]'`
> 3. JOINs can be achieved with JSON functions (varies per db backend)
>
> **Joins:**
>
> After some experimentation with JSON functions, I believe the simplest
> solution is to construct JOINs like this:
>
> {{{
> JOIN ON ((object_id::jsonb)->>0)::integer = id
> AND ((object_id::jsonb)->>1)::uuid = uuid
> }}}
>
> Casting is a pain point, especially when joining on `DateTimeField`s, as
> we need to make sure the two columns are in the same format.
>
> **Risks:**
>
> What if someone is using a JSON array as the primary key (but it's not a
> composite primary key)?
> ''Before deserializing the JSON array, we need to check if the content
> type has a composite primary key or not.''
>
> What if the db backend doesn't support JSON functions?
> ''All supported databases support JSON functions.''
>
> JOIN on JSON functions is not efficient
> ''When storing a composite primary key in a `CharField` / `TextField`,
> JSON is the best option we have because it's widely supported by database
> backends. To achieve better performance, the composite pk shouldn't be
> stored in a `CharField` / `TextField` in the first place (see:
> alternatives).''
>
> **Notes:**
>
> 1. JOINs must work with Unicode characters
> 2. int, date, datetime, uuid, text fields must be supported
> 3. Django admin's `LogEntry` has its own implementation of "generic
> foreign keys". The approach we take with `GenericForeignKey` should also
> apply to `LogEntry`.
>
> **Alternatives:**
>
> It's possible to implement other strategies to deal with composite
> generic foreign keys.
>
> 1. ''JSONField'' - instead of `CharField` / `TextField`, we could make
> "object_id" a `JSONField`. This achieves the same thing but saves the
> `::jsonb` cast, so it's slightly faster.
> 2. ''GenericForeignKey with multiple `object_id`s'' - less flexible,
> can't store both regular primary keys and composite primary keys. Fastest
> JOINs.
>
> There's no reason we couldn't implement other strategies. I believe my
> proposal provides the most flexibility at the cost of performance. It
> should be implemented for backwards-compatibility, so systems using a
> `CharField` / `TextField` ''object_id'' (e.g. [https://github.com/django-
> guardian/django-
> guardian/blob/55beb9893310b243cbd6f578f9665c3e7c76bf96/guardian/models/models.py#L39
> django-guardian]) can easily integrate with composite primary keys. Other
> strategies can then be implemented to improve performance.
>
> Any feedback is appreciated!
New description:
This is a follow up to #373 (''CompositePrimaryKey'').
**Proposal:**
My proposal is to implement ''GenericForeignKey'' support with **JSON**.
1. `object_id` is a `CharField` (or `TextField`)
2. ''CompositePrimaryKey'' is stored as a ''JSON array'' in `object_id`
e.g. `object_id='[1, "abc"]'`
3. JOINs can be achieved with JSON functions (varies per db backend)
**Joins:**
After some experimentation with JSON functions, I believe the simplest
solution is to construct JOINs like this:
{{{
JOIN ON ((object_id::jsonb)->>0)::integer = id
AND ((object_id::jsonb)->>1)::uuid = uuid
}}}
Casting is a pain point, especially when joining on `DateTimeField`s, as
we need to make sure the two columns are in the same format.
**Risks:**
What if someone is using a JSON array as the primary key (but it's not a
composite primary key)?
''Before deserializing the JSON array, we need to check if the content
type has a composite primary key or not.''
What if the db backend doesn't support JSON functions?
''All supported databases support JSON functions. MariaDB has some
limitations handling surrogate pairs.''
JOIN on JSON functions is not efficient
''When storing a composite primary key in a `CharField` / `TextField`,
JSON is the best option we have because it's widely supported by database
backends. To achieve better performance, the composite pk shouldn't be
stored in a `CharField` / `TextField` in the first place (see:
alternatives).''
**Notes:**
1. JOINs must work with Unicode characters
2. int, date, datetime, uuid, text fields must be supported
3. Django admin's `LogEntry` has its own implementation of "generic
foreign keys". The approach we take with `GenericForeignKey` should also
apply to `LogEntry`.
**Alternatives:**
It's possible to implement other strategies to deal with composite generic
foreign keys.
1. ''JSONField'' - instead of `CharField` / `TextField`, we could make
"object_id" a `JSONField`. This achieves the same thing but saves the
`::jsonb` cast, so it's slightly faster.
2. ''GenericForeignKey with multiple `object_id`s'' - less flexible, can't
store both regular primary keys and composite primary keys. Fastest JOINs.
There's no reason we couldn't implement other strategies. I believe my
proposal provides the most flexibility at the cost of performance. It
should be implemented for backwards-compatibility, so systems using a
`CharField` / `TextField` ''object_id'' (e.g. [https://github.com/django-
guardian/django-
guardian/blob/55beb9893310b243cbd6f578f9665c3e7c76bf96/guardian/models/models.py#L39
django-guardian]) can easily integrate with composite primary keys. Other
strategies can then be implemented to improve performance.
Any feedback is appreciated!
--
--
Ticket URL: <https://code.djangoproject.com/ticket/35941#comment:13>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion visit
https://groups.google.com/d/msgid/django-updates/01070193b15de1c1-5fce4b8e-a1d7-4dcb-b7ba-32c5c3eb4ada-000000%40eu-central-1.amazonses.com.