#29648: Incorrect escaping when using subqueries inside datetime truncation
functions
-------------------------------------+-------------------------------------
     Reporter:  Raphael Michel       |                    Owner:  Raphael
                                     |  Michel
         Type:  Bug                  |                   Status:  assigned
    Component:  Database layer       |                  Version:  2.1
  (models, ORM)                      |
     Severity:  Release blocker      |               Resolution:
     Keywords:  Subqueries Datetime  |             Triage Stage:
                                     |  Unreviewed
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Raphael Michel:

Old description:

> If you build a subquery that (a) has a datetime-typed result and (b)
> contains parameterized values, like this:
>
> {{{
> inner = Fan.objects.filter(
>     author=OuterRef('pk'),
>     name__in=('Emma', 'Isabella', 'Tom')
> ).values('author').annotate(newest_fan=Max('fan_since')).values('newest_fan')
> }}}
>
> And then use one of the Trunc* methods on the result, like this:
>
> {{{
> outer = Author.objects.annotate(
>     newest_fan_year=TruncYear(Subquery(inner,
> output_field=DateTimeField()))
> )
> }}}
>
> Then Django will throw an `OperationalError` or `TypeError`, because it
> internally builds an incorrect query:
>
> {{{
> SELECT "db_functions_author"."name", django_datetime_trunc('year',
> (SELECT MAX(U0."fan_since") AS "newest_fan" FROM "db_functions_fan" U0
> WHERE (U0."author_id" = ("db_functions_author"."id") AND U0."name" IN
> (%%s, %%s, %%s)) GROUP BY U0."author_id"), 'UTC') AS "newest_fan_year"
> FROM "db_functions_author" ORDER BY "db_functions_author"."name" ASC
> }}}
>
> As you can see, this query contains `%%s` instead of `%s` in place of the
> parameters in the subquery. The reason is that `TruncBase` makes an
> incorrect assumption about the database backends:
>
> {{{
> # Escape any params because trunc_sql will format the string.
> inner_sql = inner_sql.replace('%s', '%%s')
> }}}
>
> There's two things wrong with this:
>
> * This assumption is wrong, because no database backend in core runs
> string formatting on this value. Instead, they use this value as a
> parameter in string formatting of another string, like `sql = "TRUNC(%s,
> 'Q')" % field_name`.
> * A Transform should not make assumptions about implementation details of
> the database backend in the first place. Some backends, like the MySQL
> one, use `.format()` instead of `%` to format their strings, so the
> assumption doesn't apply at all.
>
> For reference, ``Extract`` (which is very similar to ``TruncBase`` in
> many regards) doesn't have this assumption. I therefore recommend to
> remove the ``replace`` statement without substitution to resolve the
> issue.
>
> I've prepared a patch with a regression test that I'll send as a PR as
> soon as I have the ticket number of this bug.

New description:

 If you build a subquery that (a) has a datetime-typed result and (b)
 contains parameterized values, like this:

 {{{
 inner = Fan.objects.filter(
     author=OuterRef('pk'),
     name__in=('Emma', 'Isabella', 'Tom')
 ).values('author').annotate(newest_fan=Max('fan_since')).values('newest_fan')
 }}}

 And then use one of the Trunc* methods on the result, like this:

 {{{
 outer = Author.objects.annotate(
     newest_fan_year=TruncYear(Subquery(inner,
 output_field=DateTimeField()))
 )
 }}}

 Then Django will throw an `OperationalError` or `TypeError`, because it
 internally builds an incorrect query:

 {{{
 SELECT "db_functions_author"."name", django_datetime_trunc('year', (SELECT
 MAX(U0."fan_since") AS "newest_fan" FROM "db_functions_fan" U0 WHERE
 (U0."author_id" = ("db_functions_author"."id") AND U0."name" IN (%%s, %%s,
 %%s)) GROUP BY U0."author_id"), 'UTC') AS "newest_fan_year" FROM
 "db_functions_author" ORDER BY "db_functions_author"."name" ASC
 }}}

 As you can see, this query contains `%%s` instead of `%s` in place of the
 parameters in the subquery. The reason is that `TruncBase` makes an
 incorrect assumption about the database backends:

 {{{
 # Escape any params because trunc_sql will format the string.
 inner_sql = inner_sql.replace('%s', '%%s')
 }}}

 There's two things wrong with this:

 * This assumption is wrong, because no database backend in core runs
 string formatting on this value. Instead, they use this value as a
 parameter in string formatting of another string, like `sql = "TRUNC(%s,
 'Q')" % field_name`.
 * A Transform should not make assumptions about implementation details of
 the database backend in the first place. Some backends, like the MySQL
 one, use `.format()` instead of `%` to format their strings, so the
 assumption doesn't apply at all.

 For reference, `Extract` (which is very similar to `TruncBase` in many
 regards) doesn't have this assumption. I therefore recommend to remove the
 `replace` statement without substitution to resolve the issue.

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/29648#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To post to this group, send email to django-updates@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/066.2cea02dbd5952bf61d10c931abe39988%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to