#29648: Incorrect escaping when using subqueries inside datetime truncation functions -------------------------------------+------------------------------------- Reporter: Raphael Michel | Owner: Raphael | Michel Type: Bug | Status: assigned Component: Database layer | Version: 2.1 (models, ORM) | Severity: Release blocker | Resolution: Keywords: Subqueries Datetime | Triage Stage: | Unreviewed Has patch: 1 | Needs documentation: 0 Needs tests: 0 | Patch needs improvement: 0 Easy pickings: 0 | UI/UX: 0 -------------------------------------+------------------------------------- Description changed by Raphael Michel:
Old description: > If you build a subquery that (a) has a datetime-typed result and (b) > contains parameterized values, like this: > > {{{ > inner = Fan.objects.filter( > author=OuterRef('pk'), > name__in=('Emma', 'Isabella', 'Tom') > ).values('author').annotate(newest_fan=Max('fan_since')).values('newest_fan') > }}} > > And then use one of the Trunc* methods on the result, like this: > > {{{ > outer = Author.objects.annotate( > newest_fan_year=TruncYear(Subquery(inner, > output_field=DateTimeField())) > ) > }}} > > Then Django will throw an `OperationalError` or `TypeError`, because it > internally builds an incorrect query: > > {{{ > SELECT "db_functions_author"."name", django_datetime_trunc('year', > (SELECT MAX(U0."fan_since") AS "newest_fan" FROM "db_functions_fan" U0 > WHERE (U0."author_id" = ("db_functions_author"."id") AND U0."name" IN > (%%s, %%s, %%s)) GROUP BY U0."author_id"), 'UTC') AS "newest_fan_year" > FROM "db_functions_author" ORDER BY "db_functions_author"."name" ASC > }}} > > As you can see, this query contains `%%s` instead of `%s` in place of the > parameters in the subquery. The reason is that `TruncBase` makes an > incorrect assumption about the database backends: > > {{{ > # Escape any params because trunc_sql will format the string. > inner_sql = inner_sql.replace('%s', '%%s') > }}} > > There's two things wrong with this: > > * This assumption is wrong, because no database backend in core runs > string formatting on this value. Instead, they use this value as a > parameter in string formatting of another string, like `sql = "TRUNC(%s, > 'Q')" % field_name`. > * A Transform should not make assumptions about implementation details of > the database backend in the first place. Some backends, like the MySQL > one, use `.format()` instead of `%` to format their strings, so the > assumption doesn't apply at all. > > For reference, ``Extract`` (which is very similar to ``TruncBase`` in > many regards) doesn't have this assumption. I therefore recommend to > remove the ``replace`` statement without substitution to resolve the > issue. > > I've prepared a patch with a regression test that I'll send as a PR as > soon as I have the ticket number of this bug. New description: If you build a subquery that (a) has a datetime-typed result and (b) contains parameterized values, like this: {{{ inner = Fan.objects.filter( author=OuterRef('pk'), name__in=('Emma', 'Isabella', 'Tom') ).values('author').annotate(newest_fan=Max('fan_since')).values('newest_fan') }}} And then use one of the Trunc* methods on the result, like this: {{{ outer = Author.objects.annotate( newest_fan_year=TruncYear(Subquery(inner, output_field=DateTimeField())) ) }}} Then Django will throw an `OperationalError` or `TypeError`, because it internally builds an incorrect query: {{{ SELECT "db_functions_author"."name", django_datetime_trunc('year', (SELECT MAX(U0."fan_since") AS "newest_fan" FROM "db_functions_fan" U0 WHERE (U0."author_id" = ("db_functions_author"."id") AND U0."name" IN (%%s, %%s, %%s)) GROUP BY U0."author_id"), 'UTC') AS "newest_fan_year" FROM "db_functions_author" ORDER BY "db_functions_author"."name" ASC }}} As you can see, this query contains `%%s` instead of `%s` in place of the parameters in the subquery. The reason is that `TruncBase` makes an incorrect assumption about the database backends: {{{ # Escape any params because trunc_sql will format the string. inner_sql = inner_sql.replace('%s', '%%s') }}} There's two things wrong with this: * This assumption is wrong, because no database backend in core runs string formatting on this value. Instead, they use this value as a parameter in string formatting of another string, like `sql = "TRUNC(%s, 'Q')" % field_name`. * A Transform should not make assumptions about implementation details of the database backend in the first place. Some backends, like the MySQL one, use `.format()` instead of `%` to format their strings, so the assumption doesn't apply at all. For reference, `Extract` (which is very similar to `TruncBase` in many regards) doesn't have this assumption. I therefore recommend to remove the `replace` statement without substitution to resolve the issue. -- -- Ticket URL: <https://code.djangoproject.com/ticket/29648#comment:3> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To post to this group, send email to django-updates@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.2cea02dbd5952bf61d10c931abe39988%40djangoproject.com. For more options, visit https://groups.google.com/d/optout.