#36199: Redundant migrations generated with Coalesce Index in GinIndex with
SearchVector
-------------------------------------+-------------------------------------
     Reporter:  Ian Bicket           |                    Owner:  (none)
         Type:  Bug                  |                   Status:  new
    Component:  Database layer       |                  Version:  5.1
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:  migrations, search,  |             Triage Stage:
  coalesce, index                    |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Ian Bicket:

Old description:

> Title: Redundant Migration Generation for Coalesce Index with TextField
> Output in Django 5.1 (PostgreSQL)
>
> '''Django Version:''' 5.1
>
> '''Database Backend:''' PostgreSQL (version 15)
>
> '''Python Version:''' 3.12
>
> '''Operating System:''' Observed on MacOS Sequoia, but likely to
> generalize
>
> '''Description:'''
> When running makemigrations on a Django 5.1 project using a PostgreSQL
> backend, the migration autodetector redundantly generates a migration
> that removes and then recreates a Coalesce index where the output_field
> is set to TextField(). This occurs because the autodetector incorrectly
> registers the TextField in the existing and new application state as
> distinct, leading to continuous migration creation even when no schema
> changes have occurred.
>
> '''Steps to Reproduce:'''
>
> Create a Django model with an index using Coalesce and explicitly set
> output_field=TextField(), such as in the example below:
> {{{
> #!python
>
> from django.contrib.postgres.indexes import GinIndex
> from django.contrib.postgres.search import SearchVector
> from django.db.models import Value
> from django.db.models.functions import Coalesce
>
> # ... model definition, etc.
>
> class SummaryItem(models.Model):
>     class Meta:
>         indexes = [
>                 GinIndex(
>                     SearchVector(
>                         Coalesce(
>                             'item_text_by_user',
>                             'item_text',
>                             Value(''),
>                             output_field=TextField()
>                     ),
>                     'source',
>                     config='english',
>                 ),
>                 name='summary_item_search_index',
>             )
>         ]
>
>     created_at = models.DateTimeField(auto_now_add=True)
>     updated_at = models.DateTimeField(auto_now=True)
>     deleted_at = models.DateTimeField(null=True)
>     item_type = models.CharField(max_length=16, choices=ItemType)
>     item_text = models.TextField()
>     item_text_by_user = models.TextField(null=True)
> }}}
>
> Run makemigrations and apply the migration.
>
> Run makemigrations again without making any changes.
>
> Observe that a new migration is generated that removes and recreates the
> same Coalesce index.
>
> '''Expected Behavior:'''
> Once the initial migration has been applied, subsequent runs of
> makemigrations should not generate redundant migrations unless there is
> an actual schema change.
>
> '''Actual Behavior:'''
> Each execution of makemigrations generates an unnecessary migration
> removing and recreating the Coalesce index.
>
> '''Potential Cause:'''
> The migration autodetector may be treating the TextField instances in the
> existing and new application state as distinct, even though they are
> equivalent. This causes Django to incorrectly assume a change has
> occurred and generate a new migration.  It was observed while breaking
> down the deconstructed indices from the autodetector's from_state and
> to_state (see here:
> https://github.com/django/django/blob/0bf412111be686b6b23e00863f5d449d63557dbf/django/db/migrations/autodetector.py#L1347)
> that the deconstructed new and old indexes were identified as different
> because the `creation_counter` attributes of the TextField()  passed to
> `output_field` in the indexes (which in turn occurred in the ' in the two
> respective states were different, which is part of how the `__eq__`
> operator is defined for objects inheriting from `Field`, and therefore
> the index in the old and new states were identified as separate, even
> though there was no change in the definition of the index.
>
> '''Workaround:'''
> As a temporary workaround, removing output_field=TextField() and the
> Value('') (which defaults to a CharField rendering the output_field
> necessary) parameter from Coalesce will prevent redundant migrations if
> Django can infer the correct type automatically. However, this might not
> be suitable in all cases.

New description:

 Title: Redundant Migration Generation for Coalesce Index with TextField
 Output in Django 5.1 (PostgreSQL)

 '''Django Version:''' 5.1

 '''Database Backend:''' PostgreSQL (version 15)

 '''Python Version:''' 3.12

 '''Operating System:''' Observed on MacOS Sequoia, but likely to
 generalize

 '''Description:'''
 When running makemigrations on a Django 5.1 project using a PostgreSQL
 backend, the migration autodetector redundantly generates a migration that
 removes and then recreates a GinIndex with a SearchVector over a Coalesced
 field where the output_field parameter is set to TextField(). Furthermore,
 makemigrations can be run an arbitrary number of times, each execution re-
 creating what is essentially the same migration, removing and recreating
 the index.

 '''Steps to Reproduce:'''

 Create a Django model with an index using Coalesce and explicitly set
 output_field=TextField(), such as in the example below:
 {{{
 #!python

 from django.contrib.postgres.indexes import GinIndex
 from django.contrib.postgres.search import SearchVector
 from django.db.models import Value
 from django.db.models.functions import Coalesce

 # ... model definition, etc.

 class SummaryItem(models.Model):
     class Meta:
         indexes = [
                 GinIndex(
                     SearchVector(
                         Coalesce(
                             'item_text_by_user',
                             'item_text',
                             Value(''),
                             output_field=TextField()
                     ),
                     'source',
                     config='english',
                 ),
                 name='summary_item_search_index',
             )
         ]

     created_at = models.DateTimeField(auto_now_add=True)
     updated_at = models.DateTimeField(auto_now=True)
     deleted_at = models.DateTimeField(null=True)
     item_type = models.CharField(max_length=16, choices=ItemType)
     item_text = models.TextField()
     item_text_by_user = models.TextField(null=True)
 }}}

 Run makemigrations and apply the migration.

 Run makemigrations again without making any changes.

 Observe that a new migration is generated that removes and recreates the
 same Coalesce index.

 '''Expected Behavior:'''
 Once the initial migration has been applied, subsequent runs of
 makemigrations should not generate redundant migrations unless there is an
 actual schema change.

 '''Actual Behavior:'''
 Each execution of makemigrations generates an unnecessary migration
 removing and recreating the Coalesce index.

 '''Potential Cause:'''
 The migration autodetector may be treating the TextField instances in the
 existing and new application state as distinct, even though they are
 equivalent. This causes Django to incorrectly assume a change has occurred
 and generate a new migration.  It was observed while breaking down the
 deconstructed indices from the autodetector's from_state and to_state (see
 here:
 
https://github.com/django/django/blob/0bf412111be686b6b23e00863f5d449d63557dbf/django/db/migrations/autodetector.py#L1347)
 that the deconstructed new and old indexes were identified as different
 because the `creation_counter` attributes of the TextField()  passed to
 `output_field` in the indexes (which in turn occurred in the ' in the two
 respective states were different, which is part of how the `__eq__`
 operator is defined for objects inheriting from `Field`, and therefore the
 index in the old and new states were identified as separate, even though
 there was no change in the definition of the index.

 '''Workaround:'''
 As a temporary workaround, removing output_field=TextField() and the
 Value('') (which defaults to a CharField rendering the output_field
 necessary) parameter from Coalesce will prevent redundant migrations if
 Django can infer the correct type automatically. However, this might not
 be suitable in all cases.

--
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36199#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/django-updates/010701951b3689c6-f1aa2814-6f95-4b27-893a-064692d87d81-000000%40eu-central-1.amazonses.com.

Reply via email to