#34393: A filter query returns more items than the original queryset provides after applying INNER JOIN
---------------------------------------------+---------------------------------------------
 Reporter:  Ľuboš Mjachky                    |  Owner:  nobody
 Type:  Uncategorized                        |  Status:  new
 Component:  Database layer (models, ORM)    |  Version:  3.2
 Severity:  Normal                           |  Keywords:  filter query duplicate distinct
 Triage Stage:  Unreviewed                   |  Has patch:  0
 Needs documentation:  0                     |  Needs tests:  0
 Patch needs improvement:  0                 |  Easy pickings:  0
 UI/UX:  0                                   |
---------------------------------------------+---------------------------------------------

In our project, we identified that the filter query returns more entries than the number of entries stored in the initial queryset.
The following piece of code is involved:

{{{
# qs.count() == 4
scoped_repos = repo_viewset.get_queryset().values_list("pk", flat=True)
filtered_content = qs.filter(repositories__in=scoped_repos)
# filtered_content.count() == 8
}}}

The generated query:

{{{
SELECT * FROM "rpm_package"
INNER JOIN "core_content"
    ON ("rpm_package"."content_ptr_id" = "core_content"."pulp_id")
INNER JOIN "core_repositorycontent"
    ON ("core_content"."pulp_id" = "core_repositorycontent"."content_id")
WHERE "core_repositorycontent"."repository_id" IN
    (c35b7039-2c2c-48e3-8f4f-b0eeabad8af1, ee39a78b-9dd5-4bdf-85d9-eb6406b6ef49)
}}}

One thing we noticed is that the query is constructed with an INNER JOIN clause instead of a LEFT JOIN clause. The core_repositorycontent table contains a lot of duplicates. We believe that this should not be a problem. Adding a distinct() call at the end of the queryset resolves the issue. See https://github.com/pulp/pulpcore/pull/3642.

The question is whether this is a bug in Django (i.e., a filter query can return more elements than there are in the original queryset) or a problem on our side, in which case we should restructure the query in a specific way. Any advice is welcome.

--
Ticket URL: <https://code.djangoproject.com/ticket/34393>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
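
The row multiplication described in the ticket can be reproduced outside Django. The following is a minimal sketch using Python's built-in sqlite3 module. It assumes a simplified, hypothetical schema (`content` and `repositorycontent`, with integer keys instead of UUIDs) in which every content row is linked to two repositories; the ticket does not show the actual data, so this layout is an assumption chosen to match the reported counts of 4 and 8.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the ticket's tables.
# Integer ids replace the UUIDs; the data layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (id INTEGER PRIMARY KEY);
    CREATE TABLE repositorycontent (content_id INTEGER, repository_id INTEGER);
    INSERT INTO content (id) VALUES (1), (2), (3), (4);
    -- Each content row is linked to both repository 10 and repository 20.
    INSERT INTO repositorycontent (content_id, repository_id) VALUES
        (1, 10), (1, 20), (2, 10), (2, 20),
        (3, 10), (3, 20), (4, 10), (4, 20);
""")


def count(sql):
    """Run a query that returns a single COUNT(*) value."""
    return conn.execute(sql).fetchone()[0]


# Plain INNER JOIN: one result row per matching link row, so a content
# row that belongs to two of the listed repositories appears twice.
joined_count = count("""
    SELECT COUNT(*) FROM content c
    INNER JOIN repositorycontent rc ON c.id = rc.content_id
    WHERE rc.repository_id IN (10, 20)
""")

# DISTINCT collapses the repeated content rows after the join.
distinct_count = count("""
    SELECT COUNT(*) FROM (
        SELECT DISTINCT c.id FROM content c
        INNER JOIN repositorycontent rc ON c.id = rc.content_id
        WHERE rc.repository_id IN (10, 20)
    )
""")

# EXISTS never multiplies rows: each content row is tested once.
exists_count = count("""
    SELECT COUNT(*) FROM content c
    WHERE EXISTS (
        SELECT 1 FROM repositorycontent rc
        WHERE rc.content_id = c.id AND rc.repository_id IN (10, 20)
    )
""")

print(joined_count, distinct_count, exists_count)  # 8 4 4
```

Under these assumptions the join yields 8 rows from 4 content rows, while both the DISTINCT and the EXISTS variants yield 4. In the Django ORM, a comparable EXISTS form can be built with `Exists` and `OuterRef` from `django.db.models`; whether that fits the project's models better than `distinct()` is not something the ticket establishes.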