#30130: Django .values().distinct() returns a lot more records than
.values().distinct().count()
-------------------------------------+-------------------------------------
     Reporter:  James Lin            |                    Owner:  nobody
         Type:  Bug                  |                   Status:  new
    Component:  Database layer       |                  Version:  2.1
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by James Lin:

Old description:

> I have a table virtualmachineresources, which has 100k+ rows, it has
> columns 'machine` and `cluster`, some rows the cluster field is empty. it
> has repeating rows of machine + with/without cluster, hence I want to use
> the distinct() method.
>
> using .values().distinct().count(), it return 2k rows
>

> {{{
> In [6]: VirtualMachineResources.objects.all().values('machine',
> 'cluster')
>    ...: .distinct().count()
> Out[6]: 2247
> }}}
>

> When I loop through the distinct query
>
> {{{
> for resource in VirtualMachineResources.objects.all().values('machine',
> 'cluster').distinct():
>     print(resource['machine'], resource['cluster'])
> }}}
>

> I observed it return 100k rows, with repeating rows that the same
> 'machine` with/without the cluster.
>
> Here is the corresponding stackoverflow question
> https://stackoverflow.com/questions/54354462/django-distinct-returns-more-records-than-count

New description:

 I have a table virtualmachineresources with 100k+ rows. It has the columns
 `machine` and `cluster`, and in some rows the cluster field is empty. The
 table contains repeated rows for the same machine, with or without a
 cluster, so I want to use the distinct() method.

 Using .values().distinct().count() returns about 2k rows:


 {{{
 In [6]: VirtualMachineResources.objects.all().values('machine', 'cluster')
    ...: .distinct().count()
 Out[6]: 2247
 }}}


 When I loop through the distinct query

 {{{
 for resource in VirtualMachineResources.objects.all().values('machine',
 'cluster').distinct():
     print(resource['machine'], resource['cluster'])
 }}}


 I observed that it returns 100k rows, including repeated rows for the same
 `machine` with or without a cluster.

 Here is the corresponding Stack Overflow question:
 https://stackoverflow.com/questions/54354462/django-distinct-returns-more-records-than-count
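
One possible explanation, offered as an assumption rather than a confirmed diagnosis: Django's documentation for distinct() notes that fields used for ordering are added to the SELECT columns. If the model had a default ordering on some other column, the iterated query would be distinct over more columns than the two requested, while a count that does not carry that ordering would only see the two. The sketch below reproduces that effect with plain SQL in the standard-library sqlite3 module. The table name `vmr`, the column `created`, and the sample rows are all hypothetical and are not taken from the reporter's schema.

```python
import sqlite3

# Hypothetical stand-in for the reporter's table; "created" plays the role
# of a column that a default ordering might refer to (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vmr ("
    "id INTEGER PRIMARY KEY, machine TEXT, cluster TEXT, created INTEGER)"
)
sample_rows = [
    ("vm1", "c1", 1),
    ("vm1", "c1", 2),   # repeat of (vm1, c1)
    ("vm1", None, 3),   # same machine, no cluster
    ("vm2", None, 4),
    ("vm2", None, 5),   # repeat of (vm2, NULL)
]
conn.executemany(
    "INSERT INTO vmr (machine, cluster, created) VALUES (?, ?, ?)",
    sample_rows,
)

# Counting distinct (machine, cluster) pairs with no ordering column involved.
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT machine, cluster FROM vmr)"
).fetchone()[0]

# Iterating with the ordering column pulled into the SELECT list: DISTINCT
# now applies to (machine, cluster, created), so repeats are not collapsed.
iterated_rows = conn.execute(
    "SELECT DISTINCT machine, cluster, created FROM vmr ORDER BY created"
).fetchall()

print(distinct_count)       # 3
print(len(iterated_rows))   # 5
```

If this is what is happening, printing `str(queryset.query)` for the iterated queryset should show the extra column, and calling `.order_by()` with no arguments before `.distinct()` should clear the ordering for that query.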

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/30130#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
