#30130: Django .values().distinct() returns a lot more records than
.values().distinct().count()
-------------------------------------+-------------------------------------
Reporter: James Lin | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 2.1
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by James Lin:
Old description:
> I have a table virtualmachineresources, which has 100k+ rows, it has
> columns `machine` and `cluster`, some rows the cluster field is empty. it
> has repeating rows of machine + with/without cluster, hence I want to use
> the distinct() method.
>
> using .values().distinct().count(), it return 2k rows
>
> {{{
> In [6]: VirtualMachineResources.objects.all().values('machine',
> 'cluster')
> ...: .distinct().count()
> Out[6]: 2247
> }}}
>
> When I loop through the distinct query
>
> {{{
> for resource in VirtualMachineResources.objects.all().values('machine',
> 'cluster').distinct():
> print(resource['machine'], resource['cluster'])
> }}}
>
> I observed it return 100k rows, with repeating rows that the same
> 'machine` with/without the cluster.
>
> Here is the corresponding stackoverflow question
> https://stackoverflow.com/questions/54354462/django-distinct-returns-more-records-than-count
New description:
I have a table virtualmachineresources with 100k+ rows. It has columns
`machine` and `cluster`, and in some rows the cluster field is empty. The
same machine appears in many rows, both with and without a cluster, so I
want to use the distinct() method.
Calling .values().distinct().count() reports about 2k rows:
{{{
In [6]: VirtualMachineResources.objects.all().values('machine', 'cluster')
...: .distinct().count()
Out[6]: 2247
}}}
However, when I loop over the same distinct query:
{{{
for resource in VirtualMachineResources.objects.all().values(
        'machine', 'cluster').distinct():
    print(resource['machine'], resource['cluster'])
}}}
I observed that it returned 100k rows, including repeated rows for the
same `machine` with and without the cluster.
Here is the corresponding Stack Overflow question:
https://stackoverflow.com/questions/54354462/django-distinct-returns-more-records-than-count
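One possible explanation, not confirmed anywhere in this ticket: if the model declares a default `Meta.ordering` on a column that is not listed in values() (assumed here to be `id`), Django adds that ordering column to the SELECT DISTINCT list when the queryset is iterated, whereas count() discards the ordering. The sketch below reproduces that difference in plain SQL using the standard-library sqlite3 module. The table name `vmr` and the sample rows are hypothetical, and the SQL is hand-written to illustrate the effect rather than copied from what Django actually emits.

```python
import sqlite3

# Hypothetical stand-in for the virtualmachineresources table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vmr (id INTEGER PRIMARY KEY, machine TEXT, cluster TEXT)"
)
rows = [
    ("vm1", "c1"), ("vm1", "c1"), ("vm1", ""),
    ("vm2", "c2"), ("vm2", "c2"), ("vm2", ""),
]
conn.executemany("INSERT INTO vmr (machine, cluster) VALUES (?, ?)", rows)

# DISTINCT over only the two requested columns.
plain = conn.execute(
    "SELECT DISTINCT machine, cluster FROM vmr"
).fetchall()

# DISTINCT with an ordering column (assumed: id) pulled into the select
# list. Every id is unique, so no row is ever collapsed.
with_order = conn.execute(
    "SELECT DISTINCT machine, cluster, id FROM vmr ORDER BY id"
).fetchall()

# A count over the distinct pairs, with the ordering dropped.
count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT machine, cluster FROM vmr)"
).fetchone()[0]

print(len(plain), len(with_order), count)  # prints: 4 6 4
```

If this is indeed the cause, clearing the default ordering with an empty order_by() call, as in `VirtualMachineResources.objects.values('machine', 'cluster').order_by().distinct()`, should make iteration and count() agree.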
--
Ticket URL: <https://code.djangoproject.com/ticket/30130#comment:2>