Re: Admin list view counting

2017-08-09 Thread Adam Johnson
FYI Django-MySQL supplies an approx_count() method:
https://django-mysql.readthedocs.io/en/latest/queryset_extensions.html#django_mysql.models.approx_count
. It only works on QuerySets without filters. From what I understand it's
possible to get an estimate for filtered queries in MySQL in *some*
circumstances, but you need to be filtering on only one index and not
involve any joins.

I'm in favour of adding this (by whatever name) to core.

On 8 August 2017 at 23:12, Shai Berger  wrote:

> On Wednesday 09 August 2017 00:40:30 Josh Smeaton wrote:
> > We use the explain analyze method at work, but I don't think it's an
> > appropriate thing to include in core.
> >
>
> Agreed.
>
> > I'm not so sure about providing count estimates in core, but I don't
> fully
> > grok how the `estimate_above` would work in practise. Would that execute
> > two queries if the estimate returned a value below the threshold?
>
> That's what I had in mind. But two queries do not have to mean two database
> roundtrips: Since the estimation queries are backend-specific anyway, we
> might
> as well implement `table_count(estimate_above)` using blocks of procedural
> SQL.
>
> I am, again, not suggesting a change to count() at the ORM level -- this
> should be a new API; I *am* suggesting that the admin would use this new
> API
> where appropriate (when the admin queryset has no filters or joins which
> could
> affect the count).
>
> > I think
> > this could be tackled in a separate patch because it would then feed into
> > the expansion of the paginator.
> >
>
> Yes, the idea of controlling whether any count is done, while not entirely
> orthogonal, is certainly of value regardless of this idea, and can (and
> should) be dealt with separately.
>
> Shai.
>



-- 
Adam

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMyDDM3nPDnYfa6yH1uC4CrH_BGEtGGce-vQXqxb90m8%2BS67ug%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Admin list view counting

2017-08-08 Thread Shai Berger
On Wednesday 09 August 2017 00:40:30 Josh Smeaton wrote:
> We use the explain analyze method at work, but I don't think it's an
> appropriate thing to include in core.
> 

Agreed.

> I'm not so sure about providing count estimates in core, but I don't fully
> grok how the `estimate_above` would work in practise. Would that execute
> two queries if the estimate returned a value below the threshold?

That's what I had in mind. But two queries do not have to mean two database 
roundtrips: Since the estimation queries are backend-specific anyway, we might 
as well implement `table_count(estimate_above)` using blocks of procedural  
SQL.

I am, again, not suggesting a change to count() at the ORM level -- this 
should be a new API; I *am* suggesting that the admin would use this new API 
where appropriate (when the admin queryset has no filters or joins which could 
affect the count).

> I think
> this could be tackled in a separate patch because it would then feed into
> the expansion of the paginator.
> 

Yes, the idea of controlling whether any count is done, while not entirely 
orthogonal, is certainly of value regardless of this idea, and can (and 
should) be dealt with separately.

Shai.


Re: Admin list view counting

2017-08-08 Thread Josh Smeaton
We use the explain analyze method at work, but I don't think it's an 
appropriate thing to include in core.

I agree that these changes should be a function of the paginator and that 
being able to pass in a could also be very useful. 

I'm not so sure about providing count estimates in core, but I don't fully 
grok how the `estimate_above` would work in practise. Would that execute 
two queries if the estimate returned a value below the threshold? I think 
this could be tackled in a separate patch because it would then feed into 
the expansion of the paginator.

On Tuesday, 8 August 2017 07:46:19 UTC+10, Tom Forbes wrote:
>
> Mysql, Oracle and Postgres expose estimated rows in a pretty similar 
> manner (selecting the table name from a database-specific table). Perhaps a 
> more generic 'estimate_table_rows' function could be added for all 
> backends? I guess for sqlite it would have to use an actual count() as a 
> fallback. This could cause issues if people develop on sqlite and find the 
> function returns accurate row counts and this changes in production.
>
> The problem is these methods only return estimates for the entire table, 
> not including any filters, which narrows down the usefulness a lot. The 
> postgres wiki suggests 'scraping' the output of `EXPLAIN ANALYZE` which 
> sounds rather too hacky to add?
>
> On Mon, Aug 7, 2017 at 9:53 PM, Shai Berger  > wrote:
>
>> On PG we can and should do much better -- there is a way to get a very 
>> fast,
>> though not accurate, count of records in a table:
>>
>> https://wiki.postgresql.org/wiki/Count_estimate
>>
>> I think we should expose an API along the lines of
>>
>> queryset.table_count(estimate_above=2000)
>>
>> which actually performs the COUNT(*) only if the estimate is under the
>> threshold, at least where such facilities are available. We could use 
>> that by
>> default for the admin count queries, if the queryset has no filtering or
>> limiting joins. IMO, this alone would solve a whole lot of the problem; 
>> and
>> allowing it to be also used explicitly (e.g. by adding Tom's idea to 
>> allow the
>> user to specify the number of objects) would solve a whole lot of the 
>> rest of
>> the problem.
>>
>> On Monday 07 August 2017 15:03:15 Tom Forbes wrote:
>> > This is a great idea. A related issue I've come across with the 
>> paginator
>> > is not being able to pass an explicit count into it.
>> >
>> > If you have a query set with expensive annotations that don't effect the
>> > count in any way it's very wasteful to include them in the count SQL
>> > statement (on top of general PG count slowness), which Django does. 
>> Being
>> > able to pass in 'num_objects=x' would be great.
>> >
>> > On 7 Aug 2017 08:03, "Adam Johnson" > 
>> wrote:
>> >
>> > +1 from me, I've spent long enough on those COUNT(*) queries at work.
>> >
>> > Is it worth putting this logic in Paginator so applications can reuse 
>> it?
>> >
>> > On 7 August 2017 at 07:45, Josh Smeaton > > wrote:
>> > > The admin list page performs a count(*) (twice!) when viewing the list
>> > > page of a model. One of these counts can be disabled, but the 
>> pagination
>> > > count can not be. On huge tables, this can be a massive performance
>> > > issue. https://code.djangoproject.com/ticket/8408 was created to 
>> disable
>> > > the count, but it only allowed disabling one of them. I've reopened 
>> the
>> > > ticket with Matthew who has implemented the suggestion to disable
>> > > pagination features that require count.
>> > >
>> > > PR: https://github.com/django/django/pull/8858
>> > >
>> > > Does anyone have strong views one way or another here? Would like to 
>> hear
>> > > if so.
>> > >
>> > > --
>> > > You received this message because you are subscribed to the Google 
>> Groups
>> > > "Django developers (Contributions to Django itself)" group.
>> > > To unsubscribe from this group and stop receiving emails from it, 
>> send an
>> > > email to django-develop...@googlegroups.com .
>> > > To post to this group, send email to django-d...@googlegroups.com 
>> .
>> > > Visit this group at https://groups.google.com/group/django-developers
>> .
>> > > To view this discussion on the web visit 
>> https://groups.google.com/d/ms
>> > > gid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%
>> > > 40googlegroups.com
>> > > <
>> https://groups.google.com/d/msgid/django-developers/d093b4dd-4bc4-4c16-9
>> > > 10d-53c4740ec5ea%
>> 40googlegroups.com?utm_medium=email&utm_source=footer> .
>> > > For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To

Re: Admin list view counting

2017-08-07 Thread Tom Forbes
Mysql, Oracle and Postgres expose estimated rows in a pretty similar manner
(selecting the table name from a database-specific table). Perhaps a more
generic 'estimate_table_rows' function could be added for all backends? I
guess for sqlite it would have to use an actual count() as a fallback. This
could cause issues if people develop on sqlite and find the function
returns accurate row counts and this changes in production.

The problem is these methods only return estimates for the entire table,
not including any filters, which narrows down the usefulness a lot. The
postgres wiki suggests 'scraping' the output of `EXPLAIN ANALYZE` which
sounds rather too hacky to add?

On Mon, Aug 7, 2017 at 9:53 PM, Shai Berger  wrote:

> On PG we can and should do much better -- there is a way to get a very
> fast,
> though not accurate, count of records in a table:
>
> https://wiki.postgresql.org/wiki/Count_estimate
>
> I think we should expose an API along the lines of
>
> queryset.table_count(estimate_above=2000)
>
> which actually performs the COUNT(*) only if the estimate is under the
> threshold, at least where such facilities are available. We could use that
> by
> default for the admin count queries, if the queryset has no filtering or
> limiting joins. IMO, this alone would solve a whole lot of the problem; and
> allowing it to be also used explicitly (e.g. by adding Tom's idea to allow
> the
> user to specify the number of objects) would solve a whole lot of the rest
> of
> the problem.
>
> On Monday 07 August 2017 15:03:15 Tom Forbes wrote:
> > This is a great idea. A related issue I've come across with the paginator
> > is not being able to pass an explicit count into it.
> >
> > If you have a query set with expensive annotations that don't effect the
> > count in any way it's very wasteful to include them in the count SQL
> > statement (on top of general PG count slowness), which Django does. Being
> > able to pass in 'num_objects=x' would be great.
> >
> > On 7 Aug 2017 08:03, "Adam Johnson"  wrote:
> >
> > +1 from me, I've spent long enough on those COUNT(*) queries at work.
> >
> > Is it worth putting this logic in Paginator so applications can reuse it?
> >
> > On 7 August 2017 at 07:45, Josh Smeaton  wrote:
> > > The admin list page performs a count(*) (twice!) when viewing the list
> > > page of a model. One of these counts can be disabled, but the
> pagination
> > > count can not be. On huge tables, this can be a massive performance
> > > issue. https://code.djangoproject.com/ticket/8408 was created to
> disable
> > > the count, but it only allowed disabling one of them. I've reopened the
> > > ticket with Matthew who has implemented the suggestion to disable
> > > pagination features that require count.
> > >
> > > PR: https://github.com/django/django/pull/8858
> > >
> > > Does anyone have strong views one way or another here? Would like to
> hear
> > > if so.
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> Groups
> > > "Django developers (Contributions to Django itself)" group.
> > > To unsubscribe from this group and stop receiving emails from it, send
> an
> > > email to django-developers+unsubscr...@googlegroups.com.
> > > To post to this group, send email to django-developers@
> googlegroups.com.
> > > Visit this group at https://groups.google.com/group/django-developers.
> > > To view this discussion on the web visit
> https://groups.google.com/d/ms
> > > gid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%
> > > 40googlegroups.com
> > >  d093b4dd-4bc4-4c16-9
> > > 10d-53c4740ec5ea%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> > > For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAFNZOJMytwWk1FYbQze1rbFTr0EnyadLBZmcpxtya4Ufst32xg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Admin list view counting

2017-08-07 Thread Shai Berger
On PG we can and should do much better -- there is a way to get a very fast, 
though not accurate, count of records in a table:

https://wiki.postgresql.org/wiki/Count_estimate

I think we should expose an API along the lines of

queryset.table_count(estimate_above=2000) 

which actually performs the COUNT(*) only if the estimate is under the 
threshold, at least where such facilities are available. We could use that by 
default for the admin count queries, if the queryset has no filtering or 
limiting joins. IMO, this alone would solve a whole lot of the problem; and 
allowing it to be also used explicitly (e.g. by adding Tom's idea to allow the 
user to specify the number of objects) would solve a whole lot of the rest of 
the problem.

On Monday 07 August 2017 15:03:15 Tom Forbes wrote:
> This is a great idea. A related issue I've come across with the paginator
> is not being able to pass an explicit count into it.
> 
> If you have a query set with expensive annotations that don't effect the
> count in any way it's very wasteful to include them in the count SQL
> statement (on top of general PG count slowness), which Django does. Being
> able to pass in 'num_objects=x' would be great.
> 
> On 7 Aug 2017 08:03, "Adam Johnson"  wrote:
> 
> +1 from me, I've spent long enough on those COUNT(*) queries at work.
> 
> Is it worth putting this logic in Paginator so applications can reuse it?
> 
> On 7 August 2017 at 07:45, Josh Smeaton  wrote:
> > The admin list page performs a count(*) (twice!) when viewing the list
> > page of a model. One of these counts can be disabled, but the pagination
> > count can not be. On huge tables, this can be a massive performance
> > issue. https://code.djangoproject.com/ticket/8408 was created to disable
> > the count, but it only allowed disabling one of them. I've reopened the
> > ticket with Matthew who has implemented the suggestion to disable
> > pagination features that require count.
> > 
> > PR: https://github.com/django/django/pull/8858
> > 
> > Does anyone have strong views one way or another here? Would like to hear
> > if so.
> > 
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Django developers (Contributions to Django itself)" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to django-developers+unsubscr...@googlegroups.com.
> > To post to this group, send email to django-developers@googlegroups.com.
> > Visit this group at https://groups.google.com/group/django-developers.
> > To view this discussion on the web visit https://groups.google.com/d/ms
> > gid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%
> > 40googlegroups.com
> >  > 10d-53c4740ec5ea%40googlegroups.com?utm_medium=email&utm_source=footer> .
> > For more options, visit https://groups.google.com/d/optout.


Re: Admin list view counting

2017-08-07 Thread Tom Forbes
This is a great idea. A related issue I've come across with the paginator
is not being able to pass an explicit count into it.

If you have a query set with expensive annotations that don't effect the
count in any way it's very wasteful to include them in the count SQL
statement (on top of general PG count slowness), which Django does. Being
able to pass in 'num_objects=x' would be great.

On 7 Aug 2017 08:03, "Adam Johnson"  wrote:

+1 from me, I've spent long enough on those COUNT(*) queries at work.

Is it worth putting this logic in Paginator so applications can reuse it?

On 7 August 2017 at 07:45, Josh Smeaton  wrote:

> The admin list page performs a count(*) (twice!) when viewing the list
> page of a model. One of these counts can be disabled, but the pagination
> count can not be. On huge tables, this can be a massive performance issue.
> https://code.djangoproject.com/ticket/8408 was created to disable the
> count, but it only allowed disabling one of them. I've reopened the ticket
> with Matthew who has implemented the suggestion to disable pagination
> features that require count.
>
> PR: https://github.com/django/django/pull/8858
>
> Does anyone have strong views one way or another here? Would like to hear
> if so.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Adam

-- 
You received this message because you are subscribed to the Google Groups
"Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/
msgid/django-developers/CAMyDDM1Gs0vDhgGmMNFd29sftmvAd
53qibRo7DPUpLh%3DkwkZvw%40mail.gmail.com

.

For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAFNZOJN-pAX%3DDMvzYs6e2_Kc%3Dg7JW%2BBkOTPz-eACV6AWNGph%3Dw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Admin list view counting

2017-08-07 Thread Aymeric Augustin
Yes, the ability to disable all COUNT(*) is important (at least on Postgres).

-- 
Aymeric.



> On 7 Aug 2017, at 09:02, Adam Johnson  wrote:
> 
> +1 from me, I've spent long enough on those COUNT(*) queries at work.
> 
> Is it worth putting this logic in Paginator so applications can reuse it?
> 
> On 7 August 2017 at 07:45, Josh Smeaton  > wrote:
> The admin list page performs a count(*) (twice!) when viewing the list page 
> of a model. One of these counts can be disabled, but the pagination count can 
> not be. On huge tables, this can be a massive performance issue. 
> https://code.djangoproject.com/ticket/8408 
>  was created to disable the 
> count, but it only allowed disabling one of them. I've reopened the ticket 
> with Matthew who has implemented the suggestion to disable pagination 
> features that require count.
> 
> PR: https://github.com/django/django/pull/8858 
> 
> 
> Does anyone have strong views one way or another here? Would like to hear if 
> so.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com 
> .
> To post to this group, send email to django-developers@googlegroups.com 
> .
> Visit this group at https://groups.google.com/group/django-developers 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .
> 
> 
> 
> -- 
> Adam
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com 
> .
> To post to this group, send email to django-developers@googlegroups.com 
> .
> Visit this group at https://groups.google.com/group/django-developers 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/CAMyDDM1Gs0vDhgGmMNFd29sftmvAd53qibRo7DPUpLh%3DkwkZvw%40mail.gmail.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/313DEA3B-C12E-48D5-84CA-53FC32D414C8%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.


Re: Admin list view counting

2017-08-07 Thread Adam Johnson
+1 from me, I've spent long enough on those COUNT(*) queries at work.

Is it worth putting this logic in Paginator so applications can reuse it?

On 7 August 2017 at 07:45, Josh Smeaton  wrote:

> The admin list page performs a count(*) (twice!) when viewing the list
> page of a model. One of these counts can be disabled, but the pagination
> count can not be. On huge tables, this can be a massive performance issue.
> https://code.djangoproject.com/ticket/8408 was created to disable the
> count, but it only allowed disabling one of them. I've reopened the ticket
> with Matthew who has implemented the suggestion to disable pagination
> features that require count.
>
> PR: https://github.com/django/django/pull/8858
>
> Does anyone have strong views one way or another here? Would like to hear
> if so.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/django-developers/d093b4dd-4bc4-4c16-910d-
> 53c4740ec5ea%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Adam

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMyDDM1Gs0vDhgGmMNFd29sftmvAd53qibRo7DPUpLh%3DkwkZvw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Admin list view counting

2017-08-06 Thread Josh Smeaton
The admin list page performs a count(*) (twice!) when viewing the list page 
of a model. One of these counts can be disabled, but the pagination count 
can not be. On huge tables, this can be a massive performance 
issue. https://code.djangoproject.com/ticket/8408 was created to disable 
the count, but it only allowed disabling one of them. I've reopened the 
ticket with Matthew who has implemented the suggestion to disable 
pagination features that require count.

PR: https://github.com/django/django/pull/8858

Does anyone have strong views one way or another here? Would like to hear 
if so.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/d093b4dd-4bc4-4c16-910d-53c4740ec5ea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.