Re: Improve queries on django admin

2016-07-22 Thread Tobias McNulty
I think the question was more about memory usage than speed, but the answer
is the same, in my opinion.

The only thing I'll add is that -- if returning a mere 100 rows of a table
is really causing memory issues -- there are alternatives to storing large
blobs of data directly in the DB (e.g., files).

Tobias

On Fri, Jul 22, 2016 at 6:22 PM, Shai Berger wrote:

> I tend to agree with Tim -- in particular, a query on the admin should only
> return a small (<100) number of records, due to paging; if, for that size of
> query, you see a significant difference between returning all the columns and
> returning just the ones you need, it is suspicious: Either you have a very
> special case (e.g. it just happens that all the fields you chose to list are
> in an index, and so are selected extremely fast), or you're doing something
> wrong (which causes your default query to be very slow).
>
> For most cases, this should not be a significant optimization IMO.
>
> On Friday 22 July 2016 20:38:40 Tim Graham wrote:
> > I'm a bit wary of the complexity this would add, especially given this
> > warning in the documentation:
> >
> > The defer() method (and its cousin, only()
> > <https://docs.djangoproject.com/en/1.9/ref/models/querysets/#django.db.models.query.QuerySet.only>,
> > below) are only for advanced use-cases. They
> > provide an optimization for when you have analyzed your queries closely
> > and understand *exactly* what information you need and have measured that
> > the difference between returning the fields you need and the full set of
> > fields for the model will be significant.
> >
> > I think it's better if developers override ModelAdmin.get_queryset() as
> > needed, as you've done. In particular, I thought of the case where a method
> > is used in list_display. In that case, I believe it's impossible to do the
> > optimization automatically since Django can't automatically determine what
> > fields a method might use.
> >
> > On Friday, July 22, 2016 at 9:51:27 AM UTC-4, Rael Max wrote:
> > > > Hi Lucas, thanks for the reply.
> > > >
> > > > I think that select_related gives a great performance improvement,
> > > > but we can improve on it by passing only the columns that we want to
> > > > retrieve, which avoids fetching most of the columns/data and
> > > > allocating more memory than necessary.
> > > >
> > > > On Thursday, July 21, 2016 at 17:01:00 UTC-3, Lucas Magnum wrote:
> > >> You can use `list_select_related` for Django Admin too.
> > >>
> > >>
> > >>
> > >>
> > >> []'s
> > >>
> > >> Lucas Magnum.
> > >>
> > >> 2016-07-21 15:52 GMT-03:00 Rael Max :
> > > >>> Hi everyone,
> > > >>>
> > > >>> I'm working on a project with a large MySQL database and have run
> > > >>> into problems with the Django admin list. Basically, the query
> > > >>> executed to retrieve a list of items from a model uses a SQL SELECT
> > > >>> listing all attributes of the model, but usually we only use a
> > > >>> small set of them in the *list_display* attribute.
> > > >>>
> > > >>> I solved this problem by overriding the *queryset* method of
> > > >>> *ModelAdmin* and calling the only() method of *QuerySet* with the
> > > >>> fields listed in the *list_display* attribute of *ModelAdmin*. With
> > > >>> fewer columns retrieved, these queries should consume less memory
> > > >>> when executed.
> > > >>>
> > > >>> Searching here and on the Django issue tracker, I have not found
> > > >>> anything about this. What do you think about making this
> > > >>> optimization the default behavior, or adding a *ModelAdmin*
> > > >>> attribute to enable it?
> > >>>
> > >>> Regards,
> > >>> Rael
>



-- 


*Tobias McNulty*
Chief Executive Officer

tob...@caktusgroup.com
www.caktusgroup.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMGFDKR4D9cCUAwTvdqStWm0Rv3AYbCRU4S7JFuoGY4wxdK%2Bhg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: GitHub PR status within Trac

2016-07-22 Thread Carl Meyer
Hi Tobias,

On 07/22/2016 04:53 PM, Tobias McNulty wrote:
> I spent some time during the DjangoCon sprint today looking into
> dashboard.djangoproject.com and how it calculates metrics. I was hoping to
> add some new metrics that mash up data from GitHub and Trac together. While
> technically possible, this breaks down when you want to link out to a list
> of the related tickets in Trac. For example:
> 
>   * A list of Accepted tickets with no open PR or an open PR that hasn't
>     been touched in X months
>   * A list of Accepted tickets with no PR and no attached patch that
> haven't been touched in  months
> 
> This got me wondering: Is checking for GitHub PRs via JavaScript the
> Right Way to do it? What if we had a cronjob update Trac periodically
> with PR status from GitHub?
> 
> I think it would be valuable to be able to query on PR status from
> within Trac, e.g., to help find in progress but stale/abandoned tickets.
> Cleaning up the work of someone else who's lost interest in a patch is
> often a good way to get into Django development.
> 
> I'm sure there are some holes in this idea, so I'm putting it out there
> for comment. Was something like this considered before, and if so, why
> wasn't it pursued?
> 
> If it hasn't been considered before, what are the obvious problems I
> might encounter?

Others know these systems better than I do, but just a couple thoughts:

1) While being able to query Trac by PR status would be useful, losing
the immediate feedback of "I just created my PR and reloaded the Trac
ticket, and there it is!" would be a significant loss, I think. Delays,
even of just a few minutes, in that sort of UI tend to introduce an "is
this actually working" uncertainty that leads to extra support queries.
So maybe even if you implemented something on the backend, we shouldn't
get rid of the JS code? Or maybe github push hooks could be used to keep
update latency low?

2) I think it was done this way originally because whoever did it was
scared of touching Trac's Python code (with reason), and it was simpler
to just do it in JS, not for any deeper reason.

Carl



GitHub PR status within Trac

2016-07-22 Thread Tobias McNulty
I spent some time during the DjangoCon sprint today looking into
dashboard.djangoproject.com and how it calculates metrics. I was hoping to
add some new metrics that mash up data from GitHub and Trac together. While
technically possible, this breaks down when you want to link out to a list
of the related tickets in Trac. For example:

   - A list of Accepted tickets with no open PR or an open PR that hasn't
   been touched in X months
   - A list of Accepted tickets with no PR and no attached patch that
   haven't been touched in  months
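
As a rough illustration of the first list above, here is a minimal sketch of the filtering step once PR data has been synced somewhere queryable. The ticket dictionaries, the `open_prs` mapping and the function name are all hypothetical; nothing here talks to Trac or GitHub.

```python
from datetime import datetime, timedelta


def stale_accepted_tickets(tickets, open_prs, now, max_age_days):
    """Return ids of Accepted tickets with no open PR, or with an open PR
    that has not been updated within `max_age_days`.

    `tickets` is a list of dicts with 'id' and 'stage' keys (hypothetical
    shape); `open_prs` maps a ticket id to the datetime its open PR was
    last updated.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for ticket in tickets:
        if ticket["stage"] != "Accepted":
            continue
        last_update = open_prs.get(ticket["id"])
        # No open PR at all, or an open PR that has gone quiet.
        if last_update is None or last_update < cutoff:
            stale.append(ticket["id"])
    return stale
```

Whether the data arrives through a cronjob or through hooks, the query side would look much the same; only the freshness of `open_prs` changes.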

This got me wondering: Is checking for GitHub PRs via JavaScript the Right
Way to do it? What if we had a cronjob update Trac periodically with PR
status from GitHub?

I think it would be valuable to be able to query on PR status from within
Trac, e.g., to help find in progress but stale/abandoned tickets. Cleaning
up the work of someone else who's lost interest in a patch is often a good
way to get into Django development.

I'm sure there are some holes in this idea, so I'm putting it out there for
comment. Was something like this considered before, and if so, why wasn't
it pursued?

If it hasn't been considered before, what are the obvious problems I might
encounter?

Rather than sync the data periodically, another approach might be to extend
the existing trac-github plugin, though that would still require sync'ing
existing data up front and substantial testing to make sure all the right
events (e.g., renames and closures) are caught appropriately. It's not as
simple as adding a commit hash to a ticket's history, esp. if we ever wanted
to change the fields that were brought over from GitHub.

Tobias


Re: Improve queries on django admin

2016-07-22 Thread Shai Berger
I tend to agree with Tim -- in particular, a query on the admin should only 
return a small (<100) number of records, due to paging; if, for that size of 
query, you see a significant difference between returning all the columns and 
returning just the ones you need, it is suspicious: Either you have a very 
special case (e.g. it just happens that all the fields you chose to list are in 
an index, and so are selected extremely fast), or you're doing something wrong 
(which causes your default query to be very slow).

For most cases, this should not be a significant optimization IMO.

On Friday 22 July 2016 20:38:40 Tim Graham wrote:
> I'm a bit wary of the complexity this would add, especially given this
> warning in the documentation:
> 
> The defer() method (and its cousin, only(), below) are only for advanced
> use-cases. They
> provide an optimization for when you have analyzed your queries closely
> and understand *exactly* what information you need and have measured that
> the difference between returning the fields you need and the full set of
> fields for the model will be significant.
> 
> I think it's better if developers override ModelAdmin.get_queryset() as
> needed, as you've done. In particular, I thought of the case where a method
> is used in list_display. In that case, I believe it's impossible to do the
> optimization automatically since Django can't automatically determine what
> fields a method might use.
> 
> On Friday, July 22, 2016 at 9:51:27 AM UTC-4, Rael Max wrote:
> > Hi Lucas, thanks for the reply.
> > 
> > I think that select_related gives a great performance improvement, but
> > we can improve on it by passing only the columns that we want to
> > retrieve, which avoids fetching most of the columns/data and allocating
> > more memory than necessary.
> > 
> > On Thursday, July 21, 2016 at 17:01:00 UTC-3, Lucas Magnum wrote:
> >> You can use `list_select_related` for Django Admin too.
> >> 
> >> 
> >> 
> >> 
> >> []'s
> >> 
> >> Lucas Magnum.
> >> 
> >> 2016-07-21 15:52 GMT-03:00 Rael Max :
> >>> Hi everyone,
> >>> 
> >>> I'm working on a project with a large MySQL database and have run
> >>> into problems with the Django admin list. Basically, the query
> >>> executed to retrieve a list of items from a model uses a SQL SELECT
> >>> listing all attributes of the model, but usually we only use a
> >>> small set of them in the *list_display* attribute.
> >>> 
> >>> I solved this problem by overriding the *queryset* method of
> >>> *ModelAdmin* and calling the only() method of *QuerySet* with the
> >>> fields listed in the *list_display* attribute of *ModelAdmin*. With
> >>> fewer columns retrieved, these queries should consume less memory
> >>> when executed.
> >>> 
> >>> Searching here and on the Django issue tracker, I have not found
> >>> anything about this. What do you think about making this
> >>> optimization the default behavior, or adding a *ModelAdmin*
> >>> attribute to enable it?
> >>> 
> >>> Regards,
> >>> Rael


Proposal: Use HTML5 boolean attribute for checked on checkbox/radio inputs

2016-07-22 Thread Jon Dufresne
Hi,

I would like to propose that Django renders the "checked" attribute of
checkbox and radio inputs using the HTML5 boolean style attributes.

Django has supported HTML5 boolean attributes since 1.8 [0]. It has used
them internally for the "disabled" attribute since 1.9 [1] and the
"required" attribute starting with 1.10 [2]. So there is some precedent to
using the HTML5 style. I find the newer style cleaner and more in line with
modern conventions.
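
For readers unfamiliar with the two styles, the difference is `checked="checked"` versus a bare `checked`. Below is a minimal sketch of boolean-aware attribute rendering. `render_attrs` is a hypothetical helper written for illustration only; it is not Django's implementation.

```python
from html import escape


def render_attrs(attrs):
    """Render a dict of HTML attributes as a string.

    A value of True renders as a bare attribute name (HTML5 boolean
    style); False or None omits the attribute; anything else renders as
    name="value" with the value escaped.
    """
    parts = []
    for name, value in attrs.items():
        if value is True:
            parts.append(" " + name)
        elif value is False or value is None:
            continue
        else:
            parts.append(' %s="%s"' % (name, escape(str(value), quote=True)))
    return "".join(parts)


# HTML5 boolean style versus the older explicit style.
html5_style = "<input%s>" % render_attrs({"type": "checkbox", "checked": True})
old_style = "<input%s>" % render_attrs({"type": "checkbox", "checked": "checked"})
```

Here `html5_style` is `<input type="checkbox" checked>` and `old_style` is `<input type="checkbox" checked="checked">`.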

I have created a ticket [3] with this proposal as well as a PR [4].

One concern raised in the ticket is backwards compatibility with non-HTML5
doctypes. I'm not aware of any such issues with modern browsers. I have
tested older doctypes on Firefox and Chrome, both accept the HTML5 boolean
style with HTML4 and XHTML doctypes. Currently, I do not have access to IE,
so I am unable to test those cases. If anyone is interested to test, there
is a very simple test case in the ticket.

Additionally, if there is an issue with older doctypes, presumably this
issue already exists with the disabled and required attributes.

Just reaching out for feedback, concerns, and comments.

Thanks!

Cheers,
Jon


[0] https://docs.djangoproject.com/en/dev/releases/1.8/#forms
[1]
https://github.com/django/django/blob/stable/1.9.x/django/forms/boundfield.py#L88-L89
[2]
https://github.com/django/django/blob/stable/1.10.x/django/forms/boundfield.py#L88-L89
[3] Ticket: https://code.djangoproject.com/ticket/26928
[4] PR: https://github.com/django/django/pull/6961



Re: Improve queries on django admin

2016-07-22 Thread Tim Graham
I'm a bit wary of the complexity this would add, especially given this 
warning in the documentation:

The defer() method (and its cousin, only(), below) are only for advanced
use-cases. They provide an optimization for when you have analyzed your
queries closely and understand *exactly* what information you need and have
measured that the difference between returning the fields you need and the
full set of fields for the model will be significant.

I think it's better if developers override ModelAdmin.get_queryset() as 
needed, as you've done. In particular, I thought of the case where a method 
is used in list_display. In that case, I believe it's impossible to do the 
optimization automatically since Django can't automatically determine what 
fields a method might use.
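
To make the list_display problem concrete, here is a minimal sketch of the check an automatic optimization would need. `only_fields_for` and its arguments are hypothetical names used for illustration; the sketch deliberately gives up as soon as it meets anything that is not a plain model field.

```python
def only_fields_for(list_display, concrete_field_names):
    """Return field names that could be passed to QuerySet.only(), or None.

    None means at least one list_display entry is a callable or a name
    that is not a concrete model field (for example a model or ModelAdmin
    method). The columns such an entry reads cannot be known in advance,
    so the query should be left untouched.
    """
    fields = []
    for entry in list_display:
        if callable(entry) or entry not in concrete_field_names:
            return None
        fields.append(entry)
    return fields
```

In a ModelAdmin.get_queryset() override, the set of names could be built from the model's `_meta` field list, and the queryset narrowed with `only(*fields)` only when the helper returns a list. That stays a per-project decision, which is the point made above.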

On Friday, July 22, 2016 at 9:51:27 AM UTC-4, Rael Max wrote:
>
> Hi Lucas, thanks for the reply.
>
> I think that select_related gives a great performance improvement, but we
> can improve on it by passing only the columns that we want to retrieve,
> which avoids fetching most of the columns/data and allocating more memory
> than necessary.
>
> On Thursday, July 21, 2016 at 17:01:00 UTC-3, Lucas Magnum wrote:
>>
>> You can use `list_select_related` for Django Admin too.
>>
>>
>>
>>
>> []'s
>>
>> Lucas Magnum.
>>
>> 2016-07-21 15:52 GMT-03:00 Rael Max :
>>
>>> Hi everyone,
>>>
>>> I'm working on a project with a large MySQL database and have run
>>> into problems with the Django admin list. Basically, the query
>>> executed to retrieve a list of items from a model uses a SQL SELECT
>>> listing all attributes of the model, but usually we only use a
>>> small set of them in the *list_display* attribute.
>>>
>>> I solved this problem by overriding the *queryset* method of
>>> *ModelAdmin* and calling the only() method of *QuerySet* with the
>>> fields listed in the *list_display* attribute of *ModelAdmin*. With
>>> fewer columns retrieved, these queries should consume less memory
>>> when executed.
>>>
>>> Searching here and on the Django issue tracker, I have not found
>>> anything about this. What do you think about making this
>>> optimization the default behavior, or adding a *ModelAdmin*
>>> attribute to enable it?
>>>
>>> Regards,
>>> Rael


Re: Middleware error framework, high level logging for database queries

2016-07-22 Thread Curtis Maloney



On 19/07/16 05:16, Rishi Gupta wrote:

> Hi django-developers,
>
> (1) Middleware error framework.
>
> Zulip has some exception middleware to allow 40x errors to be returned
> to the user from anywhere within the view code via raising a special
> exception, which we’ve found to be a really nice, convenient
> programming style.  With this model, validation code called anywhere
> from within a view can pass nice, clear user-facing errors up to the
> frontend.  You can do this by writing something like:


I did something like this [actually, the code was handed to me by Matt 
Schinckel]... which I use in django-nap.


It's HttpResponse melded with Exception so you can raise it.
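
For anyone who has not seen the pattern, a minimal framework-free sketch follows. Every name in it (`HttpError`, `validate_age`, `handle`) is hypothetical, and the `(status, body)` tuple merely stands in for a real response object; this is neither the django-nap code nor Zulip's.

```python
class HttpError(Exception):
    """Hypothetical exception carrying an HTTP status and a user-facing message."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


def validate_age(value):
    # Validation helper called from deep inside a view; it can abort the
    # whole request by raising, without the view checking return values.
    if not value.isdigit():
        raise HttpError(400, "age must be a whole number")
    return int(value)


def view(params):
    age = validate_age(params.get("age", ""))
    return 200, "age is %d" % age


def handle(view_func, params):
    """Stand-in for exception middleware: turn HttpError into a response."""
    try:
        return view_func(params)
    except HttpError as exc:
        return exc.status, exc.message
```

The appeal is that `view` reads as straight-line code while error responses are produced in exactly one place.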

I submitted it for consideration some time ago, and it was rejected 
because it bound source and action too tightly.


However, it looks like your solution doesn't suffer from this shortcoming.

As was mentioned by Tim, I think this can readily grow outside django, 
until it's matured.




> (2) High level logging for database queries.
>
> We've currently monkey-patched a system to add the following
> information to our log lines:
>
> ...  52ms (db: 4ms/8q) /url ...


Some years back I wrote some middleware to do just this, originally 
logging the count / total time, and later sending it to the browser in a 
cookie with a 0 time to live.



> Currently there isn't a great way to do this "natively"; Django’s
> database cursors either log the whole query (in DEBUG mode) or
> nothing at all.


In debug mode the queries [and their execution times] are kept on the 
connection.


import time

from django.db import connection


class CookieDebugMiddleware(object):
    '''Show query counts, times, and view timing in Cookies'''

    def process_request(self, request):
        # Remember when the request started so the view time can be computed.
        request.start_time = time.time()

    def process_response(self, request, response):
        # connection.queries is only populated when DEBUG is True.
        response.set_cookie('QueryCount', '%d (%s s)' % (
            len(connection.queries),
            sum([float(q['time']) for q in connection.queries])
        ), max_age=0)
        response.set_cookie('ViewTime', '%s s' % (
            time.time() - request.start_time), max_age=0)

        return response
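
The same numbers can be folded into a log fragment shaped like the "52ms (db: 4ms/8q) /url" example quoted above. This is a minimal sketch with no Django dependency: `format_request_log` is a hypothetical helper, and it assumes query records shaped like the entries of connection.queries in debug mode (dicts whose 'time' value is a string of seconds).

```python
def format_request_log(total_ms, queries, path):
    """Build a log fragment such as '52ms (db: 4ms/8q) /url'.

    `total_ms` is the whole request time in milliseconds; `queries` is a
    list of dicts with a 'time' key holding seconds as a string.
    """
    db_ms = sum(float(q["time"]) for q in queries) * 1000
    return "%dms (db: %dms/%dq) %s" % (
        total_ms, round(db_ms), len(queries), path)
```

A caller would pass the measured request time, a copy of the query list, and the request path, then hand the result to whatever logger is in use.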


--
Curtis



Re: Improve queries on django admin

2016-07-22 Thread Rael Max
Hi Lucas, thanks for the reply.

I think that select_related gives a great performance improvement, but we
can improve on it by passing only the columns that we want to retrieve,
which avoids fetching most of the columns/data and allocating more memory
than necessary.

On Thursday, July 21, 2016 at 17:01:00 UTC-3, Lucas Magnum wrote:
>
> You can use `list_select_related` for Django Admin too.
>
>
>
>
> []'s
>
> Lucas Magnum.
>
> 2016-07-21 15:52 GMT-03:00 Rael Max :
>
>> Hi everyone,
>>
>> I'm working on a project with a large MySQL database and have run
>> into problems with the Django admin list. Basically, the query
>> executed to retrieve a list of items from a model uses a SQL SELECT
>> listing all attributes of the model, but usually we only use a
>> small set of them in the *list_display* attribute.
>>
>> I solved this problem by overriding the *queryset* method of
>> *ModelAdmin* and calling the only() method of *QuerySet* with the
>> fields listed in the *list_display* attribute of *ModelAdmin*. With
>> fewer columns retrieved, these queries should consume less memory
>> when executed.
>>
>> Searching here and on the Django issue tracker, I have not found
>> anything about this. What do you think about making this
>> optimization the default behavior, or adding a *ModelAdmin*
>> attribute to enable it?
>>
>> Regards,
>> Rael