Re: Help interpreting profiler results (or: why is my app so slow?)

2013-04-18 Thread Matt Andrews
I wouldn't say huge (see the page/view in question at 
http://www.scenepointblank.com) -- I grab a bunch of things and then 
display them, mostly truncated (with HTML stripped out, usually for the 
homepage). The custom template tags take a bunch of arguments (for a 
news post, say, where to show the photo, how large to make the headline, 
etc.), so there's a lot of loading of partial views inside loops for each 
news story.

I've tried experimenting before with the template fragment caching and saw 
little in the way of results.
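
For reference, the template fragment caching mentioned here looks roughly like 
this in a Django template -- the fragment name, timeout and loop variable below 
are illustrative, not taken from the real templates:

{% load cache %}
{% for post in news %}
    {% cache 900 news_item post.id %}
        ... expensive per-post markup goes here ...
    {% endcache %}
{% endfor %}

Each fragment is keyed on its name plus the vary-on arguments (post.id here), so 
per-object fragments get their own cache entries.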


On Thursday, 18 April 2013 13:40:10 UTC+1, Shawn Milochik wrote:
>
> Hit send too soon:
>
>
> https://docs.djangoproject.com/en/dev/topics/cache/#template-fragment-caching
>
>
> On Thu, Apr 18, 2013 at 8:39 AM, Shawn Milochik <sh...@milochik.com> wrote:
>
>> Yes, it does look like template tags are taking some time. Is the page 
>> huge? Are you doing a ton of formatting? Is there something you could maybe 
>> move to server-side?
>>
>> Also, this might help with caching bits of your output: 
>>
>>
>> On Thu, Apr 18, 2013 at 6:17 AM, Matt Andrews <ma...@mattandrews.info> wrote:
>>
>>>
>>> On Thursday, 18 April 2013 10:45:40 UTC+1, Tom Evans wrote:
>>>
>>>> On Wed, Apr 17, 2013 at 11:18 PM, Matt Andrews <ma...@mattandrews.info> wrote: 
>>>> > Hi all. 
>>>> > 
>>>> > Having performance problems with my Django app. I've posted here before 
>>>> > talking about this: one theory for my slowness woes was that I'm using raw 
>>>> > SQL for everything after getting sick of Django doing things weirdly 
>>>> > (duplicating queries, adding bizarre things like "LIMIT 3453453" to queries, 
>>>> > not being able to JOIN things like I wanted etc). I'm not opposed to going 
>>>> > back to the ORM but need to know if this is where my bottleneck is. 
>>>> > 
>>>> > I've run a profiler against my code and the results are here: 
>>>> > http://pastebin.com/raw.php?i=HQf9bqGp 
>>>> > 
>>>> > On my local machine (a not very powerful laptop) I see Django Debug Toolbar 
>>>> > load times of ~1900ms for my site homepage. This includes 168ms of db calls 
>>>> > (11 queries, which I think are fairly well-tuned, indexed, etc). I cache 
>>>> > pretty well on production but load times are still slow -- some of this may 
>>>> > be down to my cheap webhost, though. In my settings I enabled 
>>>> > django.template.loaders.cached.Loader but this doesn't seem to make much 
>>>> > difference. 
>>>> > 
>>>> > I'm having trouble seeing what the profiler results above are telling me: 
>>>> > can anyone shed any light? 
>>>>
>>>> Most of your time is spent in pprint, which was called over 14,000 
>>>> times to generate your page. Over 2 seconds spent printing out debug. 
>>>> This should be telling you "don't use pprint when you want to see how 
>>>> fast your code is". 
>>>>
>>>> Cheers 
>>>>
>>>> Tom 
>>>>
>>>
>>> Good point Tom, apologies. Here's the profiler results with DebugToolbar 
>>> switched off (and ordered by cumulative time, thanks Shawn!): 
>>> http://pastebin.com/raw.php?i=y3iP0cLn
>>>
>>> The top one is obviously my "home" method inside my views, but I'm 
>>> struggling to get more from it than that. Lots of template rendering, but 
>>> caching not helping here...?
>>>  





Re: Help interpreting profiler results (or: why is my app so slow?)

2013-04-18 Thread Matt Andrews

On Thursday, 18 April 2013 10:45:40 UTC+1, Tom Evans wrote:

> On Wed, Apr 17, 2013 at 11:18 PM, Matt Andrews <ma...@mattandrews.info> wrote: 
> > Hi all. 
> > 
> > Having performance problems with my Django app. I've posted here before 
> > talking about this: one theory for my slowness woes was that I'm using raw 
> > SQL for everything after getting sick of Django doing things weirdly 
> > (duplicating queries, adding bizarre things like "LIMIT 3453453" to queries, 
> > not being able to JOIN things like I wanted etc). I'm not opposed to going 
> > back to the ORM but need to know if this is where my bottleneck is. 
> > 
> > I've run a profiler against my code and the results are here: 
> > http://pastebin.com/raw.php?i=HQf9bqGp 
> > 
> > On my local machine (a not very powerful laptop) I see Django Debug Toolbar 
> > load times of ~1900ms for my site homepage. This includes 168ms of db calls 
> > (11 queries, which I think are fairly well-tuned, indexed, etc). I cache 
> > pretty well on production but load times are still slow -- some of this may 
> > be down to my cheap webhost, though. In my settings I enabled 
> > django.template.loaders.cached.Loader but this doesn't seem to make much 
> > difference. 
> > 
> > I'm having trouble seeing what the profiler results above are telling me: 
> > can anyone shed any light? 
>
> Most of your time is spent in pprint, which was called over 14,000 
> times to generate your page. Over 2 seconds spent printing out debug. 
> This should be telling you "don't use pprint when you want to see how 
> fast your code is". 
>
> Cheers 
>
> Tom 
>

Good point Tom, apologies. Here's the profiler results with DebugToolbar 
switched off (and ordered by cumulative time, thanks Shawn!): 
http://pastebin.com/raw.php?i=y3iP0cLn

The top one is obviously my "home" method inside my views, but I'm 
struggling to get more from it than that. Lots of template rendering, but 
caching not helping here...?
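
For anyone wanting to reproduce this kind of output, a quick way to profile a 
single view ordered by cumulative time is a small decorator along these lines -- 
a sketch only, with the decorator name and stats count made up:

import cProfile
import pstats

def profile_view(view_func):
    # Hypothetical helper: profile one Django view and print the top entries
    # sorted by cumulative time (the same ordering as the pastebin dump above).
    def wrapper(request, *args, **kwargs):
        profiler = cProfile.Profile()
        response = profiler.runcall(view_func, request, *args, **kwargs)
        pstats.Stats(profiler).sort_stats('cumulative').print_stats(25)
        return response
    return wrapper

Applied temporarily to the "home" view (with Debug Toolbar disabled) it gives 
per-request numbers without the pprint overhead mentioned above.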





Help interpreting profiler results (or: why is my app so slow?)

2013-04-17 Thread Matt Andrews
Hi all.

Having performance problems with my Django app. I've posted here before 
talking about this: one theory for my slowness woes was that I'm using raw 
SQL for everything after getting sick of Django doing things weirdly 
(duplicating queries, adding bizarre things like "LIMIT 3453453" to 
queries, not being able to JOIN things like I wanted etc). I'm not opposed 
to going back to the ORM but need to know if this is where my bottleneck is.

I've run a profiler against my code and the results are here: 
http://pastebin.com/raw.php?i=HQf9bqGp

On my local machine (a not very powerful laptop) I see Django Debug Toolbar 
load times of ~1900ms for my site homepage. This includes 168ms of db calls 
(11 queries, which I think are fairly well-tuned, indexed, etc). I cache 
pretty well on production but load times are still slow -- some of this may 
be down to my cheap webhost, though. In my settings I 
enabled django.template.loaders.cached.Loader but this doesn't seem to make 
much difference.
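
For reference, the cached loader is typically enabled by wrapping the normal 
loaders in settings -- this is the pre-1.8 TEMPLATE_LOADERS form matching the 
Django versions discussed in this thread:

# settings.py -- wrap the standard loaders in the cached loader so compiled
# templates are reused between requests.
TEMPLATE_LOADERS = (
    ('django.template.loaders.cached.Loader', (
        'django.template.loaders.filesystem.Loader',
        'django.template.loaders.app_directories.Loader',
    )),
)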

I'm having trouble seeing what the profiler results above are telling me: 
can anyone shed any light?

If it helps at all, this is the homepage of my app: 
http://www.scenepointblank.com -- there's a lot of HTML processing (stripping 
tags, truncating strings, parsing HTML etc). I use a couple of custom 
template tags quite frequently to re-use partial views (a news post, for 
example) so I can just pass it an object to work with. I don't know if this 
adds a huge template overhead. I've explored replacing Django's templates 
with Jinja2 but fell at the first hurdle when none of the aforementioned 
template tags worked and I couldn't figure out how to port them to Jinja.
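
The custom template tags described here sound like inclusion tags; a minimal 
sketch of that pattern (module, tag name, template path and options are all 
hypothetical):

# templatetags/news_tags.py -- hypothetical names throughout
from django import template

register = template.Library()

@register.inclusion_tag('partials/news_post.html')
def news_post(post, show_photo=True, headline_size='large'):
    # Render the partial with the object passed in, plus display options.
    return {'post': post, 'show_photo': show_photo, 'headline_size': headline_size}

Each use of such a tag loads and renders another template, which is why the 
cached template loader and fragment caching come up repeatedly in this thread.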

Any tips would be hugely appreciated.





Re: Using template fragment caching *inside* a sitewide cache: possible?

2013-02-15 Thread Matt Andrews
I tried using the approach above (template fragment caching everywhere 
instead of the sitewide cache) and the same problem remained: the inner 
cache didn't reflect cache-busting changes.

I ended up rewriting the code to pull the view fragments via AJAX (which 
did respect the cache-busting). You can see the results 
here: http://www.scenepointblank.com/ (it's the "popular right now" tabbed 
widget near the bottom right).
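
For anyone curious, the server side of that AJAX approach can be as simple as a 
view that renders only the widget template; a rough sketch (URL, template path, 
cache key and timeout are all assumptions, and render() needs Django 1.3+):

from django.core.cache import cache
from django.shortcuts import render
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # the widget data changes roughly every 15 minutes
def popular_right_now(request):
    # Render just the "popular right now" markup; the page pulls it in via AJAX,
    # so it bypasses the sitewide per-page cache entirely.
    popular = cache.get('analytics_popular', [])  # assumed key written by the cron job
    return render(request, 'fragments/popular.html', {'popular': popular})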

On Thursday, 14 February 2013 17:19:28 UTC, Matt Andrews wrote:
>
> Hi Tom,
>
> Yep, you've got the problem right. Reading it back that way I see the 
> issue much more clearly...
>
> One approach I thought of: what if, instead of the sitewide caching, I 
> used template-fragment caching for *every* view (there aren't that many) 
> and invalidated the relevant ones along with the nested caches when needed 
> (there are probably only five or so views where this would need to happen)?
>
> I'm on shared hosting so I don't think Varnish is going to be a 
> possibility...
>
> thanks!
>
> On Thursday, 14 February 2013 16:02:53 UTC, Tom Evans wrote:
>>
>> On Mon, Feb 11, 2013 at 1:42 PM, Matt Andrews <ma...@mattandrews.info> wrote: 
>> > Hi all, 
>> > 
>> > I've been experimenting with an expensive query I need to call (essentially 
>> > grabbing data from some Google APIs). I tried this experiment: 
>> > 
>> > A sitewide cache with a long (days) expiry time 
>> > A template fragment with its own separate cache inside a view cached by the 
>> > sitewide cache -- this fragment simply displays the current time. 
>> > A view function which clears the named template fragment cache. 
>> > 
>> > The behaviour I expected was that on first pageload, the fragment would 
>> > display the time of page render, and all subsequent reloads would display 
>> > the same time. This was true. 
>> > 
>> > The second part, though, was that I expected to be able to call the view 
>> > function to clear the fragment cache and then reload the page to see the 
>> > time update. This didn't happen. 
>> > 
>> > Is it possible to achieve this behaviour? Essentially I want to run a 
>> > background cron task which just hits the Google API and updates my cached 
>> > fragments - the Google data changes every 15 minutes but the sitewide cache 
>> > has several hours timeout, normally. 
>> > 
>> > Hope this makes sense. 
>> > 
>> > Matt 
>> > 
>>
>> Hi Matt 
>>
>> Can I restate the problem to see if I am getting it right? 
>>
>> You have a page fragment, 'FRAG', stored in the cache. 
>> You have a page, 'PAGE', using that fragment stored in the cache. 
>> You update 'FRAG' in the cache. 
>> You reload 'PAGE', and it has the old contents of 'FRAG' instead of 
>> the updated one. 
>>
>> If so, this is to be expected. The site cache caches entire responses, 
>> with a key derived from that request. When a new request comes in, a 
>> key is generated from that request. If the key is found in the cache, 
>> the cached response is returned. 
>>
>> At no point does the cache know that one cache entry is built using 
>> another, or that invalidating the fragment should invalidate any page 
>> containing that fragment. 
>>
>> One potential solution is to use something like Varnish, an HTTP 
>> accelerator/cache that has support for ESI - Edge Side Includes - that 
>> allow you to build up responses from multiple components, and 
>> intelligently cache the components. Eg, an example of what you are 
>> trying to do in Varnish using ESI: 
>>
>> https://www.varnish-cache.org/trac/wiki/ESIfeatures#Anesi:includeexample 
>>
>> Alternatively, I may have completely misunderstood this :) 
>>
>> Cheers 
>>
>> Tom 
>>
>





Re: Using template fragment caching *inside* a sitewide cache: possible?

2013-02-14 Thread Matt Andrews
Hi Tom,

Yep, you've got the problem right. Reading it back that way I see the issue 
much more clearly...

One approach I thought of: what if, instead of the sitewide caching, I used 
template-fragment caching for *every* view (there aren't that many) and 
invalidated the relevant ones along with the nested caches when needed 
(there are probably only five or so views where this would need to happen)?

I'm on shared hosting so I don't think Varnish is going to be a 
possibility...

thanks!

On Thursday, 14 February 2013 16:02:53 UTC, Tom Evans wrote:
>
> On Mon, Feb 11, 2013 at 1:42 PM, Matt Andrews <ma...@mattandrews.info> wrote: 
> > Hi all, 
> > 
> > I've been experimenting with an expensive query I need to call (essentially 
> > grabbing data from some Google APIs). I tried this experiment: 
> > 
> > A sitewide cache with a long (days) expiry time 
> > A template fragment with its own separate cache inside a view cached by the 
> > sitewide cache -- this fragment simply displays the current time. 
> > A view function which clears the named template fragment cache. 
> > 
> > The behaviour I expected was that on first pageload, the fragment would 
> > display the time of page render, and all subsequent reloads would display 
> > the same time. This was true. 
> > 
> > The second part, though, was that I expected to be able to call the view 
> > function to clear the fragment cache and then reload the page to see the 
> > time update. This didn't happen. 
> > 
> > Is it possible to achieve this behaviour? Essentially I want to run a 
> > background cron task which just hits the Google API and updates my cached 
> > fragments - the Google data changes every 15 minutes but the sitewide cache 
> > has several hours timeout, normally. 
> > 
> > Hope this makes sense. 
> > 
> > Matt 
> > 
>
> Hi Matt 
>
> Can I restate the problem to see if I am getting it right? 
>
> You have a page fragment, 'FRAG', stored in the cache. 
> You have a page, 'PAGE', using that fragment stored in the cache. 
> You update 'FRAG' in the cache. 
> You reload 'PAGE', and it has the old contents of 'FRAG' instead of 
> the updated one. 
>
> If so, this is to be expected. The site cache caches entire responses, 
> with a key derived from that request. When a new request comes in, a 
> key is generated from that request. If the key is found in the cache, 
> the cached response is returned. 
>
> At no point does the cache know that one cache entry is built using 
> another, or that invalidating the fragment should invalidate any page 
> containing that fragment. 
>
> One potential solution is to use something like Varnish, an HTTP 
> accelerator/cache that has support for ESI - Edge Side Includes - that 
> allow you to build up responses from multiple components, and 
> intelligently cache the components. Eg, an example of what you are 
> trying to do in Varnish using ESI: 
>
> https://www.varnish-cache.org/trac/wiki/ESIfeatures#Anesi:includeexample 
>
> Alternatively, I may have completely misunderstood this :) 
>
> Cheers 
>
> Tom 
>





Re: Using template fragment caching *inside* a sitewide cache: possible?

2013-02-14 Thread Matt Andrews
Apologies for bumping this thread, but is there anybody with any insight on 
this? Really driving me crazy!

On Monday, 11 February 2013 13:42:05 UTC, Matt Andrews wrote:
>
> Hi all,
>
> I've been experimenting with an expensive query I need to call 
> (essentially grabbing data from some Google APIs). I tried this experiment:
>
>- A sitewide cache with a long (days) expiry time
>- A template fragment with its own separate cache *inside* a view 
>cached by the sitewide cache -- this fragment simply displays the current 
>time.
>- A view function which clears the named template fragment cache.
>
> The behaviour I expected was that on first pageload, the fragment would 
> display the time of page render, and all subsequent reloads would display 
> the same time. This was true.
>
> The second part, though, was that I expected to be able to call the view 
> function to clear the fragment cache and then reload the page to see the 
> time update. This didn't happen.
>
> Is it possible to achieve this behaviour? Essentially I want to run a 
> background cron task which just hits the Google API and updates my cached 
> fragments - the Google data changes every 15 minutes but the sitewide cache 
> has several hours timeout, normally.
>
> Hope this makes sense.
>
> Matt
>





Using template fragment caching *inside* a sitewide cache: possible?

2013-02-11 Thread Matt Andrews
Hi all,

I've been experimenting with an expensive query I need to call (essentially 
grabbing data from some Google APIs). I tried this experiment:

   - A sitewide cache with a long (days) expiry time
   - A template fragment with its own separate cache *inside* a view cached 
   by the sitewide cache -- this fragment simply displays the current time.
   - A view function which clears the named template fragment cache.

The behaviour I expected was that on first pageload, the fragment would 
display the time of page render, and all subsequent reloads would display 
the same time. This was true.

The second part, though, was that I expected to be able to call the view 
function to clear the fragment cache and then reload the page to see the 
time update. This didn't happen.

Is it possible to achieve this behaviour? Essentially I want to run a 
background cron task which just hits the Google API and updates my cached 
fragments - the Google data changes every 15 minutes but the sitewide cache 
has several hours timeout, normally.
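
For reference, later Django versions (1.6+) expose a helper for computing a 
fragment's cache key, which makes the "clear the named fragment" step explicit; 
a sketch only, with the fragment name assumed from the experiment above:

from django.core.cache import cache
from django.core.cache.utils import make_template_fragment_key  # Django 1.6+
from django.http import HttpResponse

def clear_time_fragment(request):
    # Compute the key used by {% cache ... current_time %} (pass vary-on args
    # too if the tag has them) and drop it.
    key = make_template_fragment_key('current_time')
    cache.delete(key)
    return HttpResponse('fragment cache cleared')

As the replies in this thread point out, though, deleting the fragment alone 
won't refresh a page that the sitewide cache has already stored as a whole 
response.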

Hope this makes sense.

Matt





Busting template fragment caches inside a sitewide cache: possible?

2013-02-10 Thread Matt Andrews
Hi all.

I have a site which makes use of Google's Analytics API which allows me to 
grab the most-viewed content on my site and then show it in various ways 
around the site ("most popular articles" etc). The query against this API 
is slow (takes a couple of seconds) so I've been caching it as described in 
my StackOverflow question about the technique 
here.

The basic process is this:

   - A cron job hits a URL endpoint every 15 minutes
   - This URL makes the API request and caches the results (for 1 hour, in 
   case the cron ever fails)
   - Meanwhile, my app's views request this data and get served the cached 
   data.
   - In my views, I wrap all usage of this data in named template fragment 
   caches
   - The cron endpoint clears both the high-level cache (which the views 
   call) and the template fragment caches
   - I use a site-wide cache for the entire app.

This kind of works, but occasionally results in blank spaces where the data 
should be -- running the cron job manually has no effect on this. 
Occasionally, experimenting with restarting the server, clearing the app's 
cache and immediately calling the cron endpoints can fix things. After an 
hour or so things tend to work themselves out but it seems that the 15 
minute updates don't actually reflect in the templates.

The StackOverflow advice I got seemed to suggest that using template 
fragment caching *inside* a sitewide cached view should respect the inner 
cache setting, but it doesn't seem to.
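
For reference, a minimal sketch of the cron-endpoint step described in the list 
above -- the cache key, timeout and API wrapper are all assumptions:

from django.core.cache import cache
from django.http import HttpResponse

def fetch_popular_from_analytics():
    # Placeholder for the slow Google Analytics API query described above.
    return []

def refresh_popular(request):
    data = fetch_popular_from_analytics()
    # Cache for an hour so the views keep working even if the cron job fails.
    cache.set('analytics_popular', data, 60 * 60)
    return HttpResponse('updated %d items' % len(data))

The template fragments (and any whole-page caches) built from this data still 
need their own invalidation, which is the crux of the question.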

Can anyone suggest a strategy for making this work reliably? Every time I 
deploy a new build of the app (and restart the server) it wipes the cache 
and pinging the update URL does nothing. I'm sure my logic is just a little 
wrong.

thanks
Matt





Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Yes, local app, local DB (which is an exact copy of the production DB). 
Interestingly, I just tried pointing the local app at the remote database 
and total query time (for 10 queries) went from 115ms to 3477ms! Really odd.

On Tuesday, January 22, 2013 7:54:53 PM UTC, Nikolas Stevenson-Molnar wrote:
>
>  When running locally, are you also using a local database instance? IIRC 
> the time that phpMyAdmin reports is the time taken for the database to 
> execute the query, whereas the time reported by the debug toolbar (I *
> think*) is complete round-trip (send query, execute, return results). So 
> if there is latency between your application and your database, that could 
> account for the discrepancy.
>
> The debug toolbar *does* add overhead to your request (about doubles the 
> page load time), but the numbers it reports should be more or less accurate 
> (iow a page might take 1s to load with the toolbar, but the toolbar will 
> report 500ms, which is accurate for load-time *without* the toolbar).
>
> _Nik
>
> On 1/22/2013 9:26 AM, Matt Andrews wrote:
>  
> Hi Nik, 
>
>  I see the discrepancy both locally and when deployed, too -- that's 
> what's so puzzling. If it helps, I'm using MySQL and running it on Windows 
> locally and under Passenger on my remote Linux server (also on MySQL).
>
>  Only other thing I can think of is that this "overhead" might literally 
> be the extra work Django Debug Toolbar does to show queries etc. Obviously 
> when I turn the toolbar off then I can't see (directly) the SQL times 
> Django is getting, but could be a factor? Either way, with the toolbar off 
> on production, there's definitely a slight "lag" before page requests start 
> to kick in -- maybe this is unrelated to the SQL, though.
>
>
> On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote: 
>>
>> Hi Matt, 
>>
>>  It's unlikely the problem lies with dictfetchall. The "small 
>> performance hit" mentioned by the docs probably refers to the fact that the 
>> function has a time complexity of (number of colulmns) * (number of rows). 
>> Since you're only getting 10 rows back, I can't see that having an 
>> appreciable impact.
>>
>>  I haven't seen this kind of variation between performing a direct 
>> query, and executing the query as part of a Django request… especially with 
>> such a basic query. Do you see the same difference in timing locally or is 
>> it only when deployed?
>>
>>  _Nik
>>
>>  On Jan 22, 2013, at 4:47 AM, Matt Andrews <ma...@mattandrews.info> 
>> wrote:
>>
>> That looks really useful - for future reference, great! Sadly my webhost 
>> is still bundling Django 1.2.3 :( 
>>
>>  In general, though, these issues still come down to the fact that a SQL 
>> query as executed by Django takes several times longer than going directly 
>> to the DB. Is this down to my use of dictfetchall() or something more low 
>> level?
>>
>>
>> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote: 
>>>
>>> Hi, 
>>>
>>> I see that you had quite a bunch of m2m keys. 
>>>
>>> Have you tried recent version of Django (1.4+) with prefetch_related() 
>>> [1] ? 
>>>
>>>
>>> [1] 
>>> <https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related> 
>>>  
>>>
>>>
>>>
>>> 22.1.2013 13:57, Matt Andrews kirjoitti: 
>>> > Hi Jani, 
>>> > 
>>> > I made a StackOverflow post last year with an example of the ORM stuff 
>>> I 
>>> > tried and the poor queries it produced: 
>>> > 
>>> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>>>  
>>> > 
>>> > There's also this discussion about how using the same queryset in two 
>>> > places in the template caused Django to request its data twice: 
>>> > 
>>> > http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once 
>>> > (this was easily fixed, but again, not very intuitive and frustrated me) 
>>> > 
>>> > I don't have the data to hand currently but I also remember seeing 
>>> weird 
>>> > things happen where queries would end with stuff like "... LIMIT 
>>> > 234423445" (or some crazy number which I'd never entered and was 
>>> orders 
>>> > of magnitude bigger than the number of rows in the table). 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Nik,

I see the discrepancy both locally and when deployed, too -- that's what's 
so puzzling. If it helps, I'm using MySQL and running it on Windows locally 
and under Passenger on my remote Linux server (also on MySQL).

Only other thing I can think of is that this "overhead" might literally be 
the extra work Django Debug Toolbar does to show queries etc. Obviously 
when I turn the toolbar off then I can't see (directly) the SQL times 
Django is getting, but could be a factor? Either way, with the toolbar off 
on production, there's definitely a slight "lag" before page requests start 
to kick in -- maybe this is unrelated to the SQL, though.


On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote:
>
> Hi Matt,
>
> It's unlikely the problem lies with dictfetchall. The "small performance 
> hit" mentioned by the docs probably refers to the fact that the function 
> has a time complexity of (number of columns) * (number of rows). Since 
> you're only getting 10 rows back, I can't see that having an appreciable 
> impact.
>
> I haven't seen this kind of variation between performing a direct query, 
> and executing the query as part of a Django request… especially with such a 
> basic query. Do you see the same difference in timing locally or is it only 
> when deployed?
>
> _Nik
>
> On Jan 22, 2013, at 4:47 AM, Matt Andrews 
> <ma...@mattandrews.info> 
> wrote:
>
> That looks really useful - for future reference, great! Sadly my webhost 
> is still bundling Django 1.2.3 :(
>
> In general, though, these issues still come down to the fact that a SQL 
> query as executed by Django takes several times longer than going directly 
> to the DB. Is this down to my use of dictfetchall() or something more low 
> level?
>
>
> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
>>
>> Hi, 
>>
>> I see that you had quite a bunch of m2m keys. 
>>
>> Have you tried recent version of Django (1.4+) with prefetch_related() 
>> [1] ? 
>>
>>
>> [1] 
>> <https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related> 
>>  
>>
>>
>>
>> 22.1.2013 13:57, Matt Andrews kirjoitti: 
>> > Hi Jani, 
>> > 
>> > I made a StackOverflow post last year with an example of the ORM stuff 
>> I 
>> > tried and the poor queries it produced: 
>> > 
>> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>>  
>> > 
>> > There's also this discussion about how using the same queryset in two 
>> > places in the template caused Django to request its data twice: 
>> > 
>> > http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once 
>> > (this was easily fixed, but again, not very intuitive and frustrated me) 
>> > 
>> > I don't have the data to hand currently but I also remember seeing 
>> weird 
>> > things happen where queries would end with stuff like "... LIMIT 
>> > 234423445" (or some crazy number which I'd never entered and was orders 
>> > of magnitude bigger than the number of rows in the table). 
>> > 
>> > I'm aware these are probably edge cases that are down to my own novice 
>> > status, but even using things like select_related(), it wasn't doing 
>> > what I wanted. I just felt it easier to use my existing SQL (I'm 
>> > converting a PHP app over to Django) and I'm not concerned about 
>> > database portability (switching to postgres or whatever). 
>> > 
>> > Nik: just realised I missed your final question. For the SQL posted 
>> > above, the numbers are approximately: 12,000 rows in the `news` table, 
>> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
>> > `images`. All properly indexed and with sensible column types. 
>> > 
>> > On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote: 
>> > 
>> > Hi, 
>> > 
>> >   From your raw SQL I saw you're doing few joins. So I suppose you 
>> do 
>> > quite a few foreign key fetches. 
>> > 
>> > You didn't mention anything how you originally tried to solve case 
>> with 
>> > ORM. Could you please publish what you had when things were slow? 
>> > 
>> > 22.1.2013 12:26, Matt Andrews kirjoitti: 
>> >  > Hi Nik, 
>> >  > 
>> >  > Thanks - I do feel like by circumventing the ORM I've just 
>>

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
That looks really useful - for future reference, great! Sadly my webhost is 
still bundling Django 1.2.3 :(

In general, though, these issues still come down to the fact that a SQL 
query as executed by Django takes several times longer than going directly 
to the DB. Is this down to my use of dictfetchall() or something more low 
level?
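
One way to check whether dictfetchall() itself is the cost, rather than 
connection overhead or the toolbar, is to time the two stages separately from a 
Django shell; a rough sketch with an invented helper name:

import time
from django.db import connection

def time_query(sql):
    # Separate the execute/fetch time from the dict-conversion time.
    cursor = connection.cursor()
    t0 = time.time()
    cursor.execute(sql)
    rows = cursor.fetchall()
    t1 = time.time()
    columns = [col[0] for col in cursor.description]
    result = [dict(zip(columns, row)) for row in rows]
    t2 = time.time()
    print('execute+fetch: %.1fms, dict conversion: %.1fms'
          % ((t1 - t0) * 1000, (t2 - t1) * 1000))
    return result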


On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
>
> Hi, 
>
> I see that you had quite a bunch of m2m keys. 
>
> Have you tried recent version of Django (1.4+) with prefetch_related() [1] 
> ? 
>
>
> [1] 
> <https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related> 
>
>
>
> 22.1.2013 13:57, Matt Andrews kirjoitti: 
> > Hi Jani, 
> > 
> > I made a StackOverflow post last year with an example of the ORM stuff I 
> > tried and the poor queries it produced: 
> > 
> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>  
> > 
> > There's also this discussion about how using the same queryset in two 
> > places in the template caused Django to request its data twice: 
> > 
> > http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once 
> > (this was easily fixed, but again, not very intuitive and frustrated me) 
> > 
> > I don't have the data to hand currently but I also remember seeing weird 
> > things happen where queries would end with stuff like "... LIMIT 
> > 234423445" (or some crazy number which I'd never entered and was orders 
> > of magnitude bigger than the number of rows in the table). 
> > 
> > I'm aware these are probably edge cases that are down to my own novice 
> > status, but even using things like select_related(), it wasn't doing 
> > what I wanted. I just felt it easier to use my existing SQL (I'm 
> > converting a PHP app over to Django) and I'm not concerned about 
> > database portability (switching to postgres or whatever). 
> > 
> > Nik: just realised I missed your final question. For the SQL posted 
> > above, the numbers are approximately: 12,000 rows in the `news` table, 
> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
> > `images`. All properly indexed and with sensible column types. 
> > 
> > On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote: 
> > 
> > Hi, 
> > 
> >   From your raw SQL I saw you're doing few joins. So I suppose you 
> do 
> > quite a few foreign key fetches. 
> > 
> > You didn't mention anything how you originally tried to solve case 
> with 
> > ORM. Could you please publish what you had when things were slow? 
> > 
> > 22.1.2013 12:26, Matt Andrews kirjoitti: 
> >  > Hi Nik, 
> >  > 
> >  > Thanks - I do feel like by circumventing the ORM I've just "given 
> > up" 
> >  > and perhaps I'll reconsider -- none of my queries are 
> particularly 
> >  > "specialist" (as the sample above indicates) - I just found 
> Django 
> >  > generating odd things. 
> >  > 
> >  > To answer your questions: 
> >  > 
> >  > 1. Yes, reloading the page sees the same time for the queries (it 
> > just 
> >  > feels as though the entire process takes a second or two to 
> > start, which 
> >  > is perhaps not related to SQL itself). 
> >  > 
> >  > 2. I believe so, yes (it's shared hosting...). 
> >  > 
> >  > If you're curious, you can see a sample of the app at 
> >  > http://beta.scenepointblank.com <http://beta.scenepointblank.com> 
>
> > (obviously you won't see the SQL, but 
> >  > the "delay" between pages, even though these pages are all cached 
> > for 
> >  > 2hrs+, is partly my concern here). 
> >  > 
> >  > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas 
> > Stevenson-Molnar wrote: 
> >  > 
> >  > Hi Matt, 
> >  > 
> >  > Firstly, I encourage you to have another crack a the ORM. It 
> can 
> >  > certainly seem a bit aggravating at times if you're coming 
> > from a 
> >  > SQL mindset, but really pays off down the road in terms of 
> >  > maintainability and readability. Typically you should only 
> > need raw 
> >  > queries in Django in cases where you have super-specialized 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Jani,

I made a StackOverflow post last year with an example of the ORM stuff I 
tried and the poor queries it produced: 
http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany

There's also this discussion about how using the same queryset in two 
places in the template caused Django to request its data twice: 
http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once 
(this was easily fixed, but again, not very intuitive and frustrated me)

I don't have the data to hand currently but I also remember seeing weird 
things happen where queries would end with stuff like "... LIMIT 234423445" 
(or some crazy number which I'd never entered and was orders of magnitude 
bigger than the number of rows in the table).

I'm aware these are probably edge cases that are down to my own novice 
status, but even using things like select_related(), it wasn't doing what I 
wanted. I just felt it easier to use my existing SQL (I'm converting a PHP 
app over to Django) and I'm not concerned about database portability 
(switching to postgres or whatever). 

Nik: just realised I missed your final question. For the SQL posted above, 
the numbers are approximately: 12,000 rows in the `news` table, maybe 10 
`news_category` rows, about 100 `writers` and around 3000 `images`. All 
properly indexed and with sensible column types. 
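
For what it's worth, given those foreign keys the homepage query from the other 
thread maps fairly directly onto select_related(); the model and field names 
below are guesses from the raw SQL, so treat this as a sketch:

from myapp.models import News  # hypothetical app/model name

# Follows the news_category, writer and image foreign keys in a single query,
# mirroring the JOINs in the raw SQL, and limits to 10 rows via the slice.
latest_news = (News.objects
               .select_related('news_category', 'writer', 'image')
               .order_by('-is_sticky', '-date_posted')[:10])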

On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote:
>
> Hi, 
>
>  From your raw SQL I saw you're doing a few joins. So I suppose you do 
> quite a few foreign key fetches. 
>
> You didn't mention anything how you originally tried to solve case with 
> ORM. Could you please publish what you had when things were slow? 
>
> 22.1.2013 12:26, Matt Andrews kirjoitti: 
> > Hi Nik, 
> > 
> > Thanks - I do feel like by circumventing the ORM I've just "given up" 
> > and perhaps I'll reconsider -- none of my queries are particularly 
> > "specialist" (as the sample above indicates) - I just found Django 
> > generating odd things. 
> > 
> > To answer your questions: 
> > 
> > 1. Yes, reloading the page sees the same time for the queries (it just 
> > feels as though the entire process takes a second or two to start, which 
> > is perhaps not related to SQL itself). 
> > 
> > 2. I believe so, yes (it's shared hosting...). 
> > 
> > If you're curious, you can see a sample of the app at 
> > http://beta.scenepointblank.com (obviously you won't see the SQL, but 
> > the "delay" between pages, even though these pages are all cached for 
> > 2hrs+, is partly my concern here). 
> > 
> > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas Stevenson-Molnar 
> wrote: 
> > 
> > Hi Matt, 
> > 
> > Firstly, I encourage you to have another crack at the ORM. It can 
> > certainly seem a bit aggravating at times if you're coming from a 
> > SQL mindset, but really pays off down the road in terms of 
> > maintainability and readability. Typically you should only need raw 
> > queries in Django in cases where you have super-specialized (that 
> > uses views or non-standard functions) queries or need some specific 
> > optimization. If there's really no way to perform many of your 
> > "day-to-day" queries with the ORM then that's an indication that a 
> > different database design may fit your data model better. I 
> > understand that you may have a unique situation, but I just wanted 
> > to throw that out there as I personally find the ORM to be a huge 
> > time saver. 
> > 
> > Now, with that out of the way... a couple of considerations: 1) you 
> > say it's a slow "startup"; if you refresh the page do the queries 
> > run just as slow the second time around? and 2) are your Django app 
> > and phpMyAdmin running on the same machine? If not, could transit 
> > time be an issue? 
> > 
> > Finally, can you give an idea about the size of the tables in 
> > question? How many rows in each? 
> > 
> > _Nik 
> > 
> > On 1/21/2013 3:25 PM, Matt Andrews wrote: 
> >> Hi all, 
> >> 
> >> Fairly new to Django. I ended up pulling out all of the 
> >> ORM-generated queries and writing my own SQL directly (I got fed 
> >> up trying to work out how to achieve the kind of things I needed 
> >> without Django adding in extra joins or unintended WHERE clauses 
> >> etc). All my app's SQL uses cursor.execute() and the 
> >> dictfetchall(

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Nik,

Thanks - I do feel like by circumventing the ORM I've just "given up" and 
perhaps I'll reconsider -- none of my queries are particularly "specialist" 
(as the sample above indicates) - I just found Django generating odd things.

To answer your questions:

1. Yes, reloading the page sees the same time for the queries (it just 
feels as though the entire process takes a second or two to start, which is 
perhaps not related to SQL itself).

2. I believe so, yes (it's shared hosting...).

If you're curious, you can see a sample of the app at 
http://beta.scenepointblank.com (obviously you won't see the SQL, but the 
"delay" between pages, even though these pages are all cached for 2hrs+, is 
partly my concern here).

On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas Stevenson-Molnar wrote:
>
>  Hi Matt,
>
> Firstly, I encourage you to have another crack at the ORM. It can certainly 
> seem a bit aggravating at times if you're coming from a SQL mindset, but 
> really pays off down the road in terms of maintainability and readability. 
> Typically you should only need raw queries in Django in cases where you 
> have super-specialized (that uses views or non-standard functions) queries 
> or need some specific optimization. If there's really no way to perform 
> many of your "day-to-day" queries with the ORM then that's an indication 
> that a different database design may fit your data model better. I 
> understand that you may have a unique situation, but I just wanted to throw 
> that out there as I personally find the ORM to be a huge time saver.
>
> Now, with that out of the way... a couple of considerations: 1) you say 
> it's a slow "startup"; if you refresh the page do the queries run just as 
> slow the second time around? and 2) are your Django app and phpMyAdmin 
> running on the same machine? If not, could transit time be an issue?
>
> Finally, can you give an idea about the size of the tables in question? 
> How many rows in each?
>
> _Nik
>
> On 1/21/2013 3:25 PM, Matt Andrews wrote:
>  
> Hi all, 
>
>  Fairly new to Django. I ended up pulling out all of the ORM-generated 
> queries and writing my own SQL directly (I got fed up trying to work out 
> how to achieve the kind of things I needed without Django adding in extra 
> joins or unintended WHERE clauses etc). All my app's SQL uses 
> cursor.execute() and the dictfetchall() method as referenced 
> here <https://docs.djangoproject.com/en/dev/topics/db/sql/#django.db.models.Manager.raw>. 
>
>  I've found that my app incurs a couple of seconds load time in 
> production, with CPU time at 2532ms and overall time 4684ms (according to 
> the debug toolbar). I'm seeing 8 SQL queries take 380ms, and each one seems 
> to be several times slower when made by Django versus hitting the database 
> through phpMyAdmin or something: eg, this query:
>
>  SELECT * FROM news 
> JOIN news_categories ON news.news_category_id = news_categories.id 
> LEFT JOIN writers ON news.writer_id = writers.id 
> LEFT JOIN images ON news.image_id = images.id 
> ORDER BY news.is_sticky DESC, news.date_posted DESC 
> LIMIT 10
>
>
>  This takes 14.8ms when run in phpMyAdmin (against the production 
> database) but Django reports it as 85.2ms. The same ratios are true for all 
> my other queries.
>
>  All I can think of is the note on the dictfetchall() method in the 
> Django docs which describes a "small performance hit". Is this it?!
>
>  I've profiled the app too, although I'm a bit hazy about what it all 
> means. Here's a dump of the result: 
> http://pastebin.com/raw.php?i=UHE9edVC (this is from running on my local 
> server rather than production but performance is broadly similar).
>
>  Can anyone help me? I realise I've perhaps gone off-piste by writing raw 
> SQL but I feel it was justified.
>
>  thanks,
> Matt
>
>  
>  




Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-21 Thread Matt Andrews
Hi all,

Fairly new to Django. I ended up pulling out all of the ORM-generated 
queries and writing my own SQL directly (I got fed up trying to work out 
how to achieve the kind of things I needed without Django adding in extra 
joins or unintended WHERE clauses etc). All my app's SQL uses 
cursor.execute() and the dictfetchall() method as referenced here: 
https://docs.djangoproject.com/en/dev/topics/db/sql/#django.db.models.Manager.raw. 
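
For readers who haven't seen it, dictfetchall() is the small helper from that 
docs page, roughly:

def dictfetchall(cursor):
    # Return all rows from a cursor as a list of dicts keyed by column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]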

I've found that my app incurs a couple of seconds load time in production, 
with CPU time at 2532ms and overall time 4684ms (according to the debug 
toolbar). I'm seeing 8 SQL queries take 380ms, and each one seems to be 
several times slower when made by Django versus hitting the database 
through phpMyAdmin or something: eg, this query:

SELECT * FROM news 
JOIN news_categories ON news.news_category_id = news_categories.id 
LEFT JOIN writers ON news.writer_id = writers.id 
LEFT JOIN images ON news.image_id = images.id 
ORDER BY news.is_sticky DESC, news.date_posted DESC 
LIMIT 10


This takes 14.8ms when run in phpMyAdmin (against the production database) 
but Django reports it as 85.2ms. The same ratios are true for all my other 
queries.

All I can think of is the note on the dictfetchall() method in the Django 
docs which describes a "small performance hit". Is this it?!

I've profiled the app too, although I'm a bit hazy about what it all means. 
Here's a dump of the result: http://pastebin.com/raw.php?i=UHE9edVC (this 
is from running on my local server rather than production but performance 
is broadly similar).

Can anyone help me? I realise I've perhaps gone off-piste by writing raw 
SQL but I feel it was justified.

thanks,
Matt

