Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Yes, local app, local DB (which is an exact copy of the production DB). 
Interestingly, I just tried pointing the local app at the remote database 
and total query time (for 10 queries) went from 115ms to 3477ms! Really odd.

On Tuesday, January 22, 2013 7:54:53 PM UTC, Nikolas Stevenson-Molnar wrote:
>
>  When running locally, are you also using a local database instance? IIRC 
> the time that phpMyAdmin reports is the time taken for the database to 
> execute the query, whereas the time reported by the debug toolbar (I *
> think*) is complete round-trip (send query, execute, return results). So 
> if there is latency between your application and your database, that could 
> account for the discrepancy.
>
> The debug toolbar *does* add overhead to your request (about doubles the 
> page load time), but the numbers it reports should be more or less accurate 
> (iow a page might take 1s to load with the toolbar, but the toolbar will 
> report 500ms, which is accurate for load-time *without* the toolbar).
>
> _Nik
>
> On 1/22/2013 9:26 AM, Matt Andrews wrote:
>  
> Hi Nik, 
>
>  I see the discrepancy both locally and when deployed, too -- that's 
> what's so puzzling. If it helps, I'm using MySQL and running it on Windows 
> locally and under Passenger on my remote Linux server (also on MySQL).
>
>  Only other thing I can think of is that this "overhead" might literally 
> be the extra work Django Debug Toolbar does to show queries etc. Obviously 
> when I turn the toolbar off then I can't see (directly) the SQL times 
> Django is getting, but could be a factor? Either way, with the toolbar off 
> on production, there's definitely a slight "lag" before page requests start 
> to kick in -- maybe this is unrelated to the SQL, though.
>
>
> On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote: 
>>
>> Hi Matt, 
>>
>>  It's unlikely the problem lies with dictfetchall. The "small 
>> performance hit" mentioned by the docs probably refers to the fact that the 
>> function has a time complexity of (number of colulmns) * (number of rows). 
>> Since you're only getting 10 rows back, I can't see that having an 
>> appreciable impact.
>>
>>  I haven't seen this kind of variation between performing a direct 
>> query, and executing the query as part of a Django request… especially with 
>> such a basic query. Do you see the same difference in timing locally or is 
>> it only when deployed?
>>
>>  _Nik
>>
>>  On Jan 22, 2013, at 4:47 AM, Matt Andrews  
>> wrote:
>>
>> That looks really useful - for future reference, great! Sadly my webhost 
>> is still bundling Django 1.2.3 :( 
>>
>>  In general, though, these issues still come down to the fact that a SQL 
>> query as executed by Django takes several times longer than going directly 
>> to the DB. Is this down to my use of dictfetchall() or something more low 
>> level?
>>
>>
>> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote: 
>>>
>>> Hi, 
>>>
>>> I see that you had quite a bunch of m2m keys. 
>>>
>>> Have you tried recent version of Django (1.4+) with prefetch_related() 
>>> [1] ? 
>>>
>>>
>>> [1] 
>>> <
>>> https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related>
>>>  
>>>
>>>
>>>
>>> 22.1.2013 13:57, Matt Andrews kirjoitti: 
>>> > Hi Jani, 
>>> > 
>>> > I made a StackOverflow post last year with an example of the ORM stuff 
>>> I 
>>> > tried and the poor queries it produced: 
>>> > 
>>> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>>>  
>>> > 
>>> > There's also this discussion about how using the same queryset in two 
>>> > places in the template caused Django to request its data twice: 
>>> > 
>>> http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once(this
>>>  
>>> > was easily fixed, but again, not very intuitive and frustrated me) 
>>> > 
>>> > I don't have the data to hand currently but I also remember seeing 
>>> weird 
>>> > things happen where queries would end with stuff like "... LIMIT 
>>> > 234423445" (or some crazy number which I'd never entered and was 
>>> orders 
>>> > of magnitude bigger than the number of rows in the table). 
>>> > 
>>> > I'm aware these are probably edge cases that are down to my own novice 
>>> > status, but even using things like select_related(), it wasn't doing 
>>> > what I wanted. I just felt it easier to use my existing SQL (I'm 
>>> > converting a PHP app over to Django) and I'm not concerned about 
>>> > database portability (switching to postgres or whatever). 
>>> > 
>>> > Nik: just realised I missed your final question. For the SQL posted 
>>> > above, the numbers are approximately: 12,000 rows in the `news` table, 
>>> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
>>> > `images`. All properly indexed and with sensible column types. 
>>> > 
>>> > On Tuesday, January 22, 2013 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Nikolas Stevenson-Molnar
When running locally, are you also using a local database instance? IIRC
the time that phpMyAdmin reports is the time taken for the database to
execute the query, whereas the time reported by the debug toolbar (I
/think/) is complete round-trip (send query, execute, return results).
So if there is latency between your application and your database, that
could account for the discrepancy.

The debug toolbar /does/ add overhead to your request (about doubles the
page load time), but the numbers it reports should be more or less
accurate (iow a page might take 1s to load with the toolbar, but the
toolbar will report 500ms, which is accurate for load-time /without/ the
toolbar).

_Nik

On 1/22/2013 9:26 AM, Matt Andrews wrote:
> Hi Nik,
>
> I see the discrepancy both locally and when deployed, too -- that's
> what's so puzzling. If it helps, I'm using MySQL and running it on
> Windows locally and under Passenger on my remote Linux server (also on
> MySQL).
>
> Only other thing I can think of is that this "overhead" might
> literally be the extra work Django Debug Toolbar does to show queries
> etc. Obviously when I turn the toolbar off then I can't see (directly)
> the SQL times Django is getting, but could be a factor? Either way,
> with the toolbar off on production, there's definitely a slight "lag"
> before page requests start to kick in -- maybe this is unrelated to
> the SQL, though.
>
>
> On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote:
>
> Hi Matt,
>
> It's unlikely the problem lies with dictfetchall. The "small
> performance hit" mentioned by the docs probably refers to the fact
> that the function has a time complexity of (number of colulmns) *
> (number of rows). Since you're only getting 10 rows back, I can't
> see that having an appreciable impact.
>
> I haven't seen this kind of variation between performing a direct
> query, and executing the query as part of a Django request...
> especially with such a basic query. Do you see the same difference
> in timing locally or is it only when deployed?
>
> _Nik
>
> On Jan 22, 2013, at 4:47 AM, Matt Andrews  > wrote:
>
>> That looks really useful - for future reference, great! Sadly my
>> webhost is still bundling Django 1.2.3 :(
>>
>> In general, though, these issues still come down to the fact that
>> a SQL query as executed by Django takes several times longer than
>> going directly to the DB. Is this down to my use of
>> dictfetchall() or something more low level?
>>
>>
>> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
>>
>> Hi,
>>
>> I see that you had quite a bunch of m2m keys.
>>
>> Have you tried recent version of Django (1.4+) with
>> prefetch_related() [1] ?
>>
>>
>> [1]
>> 
>> > 
>> >
>>
>>
>>
>> 22.1.2013 13:57, Matt Andrews kirjoitti:
>> > Hi Jani,
>> >
>> > I made a StackOverflow post last year with an example of
>> the ORM stuff I
>> > tried and the poor queries it produced:
>> >
>> 
>> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>> 
>> 
>>
>> >
>> > There's also this discussion about how using the same
>> queryset in two
>> > places in the template caused Django to request its data
>> twice:
>> >
>> 
>> http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once
>> 
>> 
>> (this
>> > was easily fixed, but again, not very intuitive and
>> frustrated me)
>> >
>> > I don't have the data to hand currently but I also remember
>> seeing weird
>> > things happen where queries would end with stuff like "...
>> LIMIT
>> > 234423445" (or some crazy number which I'd never entered
>> and was orders
>> > of magnitude bigger than the number of rows in the table).
>> >
>> > I'm aware these are probably edge cases that are down to my
>> own novice
>> > status, but even using things like select_related(), it
>> wasn't doing
>> > what I wanted. I just felt it easier to use my existing SQL
>> (I'm
>> > converting a PHP app over to Django) and I'm not concerned
>> about
>> > database portability (switching to postgres or whatever).
>>   

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Nik,

I see the discrepancy both locally and when deployed, too -- that's what's 
so puzzling. If it helps, I'm using MySQL and running it on Windows locally 
and under Passenger on my remote Linux server (also on MySQL).

Only other thing I can think of is that this "overhead" might literally be 
the extra work Django Debug Toolbar does to show queries etc. Obviously 
when I turn the toolbar off then I can't see (directly) the SQL times 
Django is getting, but could be a factor? Either way, with the toolbar off 
on production, there's definitely a slight "lag" before page requests start 
to kick in -- maybe this is unrelated to the SQL, though.


On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote:
>
> Hi Matt,
>
> It's unlikely the problem lies with dictfetchall. The "small performance 
> hit" mentioned by the docs probably refers to the fact that the function 
> has a time complexity of (number of colulmns) * (number of rows). Since 
> you're only getting 10 rows back, I can't see that having an appreciable 
> impact.
>
> I haven't seen this kind of variation between performing a direct query, 
> and executing the query as part of a Django request… especially with such a 
> basic query. Do you see the same difference in timing locally or is it only 
> when deployed?
>
> _Nik
>
> On Jan 22, 2013, at 4:47 AM, Matt Andrews 
>  
> wrote:
>
> That looks really useful - for future reference, great! Sadly my webhost 
> is still bundling Django 1.2.3 :(
>
> In general, though, these issues still come down to the fact that a SQL 
> query as executed by Django takes several times longer than going directly 
> to the DB. Is this down to my use of dictfetchall() or something more low 
> level?
>
>
> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
>>
>> Hi, 
>>
>> I see that you had quite a bunch of m2m keys. 
>>
>> Have you tried recent version of Django (1.4+) with prefetch_related() 
>> [1] ? 
>>
>>
>> [1] 
>> <
>> https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related>
>>  
>>
>>
>>
>> 22.1.2013 13:57, Matt Andrews kirjoitti: 
>> > Hi Jani, 
>> > 
>> > I made a StackOverflow post last year with an example of the ORM stuff 
>> I 
>> > tried and the poor queries it produced: 
>> > 
>> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>>  
>> > 
>> > There's also this discussion about how using the same queryset in two 
>> > places in the template caused Django to request its data twice: 
>> > 
>> http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once(this
>>  
>> > was easily fixed, but again, not very intuitive and frustrated me) 
>> > 
>> > I don't have the data to hand currently but I also remember seeing 
>> weird 
>> > things happen where queries would end with stuff like "... LIMIT 
>> > 234423445" (or some crazy number which I'd never entered and was orders 
>> > of magnitude bigger than the number of rows in the table). 
>> > 
>> > I'm aware these are probably edge cases that are down to my own novice 
>> > status, but even using things like select_related(), it wasn't doing 
>> > what I wanted. I just felt it easier to use my existing SQL (I'm 
>> > converting a PHP app over to Django) and I'm not concerned about 
>> > database portability (switching to postgres or whatever). 
>> > 
>> > Nik: just realised I missed your final question. For the SQL posted 
>> > above, the numbers are approximately: 12,000 rows in the `news` table, 
>> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
>> > `images`. All properly indexed and with sensible column types. 
>> > 
>> > On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote: 
>> > 
>> > Hi, 
>> > 
>> >   From your raw SQL I saw you're doing few joins. So I suppose you 
>> do 
>> > quite a few foreign key fetches. 
>> > 
>> > You didn't mention anything how you originally tried to solve case 
>> with 
>> > ORM. Could you please publish what you had when things were slow? 
>> > 
>> > 22.1.2013 12:26, Matt Andrews kirjoitti: 
>> >  > Hi Nik, 
>> >  > 
>> >  > Thanks - I do feel like by circumventing the ORM I've just 
>> "given 
>> > up" 
>> >  > and perhaps I'll reconsider -- none of my queries are 
>> particularly 
>> >  > "specialist" (as the sample above indicates) - I just found 
>> Django 
>> >  > generating odd things. 
>> >  > 
>> >  > To answer your questions: 
>> >  > 
>> >  > 1. Yes, reloading the page sees the same time for the queries 
>> (it 
>> > just 
>> >  > feels as though the entire process takes a second or two to 
>> > start, which 
>> >  > is perhaps not related to SQL itself). 
>> >  > 
>> >  > 2. I believe so, yes (it's shared hosting...). 
>> >  > 
>> >  > If you're curious, you can see a sample 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Nikolas Stevenson-Molnar
Hi Matt,

It's unlikely the problem lies with dictfetchall. The "small performance hit" 
mentioned by the docs probably refers to the fact that the function has a time 
complexity of (number of colulmns) * (number of rows). Since you're only 
getting 10 rows back, I can't see that having an appreciable impact.

I haven't seen this kind of variation between performing a direct query, and 
executing the query as part of a Django request… especially with such a basic 
query. Do you see the same difference in timing locally or is it only when 
deployed?

_Nik

On Jan 22, 2013, at 4:47 AM, Matt Andrews  wrote:

> That looks really useful - for future reference, great! Sadly my webhost is 
> still bundling Django 1.2.3 :(
> 
> In general, though, these issues still come down to the fact that a SQL query 
> as executed by Django takes several times longer than going directly to the 
> DB. Is this down to my use of dictfetchall() or something more low level?
> 
> 
> On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
> Hi, 
> 
> I see that you had quite a bunch of m2m keys. 
> 
> Have you tried recent version of Django (1.4+) with prefetch_related() [1] ? 
> 
> 
> [1] 
> 
>  
> 
> 
> 22.1.2013 13:57, Matt Andrews kirjoitti: 
> > Hi Jani, 
> > 
> > I made a StackOverflow post last year with an example of the ORM stuff I 
> > tried and the poor queries it produced: 
> > http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
> >  
> > 
> > There's also this discussion about how using the same queryset in two 
> > places in the template caused Django to request its data twice: 
> > http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once
> >  (this 
> > was easily fixed, but again, not very intuitive and frustrated me) 
> > 
> > I don't have the data to hand currently but I also remember seeing weird 
> > things happen where queries would end with stuff like "... LIMIT 
> > 234423445" (or some crazy number which I'd never entered and was orders 
> > of magnitude bigger than the number of rows in the table). 
> > 
> > I'm aware these are probably edge cases that are down to my own novice 
> > status, but even using things like select_related(), it wasn't doing 
> > what I wanted. I just felt it easier to use my existing SQL (I'm 
> > converting a PHP app over to Django) and I'm not concerned about 
> > database portability (switching to postgres or whatever). 
> > 
> > Nik: just realised I missed your final question. For the SQL posted 
> > above, the numbers are approximately: 12,000 rows in the `news` table, 
> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
> > `images`. All properly indexed and with sensible column types. 
> > 
> > On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote: 
> > 
> > Hi, 
> > 
> >   From your raw SQL I saw you're doing few joins. So I suppose you do 
> > quite a few foreign key fetches. 
> > 
> > You didn't mention anything how you originally tried to solve case with 
> > ORM. Could you please publish what you had when things were slow? 
> > 
> > 22.1.2013 12:26, Matt Andrews kirjoitti: 
> >  > Hi Nik, 
> >  > 
> >  > Thanks - I do feel like by circumventing the ORM I've just "given 
> > up" 
> >  > and perhaps I'll reconsider -- none of my queries are particularly 
> >  > "specialist" (as the sample above indicates) - I just found Django 
> >  > generating odd things. 
> >  > 
> >  > To answer your questions: 
> >  > 
> >  > 1. Yes, reloading the page sees the same time for the queries (it 
> > just 
> >  > feels as though the entire process takes a second or two to 
> > start, which 
> >  > is perhaps not related to SQL itself). 
> >  > 
> >  > 2. I believe so, yes (it's shared hosting...). 
> >  > 
> >  > If you're curious, you can see a sample of the app at 
> >  > http://beta.scenepointblank.com  
> > (obviously you won't see the SQL, but 
> >  > the "delay" between pages, even though these pages are all cached 
> > for 
> >  > 2hrs+, is partly my concern here). 
> >  > 
> >  > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas 
> > Stevenson-Molnar wrote: 
> >  > 
> >  > Hi Matt, 
> >  > 
> >  > Firstly, I encourage you to have another crack a the ORM. It can 
> >  > certainly seem a bit aggravating at times if you're coming 
> > from a 
> >  > SQL mindset, but really pays off down the road in terms of 
> >  > maintainability and readability. Typically you should only 
> > need raw 
> >  > queries in Django in cases where you have super-specialized 
> > (that 
> >  > uses 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
That looks really useful - for future reference, great! Sadly my webhost is 
still bundling Django 1.2.3 :(

In general, though, these issues still come down to the fact that a SQL 
query as executed by Django takes several times longer than going directly 
to the DB. Is this down to my use of dictfetchall() or something more low 
level?


On Tuesday, January 22, 2013 12:09:50 PM UTC, Jani Tiainen wrote:
>
> Hi, 
>
> I see that you had quite a bunch of m2m keys. 
>
> Have you tried recent version of Django (1.4+) with prefetch_related() [1] 
> ? 
>
>
> [1] 
> <
> https://docs.djangoproject.com/en/1.4/ref/models/querysets/#prefetch-related> 
>
>
>
> 22.1.2013 13:57, Matt Andrews kirjoitti: 
> > Hi Jani, 
> > 
> > I made a StackOverflow post last year with an example of the ORM stuff I 
> > tried and the poor queries it produced: 
> > 
> http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany
>  
> > 
> > There's also this discussion about how using the same queryset in two 
> > places in the template caused Django to request its data twice: 
> > 
> http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once(this
>  
> > was easily fixed, but again, not very intuitive and frustrated me) 
> > 
> > I don't have the data to hand currently but I also remember seeing weird 
> > things happen where queries would end with stuff like "... LIMIT 
> > 234423445" (or some crazy number which I'd never entered and was orders 
> > of magnitude bigger than the number of rows in the table). 
> > 
> > I'm aware these are probably edge cases that are down to my own novice 
> > status, but even using things like select_related(), it wasn't doing 
> > what I wanted. I just felt it easier to use my existing SQL (I'm 
> > converting a PHP app over to Django) and I'm not concerned about 
> > database portability (switching to postgres or whatever). 
> > 
> > Nik: just realised I missed your final question. For the SQL posted 
> > above, the numbers are approximately: 12,000 rows in the `news` table, 
> > maybe 10 `news_category` rows, about 100 `writers` and around 3000 
> > `images`. All properly indexed and with sensible column types. 
> > 
> > On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote: 
> > 
> > Hi, 
> > 
> >   From your raw SQL I saw you're doing few joins. So I suppose you 
> do 
> > quite a few foreign key fetches. 
> > 
> > You didn't mention anything how you originally tried to solve case 
> with 
> > ORM. Could you please publish what you had when things were slow? 
> > 
> > 22.1.2013 12:26, Matt Andrews kirjoitti: 
> >  > Hi Nik, 
> >  > 
> >  > Thanks - I do feel like by circumventing the ORM I've just "given 
> > up" 
> >  > and perhaps I'll reconsider -- none of my queries are 
> particularly 
> >  > "specialist" (as the sample above indicates) - I just found 
> Django 
> >  > generating odd things. 
> >  > 
> >  > To answer your questions: 
> >  > 
> >  > 1. Yes, reloading the page sees the same time for the queries (it 
> > just 
> >  > feels as though the entire process takes a second or two to 
> > start, which 
> >  > is perhaps not related to SQL itself). 
> >  > 
> >  > 2. I believe so, yes (it's shared hosting...). 
> >  > 
> >  > If you're curious, you can see a sample of the app at 
> >  > http://beta.scenepointblank.com  
>
> > (obviously you won't see the SQL, but 
> >  > the "delay" between pages, even though these pages are all cached 
> > for 
> >  > 2hrs+, is partly my concern here). 
> >  > 
> >  > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas 
> > Stevenson-Molnar wrote: 
> >  > 
> >  > Hi Matt, 
> >  > 
> >  > Firstly, I encourage you to have another crack a the ORM. It 
> can 
> >  > certainly seem a bit aggravating at times if you're coming 
> > from a 
> >  > SQL mindset, but really pays off down the road in terms of 
> >  > maintainability and readability. Typically you should only 
> > need raw 
> >  > queries in Django in cases where you have super-specialized 
> > (that 
> >  > uses views or non-standard functions) queries or need some 
> > specific 
> >  > optimization. If there's really no way to perform many of 
> your 
> >  > "day-to-day" queries with the ORM then that's an indication 
> > that a 
> >  > different database design may fit your data model better. I 
> >  > understand that you may have a unique situation, but I just 
> > wanted 
> >  > to throw that out there as I personally find the ORM to be a 
> > huge 
> >  > time saver. 
> >  > 
> >  > Now, with that out of the way... a couple of considerations: 
> > 1) 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Jani Tiainen

Hi,

I see that you had quite a bunch of m2m keys.

Have you tried recent version of Django (1.4+) with prefetch_related() [1] ?


[1] 




22.1.2013 13:57, Matt Andrews kirjoitti:

Hi Jani,

I made a StackOverflow post last year with an example of the ORM stuff I
tried and the poor queries it produced:
http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany

There's also this discussion about how using the same queryset in two
places in the template caused Django to request its data twice:
http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once
 (this
was easily fixed, but again, not very intuitive and frustrated me)

I don't have the data to hand currently but I also remember seeing weird
things happen where queries would end with stuff like "... LIMIT
234423445" (or some crazy number which I'd never entered and was orders
of magnitude bigger than the number of rows in the table).

I'm aware these are probably edge cases that are down to my own novice
status, but even using things like select_related(), it wasn't doing
what I wanted. I just felt it easier to use my existing SQL (I'm
converting a PHP app over to Django) and I'm not concerned about
database portability (switching to postgres or whatever).

Nik: just realised I missed your final question. For the SQL posted
above, the numbers are approximately: 12,000 rows in the `news` table,
maybe 10 `news_category` rows, about 100 `writers` and around 3000
`images`. All properly indexed and with sensible column types.

On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote:

Hi,

  From your raw SQL I saw you're doing few joins. So I suppose you do
quite a few foreign key fetches.

You didn't mention anything how you originally tried to solve case with
ORM. Could you please publish what you had when things were slow?

22.1.2013 12:26, Matt Andrews kirjoitti:
 > Hi Nik,
 >
 > Thanks - I do feel like by circumventing the ORM I've just "given
up"
 > and perhaps I'll reconsider -- none of my queries are particularly
 > "specialist" (as the sample above indicates) - I just found Django
 > generating odd things.
 >
 > To answer your questions:
 >
 > 1. Yes, reloading the page sees the same time for the queries (it
just
 > feels as though the entire process takes a second or two to
start, which
 > is perhaps not related to SQL itself).
 >
 > 2. I believe so, yes (it's shared hosting...).
 >
 > If you're curious, you can see a sample of the app at
 > http://beta.scenepointblank.com 
(obviously you won't see the SQL, but
 > the "delay" between pages, even though these pages are all cached
for
 > 2hrs+, is partly my concern here).
 >
 > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas
Stevenson-Molnar wrote:
 >
 > Hi Matt,
 >
 > Firstly, I encourage you to have another crack a the ORM. It can
 > certainly seem a bit aggravating at times if you're coming
from a
 > SQL mindset, but really pays off down the road in terms of
 > maintainability and readability. Typically you should only
need raw
 > queries in Django in cases where you have super-specialized
(that
 > uses views or non-standard functions) queries or need some
specific
 > optimization. If there's really no way to perform many of your
 > "day-to-day" queries with the ORM then that's an indication
that a
 > different database design may fit your data model better. I
 > understand that you may have a unique situation, but I just
wanted
 > to throw that out there as I personally find the ORM to be a
huge
 > time saver.
 >
 > Now, with that out of the way... a couple of considerations:
1) you
 > say it's a slow "startup"; if you refresh the page do the
queries
 > run just as slow the second time around? and 2) are your
Django app
 > and phpMyAdmin running on the same machine? If not, could
transit
 > time be an issue?
 >
 > Finally, can you give an idea about the size of the tables in
 > question? How many rows in each?
 >
 > _Nik
 >
 > On 1/21/2013 3:25 PM, Matt Andrews wrote:
 >> Hi all,
 >>
 >> Fairly new to Django. I ended up pulling out all of the
 >> ORM-generated queries and writing my own SQL directly (I got
fed
 >> up trying to work out how to achieve the kind of things I
needed
 >> without Django adding in extra joins or unintended WHERE
clauses
 >> etc). All my app's SQL uses cursor.execute() and the
 >> 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Jani,

I made a StackOverflow post last year with an example of the ORM stuff I 
tried and the poor queries it produced: 
http://stackoverflow.com/questions/5843457/django-objects-all-making-360-queries-how-can-i-optimise-this-manytomany

There's also this discussion about how using the same queryset in two 
places in the template caused Django to request its data twice: 
http://stackoverflow.com/questions/9447053/best-way-to-slice-a-django-queryset-without-hitting-the-database-more-than-once
 (this 
was easily fixed, but again, not very intuitive and frustrated me)

I don't have the data to hand currently but I also remember seeing weird 
things happen where queries would end with stuff like "... LIMIT 234423445" 
(or some crazy number which I'd never entered and was orders of magnitude 
bigger than the number of rows in the table).

I'm aware these are probably edge cases that are down to my own novice 
status, but even using things like select_related(), it wasn't doing what I 
wanted. I just felt it easier to use my existing SQL (I'm converting a PHP 
app over to Django) and I'm not concerned about database portability 
(switching to postgres or whatever). 

Nik: just realised I missed your final question. For the SQL posted above, 
the numbers are approximately: 12,000 rows in the `news` table, maybe 10 
`news_category` rows, about 100 `writers` and around 3000 `images`. All 
properly indexed and with sensible column types. 

On Tuesday, January 22, 2013 10:53:40 AM UTC, Jani Tiainen wrote:
>
> Hi, 
>
>  From your raw SQL I saw you're doing few joins. So I suppose you do 
> quite a few foreign key fetches. 
>
> You didn't mention anything how you originally tried to solve case with 
> ORM. Could you please publish what you had when things were slow? 
>
> 22.1.2013 12:26, Matt Andrews kirjoitti: 
> > Hi Nik, 
> > 
> > Thanks - I do feel like by circumventing the ORM I've just "given up" 
> > and perhaps I'll reconsider -- none of my queries are particularly 
> > "specialist" (as the sample above indicates) - I just found Django 
> > generating odd things. 
> > 
> > To answer your questions: 
> > 
> > 1. Yes, reloading the page sees the same time for the queries (it just 
> > feels as though the entire process takes a second or two to start, which 
> > is perhaps not related to SQL itself). 
> > 
> > 2. I believe so, yes (it's shared hosting...). 
> > 
> > If you're curious, you can see a sample of the app at 
> > http://beta.scenepointblank.com (obviously you won't see the SQL, but 
> > the "delay" between pages, even though these pages are all cached for 
> > 2hrs+, is partly my concern here). 
> > 
> > On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas Stevenson-Molnar 
> wrote: 
> > 
> > Hi Matt, 
> > 
> > Firstly, I encourage you to have another crack a the ORM. It can 
> > certainly seem a bit aggravating at times if you're coming from a 
> > SQL mindset, but really pays off down the road in terms of 
> > maintainability and readability. Typically you should only need raw 
> > queries in Django in cases where you have super-specialized (that 
> > uses views or non-standard functions) queries or need some specific 
> > optimization. If there's really no way to perform many of your 
> > "day-to-day" queries with the ORM then that's an indication that a 
> > different database design may fit your data model better. I 
> > understand that you may have a unique situation, but I just wanted 
> > to throw that out there as I personally find the ORM to be a huge 
> > time saver. 
> > 
> > Now, with that out of the way... a couple of considerations: 1) you 
> > say it's a slow "startup"; if you refresh the page do the queries 
> > run just as slow the second time around? and 2) are your Django app 
> > and phpMyAdmin running on the same machine? If not, could transit 
> > time be an issue? 
> > 
> > Finally, can you give an idea about the size of the tables in 
> > question? How many rows in each? 
> > 
> > _Nik 
> > 
> > On 1/21/2013 3:25 PM, Matt Andrews wrote: 
> >> Hi all, 
> >> 
> >> Fairly new to Django. I ended up pulling out all of the 
> >> ORM-generated queries and writing my own SQL directly (I got fed 
> >> up trying to work out how to achieve the kind of things I needed 
> >> without Django adding in extra joins or unintended WHERE clauses 
> >> etc). All my app's SQL uses cursor.execute() and the 
> >> dictfetchall() method as referenced here 
> >> <
> https://docs.djangoproject.com/en/dev/topics/db/sql/#django.db.models.Manager.raw>.
>  
>
> >> 
> >> 
> >> I've found that my app incurs a couple of seconds load time in 
> >> production, with CPU time at 2532ms and overall time 4684ms 
> >> (according to the debug toolbar). I'm seeing 8 SQL queries take 
> >> 380ms, and each one seems to be several times slower when made by 
> >> Django versus 

Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Jani Tiainen

Hi,

From your raw SQL I saw you're doing few joins. So I suppose you do 
quite a few foreign key fetches.


You didn't mention anything how you originally tried to solve case with 
ORM. Could you please publish what you had when things were slow?


22.1.2013 12:26, Matt Andrews kirjoitti:

Hi Nik,

Thanks - I do feel like by circumventing the ORM I've just "given up"
and perhaps I'll reconsider -- none of my queries are particularly
"specialist" (as the sample above indicates) - I just found Django
generating odd things.

To answer your questions:

1. Yes, reloading the page sees the same time for the queries (it just
feels as though the entire process takes a second or two to start, which
is perhaps not related to SQL itself).

2. I believe so, yes (it's shared hosting...).

If you're curious, you can see a sample of the app at
http://beta.scenepointblank.com (obviously you won't see the SQL, but
the "delay" between pages, even though these pages are all cached for
2hrs+, is partly my concern here).

On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas Stevenson-Molnar wrote:

Hi Matt,

Firstly, I encourage you to have another crack a the ORM. It can
certainly seem a bit aggravating at times if you're coming from a
SQL mindset, but really pays off down the road in terms of
maintainability and readability. Typically you should only need raw
queries in Django in cases where you have super-specialized (that
uses views or non-standard functions) queries or need some specific
optimization. If there's really no way to perform many of your
"day-to-day" queries with the ORM then that's an indication that a
different database design may fit your data model better. I
understand that you may have a unique situation, but I just wanted
to throw that out there as I personally find the ORM to be a huge
time saver.

Now, with that out of the way... a couple of considerations: 1) you
say it's a slow "startup"; if you refresh the page do the queries
run just as slow the second time around? and 2) are your Django app
and phpMyAdmin running on the same machine? If not, could transit
time be an issue?

Finally, can you give an idea about the size of the tables in
question? How many rows in each?

_Nik

On 1/21/2013 3:25 PM, Matt Andrews wrote:

Hi all,

Fairly new to Django. I ended up pulling out all of the
ORM-generated queries and writing my own SQL directly (I got fed
up trying to work out how to achieve the kind of things I needed
without Django adding in extra joins or unintended WHERE clauses
etc). All my app's SQL uses cursor.execute() and the
dictfetchall() method as referenced here

.


I've found that my app incurs a couple of seconds load time in
production, with CPU time at 2532ms and overall time 4684ms
(according to the debug toolbar). I'm seeing 8 SQL queries take
380ms, and each one seems to be several times slower when made by
Django versus hitting the database through phpMyAdmin or
something: eg, this query:

SELECT * FROM news
JOIN news_categories ON news.news_category_id =
news_categories.id 
LEFT JOIN writers ON news.writer_id = writers.id

LEFT JOIN images ON news.image_id = images.id 
ORDER BY news.is_sticky DESC, news.date_posted DESC
LIMIT 10


This takes 14.8ms when run in phpMyAdmin (against the production
database) but Django reports it as 85.2ms. The same ratios are
true for all my other queries.

All I can think of is the note on the dictfetchall() method in the
Django docs which describes a "small performance hit". Is this it?!

I've profiled the app too, although I'm a bit hazy about what it
all means. Here's a dump of the result:
http://pastebin.com/raw.php?i=UHE9edVC
 (this is from running on
my local server rather than production but performance is broadly
similar).

Can anyone help me? I realise I've perhaps gone off-piste by
writing raw SQL but I feel it was justified.

thanks,
Matt



--
Jani Tiainen

- Well planned is half done and a half done has been sufficient before...

--
You received this message because you are subscribed to the Google Groups "Django 
users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-22 Thread Matt Andrews
Hi Nik,

Thanks - I do feel like by circumventing the ORM I've just "given up" and 
perhaps I'll reconsider -- none of my queries are particularly "specialist" 
(as the sample above indicates) - I just found Django generating odd things.

To answer your questions:

1. Yes, reloading the page sees the same time for the queries (it just 
feels as though the entire process takes a second or two to start, which is 
perhaps not related to SQL itself).

2. I believe so, yes (it's shared hosting...).

If you're curious, you can see a sample of the app at 
http://beta.scenepointblank.com (obviously you won't see the SQL, but the 
"delay" between pages, even though these pages are all cached for 2hrs+, is 
partly my concern here).

On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas Stevenson-Molnar wrote:
>
>  Hi Matt,
>
> Firstly, I encourage you to have another crack a the ORM. It can certainly 
> seem a bit aggravating at times if you're coming from a SQL mindset, but 
> really pays off down the road in terms of maintainability and readability. 
> Typically you should only need raw queries in Django in cases where you 
> have super-specialized (that uses views or non-standard functions) queries 
> or need some specific optimization. If there's really no way to perform 
> many of your "day-to-day" queries with the ORM then that's an indication 
> that a different database design may fit your data model better. I 
> understand that you may have a unique situation, but I just wanted to throw 
> that out there as I personally find the ORM to be a huge time saver.
>
> Now, with that out of the way... a couple of considerations: 1) you say 
> it's a slow "startup"; if you refresh the page do the queries run just as 
> slow the second time around? and 2) are your Django app and phpMyAdmin 
> running on the same machine? If not, could transit time be an issue?
>
> Finally, can you give an idea about the size of the tables in question? 
> How many rows in each?
>
> _Nik
>
> On 1/21/2013 3:25 PM, Matt Andrews wrote:
>  
> Hi all, 
>
>  Fairly new to Django. I ended up pulling out all of the ORM-generated 
> queries and writing my own SQL directly (I got fed up trying to work out 
> how to achieve the kind of things I needed without Django adding in extra 
> joins or unintended WHERE clauses etc). All my app's SQL uses 
> cursor.execute() and the dictfetchall() method as referenced 
> here
> . 
>
>  I've found that my app incurs a couple of seconds load time in 
> production, with CPU time at 2532ms and overall time 4684ms (according to 
> the debug toolbar). I'm seeing 8 SQL queries take 380ms, and each one seems 
> to be several times slower when made by Django versus hitting the database 
> through phpMyAdmin or something: eg, this query:
>
>  SELECT * FROM news 
> JOIN news_categories ON news.news_category_id = news_categories.id 
> LEFT JOIN writers ON news.writer_id = writers.id 
> LEFT JOIN images ON news.image_id = images.id 
> ORDER BY news.is_sticky DESC, news.date_posted DESC 
> LIMIT 10
>
>
>  This takes 14.8ms when run in phpMyAdmin (against the production 
> database) but Django reports it as 85.2ms. The same ratios are true for all 
> my other queries.
>
>  All I can think of is the note on the dictfetchall() method in the 
> Django docs which describes a "small performance hit". Is this it?!
>
>  I've profiled the app too, although I'm a bit hazy about what it all 
> means. Here's a dump of the result: 
> http://pastebin.com/raw.php?i=UHE9edVC(this is from running on my local 
> server rather than production but 
> performance is broadly similar).
>
>  Can anyone help me? I realise I've perhaps gone off-piste by writing raw 
> SQL but I feel it was justified.
>
>  thanks,
> Matt
>
>  
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To view this discussion on the web visit 
> https://groups.google.com/d/msg/django-users/-/Qiley9RqfngJ.
> To post to this group, send email to django...@googlegroups.com
> .
> To unsubscribe from this group, send email to 
> django-users...@googlegroups.com .
> For more options, visit this group at 
> http://groups.google.com/group/django-users?hl=en.
>
>
>  

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/mRrUOi_UcswJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-21 Thread Nikolas Stevenson-Molnar
Hi Matt,

Firstly, I encourage you to have another crack a the ORM. It can
certainly seem a bit aggravating at times if you're coming from a SQL
mindset, but really pays off down the road in terms of maintainability
and readability. Typically you should only need raw queries in Django in
cases where you have super-specialized (that uses views or non-standard
functions) queries or need some specific optimization. If there's really
no way to perform many of your "day-to-day" queries with the ORM then
that's an indication that a different database design may fit your data
model better. I understand that you may have a unique situation, but I
just wanted to throw that out there as I personally find the ORM to be a
huge time saver.

Now, with that out of the way... a couple of considerations: 1) you say
it's a slow "startup"; if you refresh the page do the queries run just
as slow the second time around? and 2) are your Django app and
phpMyAdmin running on the same machine? If not, could transit time be an
issue?

Finally, can you give an idea about the size of the tables in question?
How many rows in each?

_Nik

On 1/21/2013 3:25 PM, Matt Andrews wrote:
> Hi all,
>
> Fairly new to Django. I ended up pulling out all of the ORM-generated
> queries and writing my own SQL directly (I got fed up trying to work
> out how to achieve the kind of things I needed without Django adding
> in extra joins or unintended WHERE clauses etc). All my app's SQL uses
> cursor.execute() and the dictfetchall() method as referenced here
> .
>  
>
> I've found that my app incurs a couple of seconds load time in
> production, with CPU time at 2532ms and overall time 4684ms (according
> to the debug toolbar). I'm seeing 8 SQL queries take 380ms, and each
> one seems to be several times slower when made by Django versus
> hitting the database through phpMyAdmin or something: eg, this query:
>
> SELECT * FROM news 
> JOIN news_categories ON news.news_category_id = news_categories.id 
> LEFT JOIN writers ON news.writer_id = writers.id 
> LEFT JOIN images ON news.image_id = images.id 
> ORDER BY news.is_sticky DESC, news.date_posted DESC 
> LIMIT 10
>
>
> This takes 14.8ms when run in phpMyAdmin (against the production
> database) but Django reports it as 85.2ms. The same ratios are true
> for all my other queries.
>
> All I can think of is the note on the dictfetchall() method in the
> Django docs which describes a "small performance hit". Is this it?!
>
> I've profiled the app too, although I'm a bit hazy about what it all
> means. Here's a dump of the
> result: http://pastebin.com/raw.php?i=UHE9edVC (this is from running
> on my local server rather than production but performance is broadly
> similar).
>
> Can anyone help me? I realise I've perhaps gone off-piste by writing
> raw SQL but I feel it was justified.
>
> thanks,
> Matt
>
>
> -- 
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/django-users/-/Qiley9RqfngJ.
> To post to this group, send email to django-users@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Struggling with slow startup / SQL in new app (using raw SQL)

2013-01-21 Thread Matt Andrews
Hi all,

Fairly new to Django. I ended up pulling out all of the ORM-generated 
queries and writing my own SQL directly (I got fed up trying to work out 
how to achieve the kind of things I needed without Django adding in extra 
joins or unintended WHERE clauses etc). All my app's SQL uses 
cursor.execute() and the dictfetchall() method as referenced 
here
. 

I've found that my app incurs a couple of seconds load time in production, 
with CPU time at 2532ms and overall time 4684ms (according to the debug 
toolbar). I'm seeing 8 SQL queries take 380ms, and each one seems to be 
several times slower when made by Django versus hitting the database 
through phpMyAdmin or something: eg, this query:

SELECT * FROM news 
JOIN news_categories ON news.news_category_id = news_categories.id 
LEFT JOIN writers ON news.writer_id = writers.id 
LEFT JOIN images ON news.image_id = images.id 
ORDER BY news.is_sticky DESC, news.date_posted DESC 
LIMIT 10


This takes 14.8ms when run in phpMyAdmin (against the production database) 
but Django reports it as 85.2ms. The same ratios are true for all my other 
queries.

All I can think of is the note on the dictfetchall() method in the Django 
docs which describes a "small performance hit". Is this it?!

I've profiled the app too, although I'm a bit hazy about what it all means. 
Here's a dump of the result: http://pastebin.com/raw.php?i=UHE9edVC (this 
is from running on my local server rather than production but performance 
is broadly similar).

Can anyone help me? I realise I've perhaps gone off-piste by writing raw 
SQL but I feel it was justified.

thanks,
Matt


-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/Qiley9RqfngJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.