On 01/05/2014 11:12 AM, William Becker wrote:
I just tried changing my code base from using hard coded queries to prepared 
statements. As part of this I pass a parameter as the argument to the 
ST_GeographyFromText() function. You can see the code here (1).

After doing this, it took ~20 seconds to run, when previously it took ~2 
seconds. I thought this might be because it wasn't using an index anymore, but 
I checked this out - when it was previously taking 2 seconds, it also ran 
without using the index. For some reason it's using it on my dev machine, but 
this has a smaller dataset - I'll try tuning this but it does not seem to be 
the cause of the issue.

I "improved" this by templatising the query, so that a new query is always 
generated with a different hard coded geography, and this speeds things up - code is here 
(2). However, it now needs to create a separate query for each execution, thus it also 
needs to generate a new plan every time, which is not ideal.

I have attached query plans for the slow run (3) and the sped up run (4). (I 
apologise that the queries aren't exactly the same in each case, but the 
timings are pretty much the same regardless of what the other parameters are).

You can also see the create statement for the database here: (5)

I'm not entirely sure if the problem lies with postgis or postgres in general, 
but I thought I'd present it to you nice people first, since you are probably 
in a better position to decide! This is on postgis 2.0 with postgres 9.2. I 
just upgraded to 9.3 and 2.1 but I need to re-process all my data (which will 
take overnight) before I can do anything.

My guess is that it is evalutating the parameter each time it applies the where 
in the query and there is something slow about this. I'd poke around the 
relavant code if someone could give me some pointers about where to look!

I've seem some other threads that seem to be related:
http://postgresql.1045698.n5.nabble.com/avoid-prepared-statements-on-complex-queries-td4996363.html
http://grokbase.com/t/postgresql/pgsql-performance/127rp189dw/geoserver-postgis-performance-problems

but it seems that I have isolated it just down to that scan taking a long time 
with the prepared statement, but quick without it?

Cheers,
Will

(1) 
https://github.com/twistedvisions/anaximander/blob/c0a7c89e438bcf2504ed8ae3d5f75828e11c2ec9/db_templates/get_event_attendees_in_area.sql

(2) 
https://github.com/twistedvisions/anaximander/blob/12e65537f1d3887677b592d1f0ac0203f2da7d39/db_templates/get_event_attendees_in_area.sql

(3) http://explain.depesz.com/s/kb1A

(4) http://explain.depesz.com/s/G9Wj

(5) 
https://github.com/twistedvisions/anaximander/blob/f6ab21ebff4cf3541bfb53715addb4b3181d5d2e/db/create.sql



It looks like a bunch of row estimates are off.  Have you run analyze on the db?

a new query is always generated with a different hard coded geography.  
However, it now needs to create a separate query for each execution, thus it 
also needs to generate a new plan every time, which is not ideal.

Actually, creating a plan can use information about the data to make a better plan.  That's why using a 
$1 parameter may not give optimal plans.  When you say "where id < 5", the planer knows the 
distribution of data, so it knows about the probability's of 5.  But when you say "where id < 
$1", it has to just use a generic plan.

As long as you dont have your stats target too high, plan time is very fast, 
and compared to a bad plan, could easily be worth paying the price to re-plan 
on every execute.

Any reason you are using HASH indexes?  They usually aren't better.  Have you 
tried it with btree?

Hum?  As a simple test, does this use an index:

select count(*) from place
where 
st_covers('0103000020E610000001000000050000000000D33CB60D9C3FB0941EB661BD49400000D33CB60D9C3FFF56F03723BA494000002DC349F290BFFF56F03723BA494000002DC349F290BFB0941EB661BD49400000D33CB60D9C3FB0941EB661BD4940'::geography,
 location)

After you analyze does it?

Does this:

prepare x as select count(*) from place where st_covers($1::geography, 
location);
explain analyze execute x('01030000....');
deallocate x;


-Andy

_______________________________________________
postgis-users mailing list
[email protected]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

Reply via email to