This is a reply to Andres's post on the #13495 documentation thread in 
-bugs. 
I am responding to it here because it relates only to #13493.

Andres wrote, re: #13493

>> This issue is absolutely critical for performance and scalability of code,

> Pft. In most cases it doesn't actually matter that much because the
> contained query are the expensive stuff. It's just when you do lots of
> very short and cheap things that it has such a big effect.  Usually the
> effect on the planner is bigger.

Hi Andres,

'Pft' is kinda rude - I wouldn't comment on it normally, but seeing as you 
just lectured me on -performance on something you perceived as impolite (just 
like you lectured me on not spreading things onto multiple threads), can you 
please try to set a good example? You don't encourage new contributors into 
open source communities this way. 

Getting to the point. I think the gap between our viewpoints comes from the 
fact that I (and others at my institute) have a bunch of pl/pgsql code 
with for loops and calculations, which we see as 'code'. Thinking of all the 
users I know myself, I know there are plenty of GIS people out there using for 
loops and pgsql to simulate models on data in the DB, and I expect the same is 
true among e.g. older scientists with DB datasets. 

Whereas it sounds like you and Tom see pl/pgsql as 'glue' and don't see any 
problem. As I have never seen statistics on pl/pgsql use-cases among users at 
large, I don't know what happens everywhere else outside of GIS-world and 
pgdev-world. Have you any references/data you can share on that? I would be 
interested to know because I don't want to overclaim on the importance of these 
bugs or any other bugs in future. In this case, #13493 wrecked the code for 
estimates on a 20 million euro national roadbuilding project here and it cost 
me a few weeks of my life, but for all I know you're totally right about the 
general importance to the world at large.

Though keep in mind: this isn't only about scaling up one program. It's a 
db-level problem. If you have a large GIS DB server with many users, 
long-running queries etc. on large amounts of data, then you only need e.g. 2-3 
people to be running some code with for-loops or a long series of calculations 
in pl/pgsql, and everything will fall apart in pgsql-land. 

Last point. When I wrote 'absolutely critical' I was under the impression this 
bug could have some serious impact on postgis/pgrouting. Since I wanted to 
double check what you said about 'expensive stuff' vs 'short/cheap stuff', I 
ran some benchmarks to check on a few functions. 

You are right that only short, looped things are affected, e.g. for loops with 
calculations and so on. I didn't see any trouble with the calls I made to postgis 
inside or outside of pgsql. This confirms/replicates your findings. Updated 
numbers/tests will be posted to github shortly.

Regards

Graeme Bell

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
