Re: [PERFORM] Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Graeme B. Bell
On 07 Jul 2015, at 22:52, Merlin Moncure mmonc...@gmail.com wrote:

 On Tue, Jul 7, 2015 at 3:33 PM, Graeme B. Bell graeme.b...@nibio.no wrote:
 
 Hi Merlin,
 
 Long story short - thanks for the reply, but you're not measuring anything 
 about the parallelism of code running in a pl/pgsql environment here. You're 
 just measuring whether postgres can parallelise entering that environment 
 and getting back out. Don't get me wrong - it's great that this scales well, 
 because it affects situations where you have lots of calls to trivial 
 functions.
 However, it's not the problem I'm talking about. I mean 'real' pl/pgsql 
 functions, e.g. things that you might find in postgis or similar.
 
 Maybe so.  But it will be a lot easier for me (and others on this)
 list if you submit a self contained test case that runs via pgbench.


Hi Merlin, 

I'm guessing you are maybe pressed for time at the moment, because I already 
clearly included this in my last email, along with the links to the 
alternative benchmarks showing the same problem, which I referred to in both 
of my last emails and which are also trivial to drop into pgbench (cut/paste). 

e.g. did you see these parts of my previous email? 

To clear up the issue I built a little test harness around your comment below.
http://github.com/gbb/t

Just pick any function you like, there are 6 there, and 3 of them demonstrate 2 
different problems, all of it is clearly documented. 
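For readers who want the flavour without visiting the repo, the functions under discussion are of roughly this shape - a hypothetical sketch, not the actual functions in the repository - a CPU-bound pl/pgsql function plus a one-line pgbench driver:

```sql
-- Hypothetical sketch only (the real test functions live in the repo above):
-- a pl/pgsql function that does pure arithmetic, no I/O.
CREATE OR REPLACE FUNCTION burn_cpu(n integer) RETURNS bigint AS $$
DECLARE
    acc bigint := 0;
    i   integer;
BEGIN
    FOR i IN 1..n LOOP
        acc := acc + (i % 7);   -- trivial looping arithmetic
    END LOOP;
    RETURN acc;
END;
$$ LANGUAGE plpgsql;

-- burn.sql (driver script), run with e.g.:
--   pgbench -n -f burn.sql -c 8 -j 8 -T 30
-- Contents of burn.sql:
--   SELECT burn_cpu(1000000);
```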

I haven't used perf with pgbench before, and I can't run any code today. 
If you're interested in this but short on time, maybe you can glance at the 
repo above and just add 'perf' at the appropriate point in the rbuild wrapper.

Graeme. 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Graeme B. Bell
 On 07/07/2015 08:05 PM, Craig James wrote:
 
 
 No ideas, but I ran into the same thing. I have a set of C/C++ functions
 that put some chemistry calculations into Postgres as extensions (things
 like, calculate the molecular weight of this molecule). As SQL
 functions, the whole thing bogged down, and we never got the scalability
 we needed. On our 8-CPU setup, we couldn't get more than 2 CPUs busy at
 the same time, even with dozens of clients.


Hi all,

The sample code / results were put up last night at http://github.com/gbb/t

Craig's problem sounds similar to my own, assuming he means running C 
indirectly via SQL vs running C more directly:
lots of parallel connections to postgres but a maximum of 2 CPUs of scaling 
(and it gets worse as you try to run more things).

Tom Lane has posted an interesting comment over on the bugs list which 
identifies a likely source of at least one of the problems, maybe both. 
It seems to be linked to internal locking inside postgres (which makes sense, 
given the results - both problems feel 'lock-y').
Also, he mentions a workaround for some functions that apparently scales 
8-way. 

http://www.postgresql.org/message-id/31265.1436317...@sss.pgh.pa.us

I think it's potentially a big problem for CPU-intensive postgres libraries 
like pgrouting, or perhaps the postgis & postgis raster functions, things like 
that.
I don't know how well their functions are marked for e.g. immutability. 
Are there any postgis devs on this list?
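For what it's worth, the markings are easy to inspect from the catalogs. A sketch (the `st_` name pattern for PostGIS functions is an assumption based on their naming convention):

```sql
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile (the default).
-- The backslash escapes the underscore so LIKE treats it literally.
SELECT proname, provolatile
FROM pg_proc
WHERE proname LIKE 'st\_%'
ORDER BY provolatile, proname;
```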

Graeme Bell





Re: [PERFORM] Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Andres Freund
On 2015-07-08 11:13:04 +, Graeme B. Bell wrote:
 I'm guessing you are maybe pressed for time at the moment because I
 already clearly included this on the last email, as well as the links
 to the alternative benchmarks with the same problem I referred to on
 both of my last emails which are also trivial to drop into pgbench
 (cut/paste).

You realize that you want something here, not Merlin, right?

 e.g. did you see these parts of my previous email 
 
 To clear up the issue I build a little test harness around your comment 
 below.
 http://github.com/gbb/t

Well, that requires reviewing the source code of the run script and
such.



I think we shouldn't discuss this on two threads (-performance, -bugs),
that makes it hard to follow. Given Tom's more detailed answer I think
the -bugs thread already contains more pertinent information.




Re: [PERFORM] wildcard text filter switched to boolean column, performance is way worse

2015-07-08 Thread Marc Mamin
Hello,
 
 
 
 
 
 From: pgsql-performance-ow...@postgresql.org 
 [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Mike Broers
 Sent: Dienstag, 7. Juli 2015 18:28
 To: Tom Lane
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] wildcard text filter switched to boolean column, 
 performance is way worse
 
 After bumping up work_mem from 12MB to 25MB that last materialize is indeed 
 hashing and this cut the query time by about 60%.  Thanks, this was very 
 helpful and gives me something else to look for when troubleshooting 
 explains.  
 
 
 
 On Tue, Jul 7, 2015 at 11:10 AM, Mike Broers mbro...@gmail.com wrote:
 Thanks, very informative! I'll experiment with work_mem settings and report 
 back.
 
 On Tue, Jul 7, 2015 at 11:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Mike Broers mbro...@gmail.com writes:
  I had a query that was filtering with a wildcard search of a text field for
  %SUCCESS%. The query took about 5 seconds and was running often so I wanted
  to improve it.  I suggested that the engineers include a new boolean column
  for successful status.  They implemented the requested field, but the query
  that filters on that new column runs very long (i kill it after letting it
  run for about an hour).  Can someone help me understand why that is the
  case and how to resolve it?
 
 It's hashing the subplan output in the first case and not the second:
 
  Seq Scan on lead  (cost=130951.81..158059.21 rows=139957 width=369) (actual
  time=4699.619..4699.869 rows=1 loops=1)
  Filter: ((NOT (hashed SubPlan 1)) AND ((ReferenceNumber)::text <>
  ''::text) AND ((NOT (hashed SubPlan 2)) OR (NOT (hashed SubPlan 3))))

 vs
 
Seq Scan on lead  (cost=85775.78..9005687281.12 rows=139957 width=369)
  Filter: ((NOT (hashed SubPlan 1)) AND ((ReferenceNumber)::text <>
  ''::text) AND ((NOT (hashed SubPlan 2)) OR (NOT (SubPlan 3))))
                                                      ^
 
 Presumably, the new more-accurate rows count causes the planner to realize
 that the hash table will exceed work_mem so it doesn't choose to hash ...
 but for your situation, you'd rather it did, because what you're getting
 instead is a Materialize node that spills to disk (again, because the data
 involved exceeds work_mem) and that's a killer for this query.  You should
 be able to get back the old behavior if you raise work_mem enough.
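Concretely, that can be tried per-session before rerunning the query (the value here is illustrative; tune to taste):

```sql
-- Per-session override; use SET LOCAL inside a transaction to scope it
-- to a single transaction instead.
SET work_mem = '64MB';
-- ... run the problem query, check EXPLAIN ANALYZE for "hashed SubPlan" ...
RESET work_mem;
```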
 
 Another idea you might think about is changing the OR'd IN conditions
 to a single IN over a UNION ALL of the subselects.  I'm not really sure if
 that would produce a better plan, but it's worth trying if it wouldn't
 require too much app-side contortion.


Hello,
you might try to use a CTE to first collect the IDs to exclude, and join them 
to your main table.
This should result in an anti join pattern.

Something like:

WITH IDS as (
    SELECT U1.lead_id AS lead_id
    FROM event U1
    WHERE U1.event_type = 'type_1'
    UNION (
        SELECT U1.lead_id AS lead_id
        FROM event U1
        WHERE U1.event_type = 'type_2'
        INTERSECT
        SELECT U1.lead_id AS lead_id
        FROM event U1
        WHERE successful
    )
)
SELECT * FROM lead LEFT OUTER JOIN IDS ON (lead.id = IDS.lead_id)
WHERE IDS.lead_id IS NULL;

regards,

Marc Mamin






 regards, tom lane
 
 




Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Merlin Moncure
On Wed, Jul 8, 2015 at 12:48 PM, Craig James cja...@emolecules.com wrote:
 On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake j...@commandprompt.com
 wrote:


 On 07/07/2015 08:05 PM, Craig James wrote:



 No ideas, but I ran into the same thing. I have a set of C/C++ functions
 that put some chemistry calculations into Postgres as extensions (things
 like, calculate the molecular weight of this molecule). As SQL
 functions, the whole thing bogged down, and we never got the scalability
 we needed. On our 8-CPU setup, we couldn't get more than 2 CPUs busy at
 the same time, even with dozens of clients.

 When I moved these same functions into an Apache fast-CGI HTTP service
 (exact same code, same network overhead), I could easily scale up and
 use the full 100% of all eight CPUs.

 I have no idea why, and never investigated further. The convenience of
 having the functions in SQL wasn't that important.


 I admit that I haven't read this whole thread but:

 Using Apache Fast-CGI, you are going to fork a process for each instance
 of the function being executed and that in turn will use all CPUs up to the
 max available resource.

 With PostgreSQL, that isn't going to happen unless you are running (at
 least) 8 functions across 8 connections.


 Well, right, which is why I mentioned even with dozens of clients.
 Shouldn't that scale to at least all of the CPUs in use if the function is
 CPU intensive (which it is)?

only in the absence of inter-process locking and cache line bouncing.

merlin




Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake j...@commandprompt.com
wrote:


 On 07/07/2015 08:05 PM, Craig James wrote:



 No ideas, but I ran into the same thing. I have a set of C/C++ functions
 that put some chemistry calculations into Postgres as extensions (things
 like, calculate the molecular weight of this molecule). As SQL
 functions, the whole thing bogged down, and we never got the scalability
 we needed. On our 8-CPU setup, we couldn't get more than 2 CPUs busy at
 the same time, even with dozens of clients.

 When I moved these same functions into an Apache fast-CGI HTTP service
 (exact same code, same network overhead), I could easily scale up and
 use the full 100% of all eight CPUs.

 I have no idea why, and never investigated further. The convenience of
 having the functions in SQL wasn't that important.


 I admit that I haven't read this whole thread but:

 Using Apache Fast-CGI, you are going to fork a process for each instance
 of the function being executed and that in turn will use all CPUs up to the
 max available resource.

 With PostgreSQL, that isn't going to happen unless you are running (at
 least) 8 functions across 8 connections.


Well, right, which is why I mentioned "even with dozens of clients".
Shouldn't that scale to at least all of the CPUs in use if the function is
CPU intensive (which it is)?

Craig




 JD

 --
 Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
 PostgreSQL Centered full stack support, consulting and development.
 Announcing I'm offended is basically telling the world you can't
 control your own emotions, so everyone else should do it for you.




-- 
-
Craig A. James
Chief Technology Officer
eMolecules, Inc.
-


Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Joshua D. Drake


On 07/08/2015 10:48 AM, Craig James wrote:


I admit that I haven't read this whole thread but:

Using Apache Fast-CGI, you are going to fork a process for each
instance of the function being executed and that in turn will use
all CPUs up to the max available resource.

With PostgreSQL, that isn't going to happen unless you are running
(at least) 8 functions across 8 connections.


Well, right, which is why I mentioned even with dozens of clients.
Shouldn't that scale to at least all of the CPUs in use if the function
is CPU intensive (which it is)?


In theory, yes - but it isn't PostgreSQL that does that; it's the kernel 
scheduler. Although (and I am grasping at straws):


I wonder if the execution is taking place outside of the backend proper 
or... are you using a pooler?


JD







--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Wed, Jul 8, 2015 at 10:52 AM, Joshua D. Drake j...@commandprompt.com
wrote:


 On 07/08/2015 10:48 AM, Craig James wrote:

  I admit that I haven't read this whole thread but:

 Using Apache Fast-CGI, you are going to fork a process for each
 instance of the function being executed and that in turn will use
 all CPUs up to the max available resource.

 With PostgreSQL, that isn't going to happen unless you are running
 (at least) 8 functions across 8 connections.


 Well, right, which is why I mentioned even with dozens of clients.
 Shouldn't that scale to at least all of the CPUs in use if the function
 is CPU intensive (which it is)?


 In theory but that isn't PostgreSQL that does that, it will be the kernel
 scheduler. Although (and I am grasping at straws):

 I wonder if the execution is taking place outside of the backend proper
 or... are you using a pooler?


No pooler, and the functions were in an ordinary SQL extension .so library
and loaded as

   CREATE OR REPLACE FUNCTION funcname( ... ) returns ...
   AS 'libxxx.so', 'funcname' LANGUAGE c STRICT IMMUTABLE COST 1;

Craig






-- 
-
Craig A. James
Chief Technology Officer
eMolecules, Inc.
-


Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Andres Freund
On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
 On Wed, Jul 8, 2015 at 12:48 PM, Craig James cja...@emolecules.com wrote:
  On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake j...@commandprompt.com
  Using Apache Fast-CGI, you are going to fork a process for each instance
  of the function being executed and that in turn will use all CPUs up to the
  max available resource.
 
  With PostgreSQL, that isn't going to happen unless you are running (at
  least) 8 functions across 8 connections.
 
 
  Well, right, which is why I mentioned even with dozens of clients.
  Shouldn't that scale to at least all of the CPUs in use if the function is
  CPU intensive (which it is)?
 
 only in the absence of inter-process locking and cache line bouncing.

And additionally memory bandwidth (shared between everything, even in
the numa case), cross socket/bus bandwidth (absolutely performance
critical in multi-socket configurations), cache capacity (shared between
cores, and sometimes even sockets!).




Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes:
 On 2015-07-08 15:38:24 -0700, Craig James wrote:
 From my admittedly naive point of view, it's hard to see why any of this
 matters. I have functions that do purely CPU-intensive mathematical
 calculations ... you could imagine something like is_prime(N) that
 determines if N is a prime number. I have eight clients that connect to
 eight backends. Each client issues an SQL command like, select
 is_prime(N) where N is a simple number.

 I mostly replied to Merlin's general point (additionally in the context of
 plpgsql).

 But I have a hard time seeing that postgres would be the bottleneck for an
 is_prime() function (or something with similar characteristics) that's
 written in C where the average runtime is more than, say, a couple
 thousand cycles.  I'd like to see a profile of that.

But that was not the case that Graeme was complaining about.  He's talking
about simple-arithmetic-and-looping written in plpgsql, in a volatile
function that is going to take a new snapshot for every statement, even if
that's only n := n+1.  So it's going to spend a substantial fraction of
its runtime banging on the ProcArray, and that doesn't scale.  If you
write your is_prime function purely in plpgsql, and don't bother to mark
it nonvolatile, *it will not scale*.  It'll be slow even in single-thread
terms, but it'll be particularly bad if you're saturating a multicore
machine with it.
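To make Tom's point concrete, a sketch of such a function with the marking applied (my own illustrative version of is_prime, not anyone's benchmark code):

```sql
-- A pure function like this should be declared IMMUTABLE. Left at the
-- default (VOLATILE), every statement in its body takes a fresh snapshot,
-- which is exactly the ProcArray contention described above.
CREATE OR REPLACE FUNCTION is_prime(n bigint) RETURNS boolean AS $$
DECLARE
    i bigint := 2;
BEGIN
    IF n < 2 THEN RETURN false; END IF;
    WHILE i * i <= n LOOP
        IF n % i = 0 THEN RETURN false; END IF;
        i := i + 1;
    END LOOP;
    RETURN true;
END;
$$ LANGUAGE plpgsql IMMUTABLE;   -- the crucial marking
```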

One of my Salesforce colleagues has been looking into ways that we could
decide to skip the per-statement snapshot acquisition even in volatile
functions, if we could be sure that a particular statement isn't going to
do anything that would need a snapshot.  Now, IMO that doesn't really do
much for properly written plpgsql; but there's an awful lot of bad plpgsql
code out there, and it can make a huge difference for that.

regards, tom lane




Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Wed, Jul 8, 2015 at 1:27 PM, Andres Freund and...@anarazel.de wrote:

 On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
  On Wed, Jul 8, 2015 at 12:48 PM, Craig James cja...@emolecules.com
 wrote:
   On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake j...@commandprompt.com
 
   Using Apache Fast-CGI, you are going to fork a process for each
 instance
   of the function being executed and that in turn will use all CPUs up
 to the
   max available resource.
  
   With PostgreSQL, that isn't going to happen unless you are running (at
   least) 8 functions across 8 connections.
  
  
   Well, right, which is why I mentioned even with dozens of clients.
   Shouldn't that scale to at least all of the CPUs in use if the
 function is
   CPU intensive (which it is)?
 
  only in the absence of inter-process locking and cache line bouncing.

 And additionally memory bandwidth (shared between everything, even in
 the numa case), cross socket/bus bandwidth (absolutely performance
 critical in multi-socket configurations), cache capacity (shared between
 cores, and sometimes even sockets!).


From my admittedly naive point of view, it's hard to see why any of this
matters. I have functions that do purely CPU-intensive mathematical
calculations ... you could imagine something like is_prime(N) that
determines if N is a prime number. I have eight clients that connect to
eight backends. Each client issues an SQL command like, "select
is_prime(N)" where N is a simple number.

Are you saying that in order to calculate is_prime(N), all of that stuff
(inter-process locking, memory bandwidth, bus bandwidth, cache capacity,
etc.) is even relevant? And if so, how is it that Postgres is so different
from an Apache fast-CGI program that runs the exact same is_prime(N)
calculation?

Just curious ... as I said, I've already implemented a different solution.

Craig


Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Andres Freund
On 2015-07-08 15:38:24 -0700, Craig James wrote:
 From my admittedly naive point of view, it's hard to see why any of this
 matters. I have functions that do purely CPU-intensive mathematical
 calculations ... you could imagine something like is_prime(N) that
 determines if N is a prime number. I have eight clients that connect to
 eight backends. Each client issues an SQL command like, select
 is_prime(N) where N is a simple number.

I mostly replied to Merlin's general point (additionally in the context of
plpgsql).

But I have a hard time seeing that postgres would be the bottleneck for an
is_prime() function (or something with similar characteristics) that's
written in C where the average runtime is more than, say, a couple
thousand cycles.  I'd like to see a profile of that.

Andres

