On 12/5/14, 9:08 AM, José Luis Tallón wrote:
Moreover, when load goes up, the relative cost of working in parallel should go
up as well.
Something like:
p = number of cores
l = 1min-load
additional_cost = tuple_estimate * cpu_tuple_cost * (l+1)/(p-1)
(for p > 1, of course)
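As a sketch of what that formula would compute (the function and variable names below are illustrative, not from the PostgreSQL source), the load-scaled penalty could look like this:

```python
# Sketch of the proposed load-scaled parallelism penalty.
# tuple_estimate, cpu_tuple_cost, load_1min, and cores map to the
# tuple estimate, cpu_tuple_cost, l, and p in the formula above.

def additional_cost(tuple_estimate, cpu_tuple_cost, load_1min, cores):
    """Extra cost charged to a parallel plan; grows with 1-min load."""
    if cores <= 1:
        raise ValueError("formula is only defined for more than one core")
    return tuple_estimate * cpu_tuple_cost * (load_1min + 1) / (cores - 1)

# Example: 1M tuples, default cpu_tuple_cost of 0.01, load 3.0, 4 cores.
print(additional_cost(1_000_000, 0.01, 3.0, 4))
```

Note the behavior this buys: on an idle machine (l near 0) the penalty is small, and it scales up linearly as the box gets busy, discouraging the planner from adding workers under load.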
...
The parallel seq scan nodes are definitely the best approach for "parallel
query", since the planner can optimize them based on cost.
I'm wondering about the ability to swap the implementation of some nodes at
execution time: given a previously planned query (I'm specifically thinking
about prepared statements here), chances are that at execution time a
different implementation of the same "node" might be more suitable, and could
be used instead while the condition holds.
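One way to picture that idea (a purely hypothetical API, not anything in PostgreSQL): a node carries several interchangeable implementations and picks one per execution based on runtime state:

```python
# Hypothetical sketch: a plan node holding interchangeable
# implementations, chosen afresh at each execution (e.g. per
# prepared-statement execution). AdaptiveNode is an invented name.

class AdaptiveNode:
    def __init__(self, implementations):
        # implementations: list of (predicate, execute_fn) pairs;
        # the first predicate that holds for the runtime state wins.
        self.implementations = implementations

    def execute(self, runtime_state):
        for condition, impl in self.implementations:
            if condition(runtime_state):
                return impl(runtime_state)
        raise RuntimeError("no implementation matched runtime state")

# Example: use a "parallel" variant only while load stays low.
node = AdaptiveNode([
    (lambda s: s["load"] < 2.0, lambda s: "parallel scan"),
    (lambda s: True,            lambda s: "serial scan"),
])
print(node.execute({"load": 0.5}))  # parallel scan
print(node.execute({"load": 5.0}))  # serial scan
```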
These comments got me wondering... would it be better to decide on parallelism
during execution instead of at plan time? That would allow us to dynamically
scale parallelism based on system load. If we don't even consider parallelism
until we've pulled some number of tuples/pages from a relation, this would also
eliminate all parallel overhead on small relations.
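A minimal sketch of that execution-time decision (the threshold value and all names are invented for illustration): stay serial until some number of tuples has been pulled from the relation, then check system load once before fanning out:

```python
# Sketch of deciding on parallelism during execution instead of at
# plan time. TUPLE_THRESHOLD and the load cutoff are illustrative.

TUPLE_THRESHOLD = 10_000  # don't even consider parallelism before this

def scan(relation, current_load, max_load=4.0):
    """Yield (tuple, used_parallel) pairs; go parallel only once the
    relation has proven big enough AND system load permits."""
    tuples_seen = 0
    parallel = False
    for tup in relation:
        tuples_seen += 1
        if not parallel and tuples_seen == TUPLE_THRESHOLD:
            # Relation is big enough; fan out only if load allows.
            parallel = current_load < max_load
        yield tup, parallel

# Small relations never reach the threshold, so they pay no
# parallel overhead at all:
small = list(scan(range(100), current_load=0.5))
assert all(not p for _, p in small)
```

The point of the shape above is that both pieces of information the planner lacks, the actual relation size encountered and the current load, are available for free at this point in execution.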
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list ([email protected])