Re: How to set parallel_tuple_cost

2019-12-20 Thread Jeff Janes
On Fri, Dec 20, 2019 at 1:58 PM Tom Lane  wrote:

> Jeff Janes  writes:
> > The docs for parallel_tuple_cost are quite terse, as the reference
> > sections of the docs usually are:
> > "Sets the planner's estimate of the cost of transferring one tuple from a
> > parallel worker process to another process. The default is 0.1."
>
> > If you were to take the doc description literally, then the default value
> > seems much too high, as it doesn't take 10x the (default) cpu_tuple_cost
> > to transfer a tuple up from a parallel worker.
>
> Really?  If anything, I'd have thought it might be worse than 10x.
> Cross-process communication isn't cheap, at least not according to
> my instincts.
>

I was a bit surprised.  I set it up so that there was a fine-tunable filter
which can be applied in the parallel workers, and then only the surviving
tuples get passed up to the leader.  The use of a parallel seq scan didn't
become slower than the non-parallel version until over 95% of the tuples
were surviving the filter.  If I wanted to make the estimated cost
cross-over point match the measured time cross-over point, I had to mark
the parallel_tuple_cost down to about 0.011.  This was an 8 CPU machine, an
AWS m5.4xlarge, with max_parallel_workers_per_gather=7.  (On my crummy
2-CPU Windows 10 laptop running ubuntu via VirtualBox, the cross-over point
was closer to 40% of the tuples surviving, and the parallel_tuple_cost to
match cross-over point would be about 0.016, but I don't have enough RAM to
make a large enough all-in-shared-buffer table to really get a good
assessments).

My method was to make shared_buffers a large fraction of RAM (55GB out of
64GB), then make a table slightly smaller than that and force it into
shared_buffers with pg_prewarm.  I set seq_page_cost = random_page_cost = 0,
to accurately reflect the fact that no IO is occurring.

create table para_seq as
  select floor(random()*10000)::int as id,
         random() as x,
         md5(random()::text)||md5(random()::text) as t
  from generate_series(1,800*55);
vacuum analyze para_seq;
select pg_prewarm('para_seq');

explain (analyze, buffers, settings, timing off) select * from para_seq
where id<9500;

You can change the 9500 to tune the selectivity of the filter.  Is this the
correct way to isolate just the overhead of transferring a tuple, away from
other considerations, so it can be measured?

I don't think the fact that EXPLAIN ANALYZE throws away the result set
without reading it should change anything.  Reading it would add the same
fixed overhead to both the parallel and non-parallel cases, diluting the
percentage difference without changing the absolute difference.

I tried it with wider tuples as well, but not so wide they would activate
TOAST, and didn't really see a difference in the conclusion.


> > On the other hand, you
> > probably don't want a query which consumes 8x the CPU resources just to
> > finish only 5% faster (on an otherwise idle server with 8 CPUs).  Maybe
> > this Amdahl factor is what inspired the high default value?
>
> I think the large value of parallel_setup_cost is what's meant to
> discourage that scenario.
>

I think that can only account for overhead like forking and setting up
shared memory segments.  The overhead of moving tuples around (beyond what
single-threaded execution already does) would need to scale with the number
of tuples moved.
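To illustrate the distinction, here is a sketch of how the two knobs enter
the estimate (default values assumed: parallel_setup_cost = 1000,
parallel_tuple_cost = 0.1; again a simplification, not the real costing):

```python
# Sketch: the setup charge is fixed per Gather node, while the per-tuple
# charge grows linearly with the number of tuples sent to the leader
# (assumed default values; not PostgreSQL's actual costing code).
def parallel_overhead(rows_returned, setup_cost=1000.0, per_tuple_cost=0.1):
    return setup_cost + per_tuple_cost * rows_returned

for rows in (1_000, 100_000, 10_000_000):
    print(f"{rows:>10} tuples returned -> overhead {parallel_overhead(rows):,.0f}")
```

So parallel_setup_cost dominates only for small result sets; once many
tuples flow to the leader, the per-tuple term is what discourages the
parallel plan.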

Cheers,

Jeff


Re: How to set parallel_tuple_cost

2019-12-20 Thread Andres Freund
Hi,

On 2019-12-20 13:58:57 -0500, Tom Lane wrote:
> Jeff Janes  writes:
> > The docs for parallel_tuple_cost are quite terse, as the reference sections
> > of the docs usually are:
> > "Sets the planner's estimate of the cost of transferring one tuple from a
> > parallel worker process to another process. The default is 0.1."
> 
> > If you were to take the doc description literally, then the default value
> > seems much too high, as it doesn't take 10x the (default) cpu_tuple_cost to
> > transfer a tuple up from a parallel worker.
> 
> Really?  If anything, I'd have thought it might be worse than 10x.
> Cross-process communication isn't cheap, at least not according to
> my instincts.

+1. I did at some point measure the cost of transferring through a
tuplequeue, and it's quite expensive, compared to local tuple
handoff. Some of that is not intrinsic, and could be fixed - e.g. by
just putting pointers to tuples into the queue, instead of the whole
tuple (but that's hard due to our process model leading to dynamic shm
having differing addresses).  What's worse, putting a tuple into a
tuplequeue requires the input slot to be materialized into a HeapTuple
(should probably be MinimalTuple), which often the input will not
yet be.  So I think it'll often be much worse than 10x.
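To get a feel for why copying a materialized tuple through a queue is so much
more expensive than a local handoff, here is a rough analogy in Python.  This
is not PostgreSQL's mechanism: a pickle round trip merely stands in for
materializing a slot into a tuple and copying it through the shared-memory
queue, versus passing a pointer within one process.

```python
import pickle
import time

ROW = (42, 3.14, "x" * 64)   # stand-in for a row: int, float8, 64-char text
N = 200_000

# Local handoff: the executor hands over a pointer; nothing is copied.
t0 = time.perf_counter()
local_sink = [ROW for _ in range(N)]
local = time.perf_counter() - t0

# Cross-process handoff: every row must be flattened into a contiguous
# buffer and rebuilt on the far side (pickle round trip as a stand-in for
# materializing the slot and copying it through the tuple queue).
t0 = time.perf_counter()
queue_sink = [pickle.loads(pickle.dumps(ROW)) for _ in range(N)]
copied = time.perf_counter() - t0

print(f"pointer handoff: {local:.4f}s, copy handoff: {copied:.4f}s, "
      f"ratio: {copied / local:.0f}x")
```

The exact ratio is machine-dependent, but the serialize-copy-deserialize
path is reliably one to two orders of magnitude slower than passing a
reference, which is the intuition behind Andres's point.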

Greetings,

Andres Freund




Re: How to set parallel_tuple_cost

2019-12-20 Thread Tom Lane
Jeff Janes  writes:
> The docs for parallel_tuple_cost are quite terse, as the reference sections
> of the docs usually are:
> "Sets the planner's estimate of the cost of transferring one tuple from a
> parallel worker process to another process. The default is 0.1."

> If you were to take the doc description literally, then the default value
> seems much too high, as it doesn't take 10x the (default) cpu_tuple_cost to
> transfer a tuple up from a parallel worker.

Really?  If anything, I'd have thought it might be worse than 10x.
Cross-process communication isn't cheap, at least not according to
my instincts.

> On the other hand, you
> probably don't want a query which consumes 8x the CPU resources just to
> finish only 5% faster (on an otherwise idle server with 8 CPUs).  Maybe
> this Amdahl factor is what inspired the high default value?

I think the large value of parallel_setup_cost is what's meant to
discourage that scenario.

regards, tom lane




Re: How to set parallel_tuple_cost

2019-12-20 Thread Laurenz Albe
On Fri, 2019-12-20 at 13:03 -0500, Jeff Janes wrote:
> The docs for parallel_tuple_cost are quite terse, as the reference sections of
> the docs usually are:
> 
> "Sets the planner's estimate of the cost of transferring one tuple from a 
> parallel worker process to another process. The default is 0.1."
> 
> Usually you can find more extensive discussion of such settings in informal 
> resources like blog posts or mailing lists,
> but Googling the name I don't find much for this setting.  Is there good 
> information out there somewhere?
> 
> If you were to take the doc description literally, then the default value seems 
> much too high, as it doesn't take
> 10x the (default) cpu_tuple_cost to transfer a tuple up from a parallel 
> worker.  On the other hand, you probably
> don't want a query which consumes 8x the CPU resources just to finish only 5% 
> faster (on an otherwise idle server with 8 CPUs).
> Maybe this Amdahl factor is what inspired the high default value?

Hmm.  The parameter was introduced into the discussion here:
https://www.postgresql.org/message-id/CAA4eK1L0dk9D3hARoAb84v2pGvUw4B5YoS4x18ORQREwR%2B1VCg%40mail.gmail.com
and while the name was changed from "cpu_tuple_comm_cost" to 
"parallel_tuple_cost"
later, the default value seems not to have been the subject of discussion.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





How to set parallel_tuple_cost

2019-12-20 Thread Jeff Janes
The docs for parallel_tuple_cost are quite terse, as the reference sections
of the docs usually are:

"Sets the planner's estimate of the cost of transferring one tuple from a
parallel worker process to another process. The default is 0.1."

Usually you can find more extensive discussion of such settings in
informal resources like blog posts or mailing lists, but Googling the name
I don't find much for this setting.  Is there good information out there
somewhere?

If you were to take the doc description literally, then the default value
seems much too high, as it doesn't take 10x the (default) cpu_tuple_cost to
transfer a tuple up from a parallel worker.  On the other hand, you
probably don't want a query which consumes 8x the CPU resources just to
finish only 5% faster (on an otherwise idle server with 8 CPUs).  Maybe
this Amdahl factor is what inspired the high default value?

Cheers,

Jeff