Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-08-05 Thread Robert Haas
On Sat, Aug 5, 2017 at 6:17 PM, Peter Geoghegan wrote: > On Thu, May 4, 2017 at 7:20 PM, David Rowley > wrote: >> I ended up writing the attached (which I'd not intended to post until >> some time closer to when the doors open for PG11). At the moment

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-08-05 Thread Peter Geoghegan
On Thu, May 4, 2017 at 7:20 PM, David Rowley wrote: > I ended up writing the attached (which I'd not intended to post until > some time closer to when the doors open for PG11). At the moment it's > basically just a test patch to see how it affects things when we give

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-08 Thread Haribabu Kommi
On Mon, May 8, 2017 at 11:39 AM, David Rowley wrote: > > We really need a machine with good IO concurrency, and not too much > RAM to test these things out. It could well be that for a suitably > large enough table we'd want to scan a whole 1GB extent per worker.

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-07 Thread Thomas Munro
On Mon, May 8, 2017 at 1:39 PM, David Rowley wrote: > On 6 May 2017 at 13:44, Thomas Munro wrote: >> Experimentation required... > > Indeed. I do remember long discussions on this before Parallel seq > scan went in, but I don't recall

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-07 Thread David Rowley
On 6 May 2017 at 13:44, Thomas Munro wrote: > In Linux, each process that opens a file gets its own 'file' > object[1][5]. Each of those has its own 'file_ra_state' > object[2][3], used by ondemand_readahead[4] for sequential read > detection. So I speculate that

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-07 Thread David Rowley
On 5 May 2017 at 14:54, Thomas Munro wrote: > Just for fun, check out pages 42 and 43 of Wei Hong's thesis. He > worked on Berkeley POSTGRES parallel query and a spin-off called XPRS, > and they got linear seq scan scaling up to number of spindles: > >

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Thomas Munro
On Sat, May 6, 2017 at 7:34 AM, Robert Haas wrote: > On Thu, May 4, 2017 at 10:20 PM, David Rowley > wrote: >> Now I'm not going to pretend that this patch is ready for the >> prime-time. I've not yet worked out how to properly report

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Peter Geoghegan
On Fri, May 5, 2017 at 12:40 PM, Robert Haas wrote: > One idea that crossed my mind is to just have workers write all of > their output tuples to a temp file and have the leader read them back > in. At some cost in I/O, this would completely eliminate the overhead > of

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Andres Freund
Hi, On 2017-05-05 15:29:40 -0400, Robert Haas wrote: > On Thu, May 4, 2017 at 9:37 PM, Andres Freund wrote: > It's pretty easy (but IMHO not very interesting) to measure internal > contention in the Parallel Seq Scan node. As David points out > downthread, that problem

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Robert Haas
On Thu, May 4, 2017 at 10:36 PM, David Rowley wrote: > On 3 May 2017 at 07:13, Robert Haas wrote: >> Multiple people (including David Rowley >> as well as folks here at EnterpriseDB) have demonstrated that for >> certain queries, we can

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Robert Haas
On Thu, May 4, 2017 at 10:20 PM, David Rowley wrote: > Now I'm not going to pretend that this patch is ready for the > prime-time. I've not yet worked out how to properly report sync-scan > locations without risking reporting later pages after reporting the > end of

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Robert Haas
On Thu, May 4, 2017 at 9:37 PM, Andres Freund wrote: > Have those benchmarks, even in a very informal form, been shared / > collected / referenced centrally? I'd be very interested to know where > the different contention points are. Possibilities: > > - in non-resident

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Amit Kapila
On Fri, May 5, 2017 at 7:07 AM, Andres Freund wrote: > On 2017-05-02 15:13:58 -0400, Robert Haas wrote: >> On Tue, Apr 18, 2017 at 2:48 AM, Amit Khandekar >> wrote: >> The main thing that keeps this from being a crippling issue right now >> is the

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Amit Khandekar
On 5 May 2017 at 07:50, David Rowley wrote: > On 3 May 2017 at 07:13, Robert Haas wrote: >> It is of course possible that the Parallel Seq Scan could run into >> contention problems if the number of workers is large, but in my >> experience

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Thomas Munro
On Fri, May 5, 2017 at 2:23 PM, David Rowley wrote: > On 5 May 2017 at 13:37, Andres Freund wrote: >> On 2017-05-02 15:13:58 -0400, Robert Haas wrote: >>> Multiple people (including David Rowley >>> as well as folks here at EnterpriseDB) have

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-04 19:45:33 -0700, Andres Freund wrote: > Increment phs_cblock without checking rs_nblocks, but outside of the > lock do a % scan->rs_nblocks, to get the "actual" position. Finish if > (phs_cblock - phs_startblock) / scan->rs_nblocks >= 1. Err, as I've been pointed to: It should be

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-05 14:40:43 +1200, David Rowley wrote: > On 5 May 2017 at 14:36, Andres Freund wrote: > > I wonder how much doing the atomic ops approach alone can help, that > > doesn't have the issue that the work might be unevenly distributed > > between pages. > > I wondered

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 5 May 2017 at 14:36, Andres Freund wrote: > I wonder how much doing the atomic ops approach alone can help, that > doesn't have the issue that the work might be unevenly distributed > between pages. I wondered that too, since I thought the barrier for making this change

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 3 May 2017 at 07:13, Robert Haas wrote: > Multiple people (including David Rowley > as well as folks here at EnterpriseDB) have demonstrated that for > certain queries, we can actually use a lot more workers and everything > works great. The problem is that for other

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
Hi, On 2017-05-05 14:20:48 +1200, David Rowley wrote: > Yeah, I did get some time to look over the contention in Parallel Seq > Scan a while back and I discovered that on the machine that I was > testing on, the lock obtained in heap_parallelscan_nextpage() was > causing workers to have to wait

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 5 May 2017 at 13:37, Andres Freund wrote: > On 2017-05-02 15:13:58 -0400, Robert Haas wrote: >> Multiple people (including David Rowley >> as well as folks here at EnterpriseDB) have demonstrated that for >> certain queries, we can actually use a lot more workers and

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 3 May 2017 at 07:13, Robert Haas wrote: > It is of course possible that the Parallel Seq Scan could run into > contention problems if the number of workers is large, but in my > experience there are bigger problems here. The non-parallel Seq Scan > can also contend --

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-02 15:13:58 -0400, Robert Haas wrote: > On Tue, Apr 18, 2017 at 2:48 AM, Amit Khandekar > wrote: > The main thing that keeps this from being a crippling issue right now > is the fact that we tend not to use that many parallel workers in the > first place.

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-02 Thread Robert Haas
On Tue, Apr 18, 2017 at 2:48 AM, Amit Khandekar wrote: > After searching through earlier mails about parallel scan, I am not > sure whether the shared state was considered to be a potential factor > that might reduce parallel query gains, when deciding the calculation >