Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-10 Thread Robert Haas
On Fri, Nov 10, 2017 at 5:44 AM, Amit Kapila wrote: > I am seeing the assertion failure as below on executing the above > mentioned Create statement: > > TRAP: FailedAssertion("!(!(tup->t_data->t_infomask & 0x0008))", File: > "heapam.c", Line: 2634) > server closed the connection unexpectedly > Th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-10 Thread Amit Kapila
On Fri, Nov 10, 2017 at 9:48 AM, Amit Kapila wrote: > On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: >> On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >>> Have you set force_parallel_mode=regress; before running the >>> statement? >> >> Yes, I tried that first. >> >>> If so, then why yo

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Amit Kapila
On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >> Have you set force_parallel_mode=regress; before running the >> statement? > > Yes, I tried that first. > >> If so, then why you need to tune other parallel query >> related parameters? > >

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Robert Haas
On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: > Have you set force_parallel_mode=regress; before running the > statement? Yes, I tried that first. > If so, then why you need to tune other parallel query > related parameters? Because I couldn't get it to break the other way, I then tried th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Amit Kapila
On Fri, Nov 10, 2017 at 12:05 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: >> This change looks suspicious to me. I think here we can't use the >> tupDesc constructed from targetlist. One problem, I could see is that >> the check for hasOid setting in tlist_match

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Robert Haas
On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: > This change looks suspicious to me. I think here we can't use the > tupDesc constructed from targetlist. One problem, I could see is that > the check for hasOid setting in tlist_matches_tupdesc won't give the > correct answer. In case of th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-08 Thread Amit Kapila
On Wed, Nov 8, 2017 at 1:02 AM, Andres Freund wrote: > Hi, > > On 2017-11-06 10:56:43 +0530, Amit Kapila wrote: >> On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote >> > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: >> >> skip-gather-project-v1.patch does what it says on the tin. I still >>

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-07 Thread Andres Freund
Hi, On 2017-11-06 10:56:43 +0530, Amit Kapila wrote: > On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote > > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: > >> skip-gather-project-v1.patch does what it says on the tin. I still > >> don't have a test case for this, and I didn't find that it

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
Hi -- pgsql-hackers-ow...@postgresql.org wrote on 11/06/2017 09:47:22 AM: > From: Andres Freund > > Hi, > > Please don't top-quote on postgresql lists. Sorry > > On 2017-11-06 09:44:24 -0600, Jim Van Fleet wrote: > > > >hammerdb, in this configuration, runs a variant of tpcc > > > > > > Ha

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Andres Freund
Hi, Please don't top-quote on postgresql lists. On 2017-11-06 09:44:24 -0600, Jim Van Fleet wrote: > > >hammerdb, in this configuration, runs a variant of tpcc > > > > Hard to believe that any of the changes here are relevant in that > > case - this is parallelism specific stuff. Whereas tpcc i

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
correct > >hammerdb, in this configuration, runs a variant of tpcc > > Hard to believe that any of the changes here are relevant in that > case - this is parallelism specific stuff. Whereas tpcc is oltp, right? > > Andres

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Andres Freund
On November 6, 2017 7:30:49 AM PST, Jim Van Fleet wrote: >Andres Freund wrote on 11/05/2017 03:40:15 PM: > >hammerdb, in this configuration, runs a variant of tpcc Hard to believe that any of the changes here are relevant in that case - this is parallelism specific stuff. Whereas tpcc is oltp

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
Andres Freund wrote on 11/05/2017 03:40:15 PM: hammerdb, in this configuration, runs a variant of tpcc > > What query(s) did you measure? > > Andres

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Amit Kapila
On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: >> skip-gather-project-v1.patch does what it says on the tin. I still >> don't have a test case for this, and I didn't find that it helped very >> much, I am also wondering in which case it can

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Andres Freund
Hi, On November 5, 2017 1:33:24 PM PST, Jim Van Fleet wrote: >Ran this change with hammerdb on a power 8 firestone > >with 2 socket, 20 core >9.6 base-- 451991 NOPM >0926_master -- 464385 NOPM >11_04master -- 449177 NOPM >11_04_patch -- 431423 NOPM >-- two socket patch is a little down

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Jim Van Fleet
Ran this change with hammerdb on a power 8 firestone

with 2 socket, 20 core
9.6 base    -- 451991 NOPM
0926_master -- 464385 NOPM
11_04master -- 449177 NOPM
11_04_patch -- 431423 NOPM
-- two socket patch is a little down from previous base runs

With one socket
9.6 base    -- 393727 NO

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Robert Haas
On Sun, Nov 5, 2017 at 2:24 AM, Andres Freund wrote: >> shm-mq-reduce-receiver-latch-set-v1.patch causes the receiver to only >> consume input from the shared queue when the amount of unconsumed >> input exceeds 1/4 of the queue size. This caused a large performance >> improvement in my testing b

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Andres Freund
On 2017-11-05 01:05:59 +0100, Robert Haas wrote: > skip-gather-project-v1.patch does what it says on the tin. I still > don't have a test case for this, and I didn't find that it helped very > much, but it would probably help more in a test case with more > columns, and you said this looked like a

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Robert Haas
On Sat, Nov 4, 2017 at 5:55 PM, Andres Freund wrote: >> master: 21436.745, 20978.355, 19918.617 >> patch: 15896.573, 15880.652, 15967.176 >> >> Median-to-median, that's about a 24% improvement. > > Neat! With the attached stack of 4 patches, I get: 10811.768 ms, 10743.424 ms, 10632.006 ms, about

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Andres Freund
Hi, On 2017-11-04 16:38:31 +0530, Robert Haas wrote: > On hydra (PPC), these changes didn't help much. Timings: > > master: 29605.582, 29753.417, 30160.485 > patch: 28218.396, 27986.951, 26465.584 > > That's about a 5-6% improvement. On my MacBook, though, the > improvement was quite a bit more:

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Robert Haas
On Wed, Oct 18, 2017 at 3:09 AM, Andres Freund wrote: > 2) The spinlocks both on the sending and receiving side are quite hot: > >rafia query leader: > + 36.16% postgres postgres [.] shm_mq_receive > + 19.49% postgres postgres [.] s_lock > + 13.24% postgres

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-23 Thread Amit Kapila
On Thu, Oct 19, 2017 at 1:16 AM, Robert Haas wrote: > On Tue, Oct 17, 2017 at 5:39 PM, Andres Freund wrote: > >>b) Use shm_mq_sendv in tqueue.c by batching up insertions into the >> queue whenever it's not empty when a tuple is ready. > > Batching them with what? I hate to postpone sen

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-23 Thread Amit Kapila
On Wed, Oct 18, 2017 at 3:09 AM, Andres Freund wrote: > Hi Everyone, > > On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: >> While analysing the performance of TPC-H queries for the newly developed >> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed >> that the time taken

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-20 Thread Robert Haas
On Wed, Oct 18, 2017 at 4:30 PM, Andres Freund wrote: > Hm. I'm a bit confused/surprised by that. What'd be the worst that can > happen if we don't immediately detect that the other side is detached? > At least if we only do so in the non-blocking paths, the worst that > seems that could happen is

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-18 Thread Andres Freund
Hi, On 2017-10-18 15:46:39 -0400, Robert Haas wrote: > > 2) The spinlocks both on the sending and receiving side are quite hot: > > > >rafia query leader: > > + 36.16% postgres postgres [.] shm_mq_receive > > + 19.49% postgres postgres [.] s_lock > > + 13.24%

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-18 Thread Robert Haas
On Tue, Oct 17, 2017 at 5:39 PM, Andres Freund wrote: > The precise query doesn't really matter, the observations here are more > general, I hope. > > 1) nodeGather.c re-projects every row from workers. As far as I can tell >that's now always exactly the same targetlist as it's coming from the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi, On 2017-10-17 14:39:57 -0700, Andres Freund wrote: > I've spent some time looking into this, and I'm not quite convinced this > is the right approach. Let me start by describing where I see the > current performance problems around gather stemming from. One further approach to several of the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi Everyone, On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: > While analysing the performance of TPC-H queries for the newly developed > parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed > that the time taken by gather node is significant. On investigation, as per > the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-16 Thread Rafia Sabih
On Tue, Oct 17, 2017 at 3:22 AM, Andres Freund wrote: > Hi Rafia, > > On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: >> head: >> explain analyse select * from t where i < 3000; >> QUERY PLAN > > Could you share how exactly you generat

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-16 Thread Andres Freund
Hi Rafia, On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: > head: > explain analyse select * from t where i < 3000; > QUERY PLAN Could you share how exactly you generated the data? Just so others can compare a bit better with your res

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-11 Thread Alexander Kuzmenkov
Thanks Rafia, Amit, now I understand the ideas behind the patch better. I'll see if I can look at the new one. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-11 Thread Rafia Sabih
On Sat, Sep 9, 2017 at 8:14 AM, Amit Kapila wrote: > On Fri, Sep 8, 2017 at 11:07 PM, Alexander Kuzmenkov > wrote: >> Hi Rafia, >> >> I like the idea of reducing locking overhead by sending tuples in bulk. The >> implementation could probably be simpler: you could extend the API of shm_mq >> to d

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-08 Thread Amit Kapila
On Fri, Sep 8, 2017 at 11:07 PM, Alexander Kuzmenkov wrote: > Hi Rafia, > > I like the idea of reducing locking overhead by sending tuples in bulk. The > implementation could probably be simpler: you could extend the API of shm_mq > to decouple notifying the sender from actually putting data into

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-08 Thread Alexander Kuzmenkov
Hi Rafia, I like the idea of reducing locking overhead by sending tuples in bulk. The implementation could probably be simpler: you could extend the API of shm_mq to decouple notifying the sender from actually putting data into the queue (i.e., make shm_mq_notify_receiver public and make a va

Re: [HACKERS] [POC] Faster processing at Gather node

2017-05-19 Thread Amit Kapila
On Fri, May 19, 2017 at 5:58 PM, Robert Haas wrote: > On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih > wrote: >> While analysing the performance of TPC-H queries for the newly developed >> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed >> that the time taken by gather no

Re: [HACKERS] [POC] Faster processing at Gather node

2017-05-19 Thread Robert Haas
On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih wrote: > While analysing the performance of TPC-H queries for the newly developed > parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed > that the time taken by gather node is significant. On investigation, as per > the current me