Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-10 Thread Robert Haas
On Fri, Nov 10, 2017 at 5:44 AM, Amit Kapila wrote: > I am seeing the assertion failure as below on executing the above > mentioned Create statement: > > TRAP: FailedAssertion("!(!(tup->t_data->t_infomask & 0x0008))", File: > "heapam.c", Line: 2634) > server closed the connection unexpectedly > Th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-10 Thread Amit Kapila
On Fri, Nov 10, 2017 at 9:48 AM, Amit Kapila wrote: > On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: >> On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >>> Have you set force_parallel_mode=regress; before running the >>> statement? >> >> Yes, I tried that first. >> >>> If so, then why yo

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Amit Kapila
On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >> Have you set force_parallel_mode=regress; before running the >> statement? > > Yes, I tried that first. > >> If so, then why you need to tune other parallel query >> related parameters? > >

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Robert Haas
On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: > Have you set force_parallel_mode=regress; before running the > statement? Yes, I tried that first. > If so, then why you need to tune other parallel query > related parameters? Because I couldn't get it to break the other way, I then tried th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Amit Kapila
On Fri, Nov 10, 2017 at 12:05 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: >> This change looks suspicious to me. I think here we can't use the >> tupDesc constructed from targetlist. One problem, I could see is that >> the check for hasOid setting in tlist_match

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-09 Thread Robert Haas
On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: > This change looks suspicious to me. I think here we can't use the > tupDesc constructed from targetlist. One problem, I could see is that > the check for hasOid setting in tlist_matches_tupdesc won't give the > correct answer. In case of th

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-08 Thread Amit Kapila
On Wed, Nov 8, 2017 at 1:02 AM, Andres Freund wrote: > Hi, > > On 2017-11-06 10:56:43 +0530, Amit Kapila wrote: >> On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote >> > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: >> >> skip-gather-project-v1.patch does what it says on the tin. I still >>

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-07 Thread Andres Freund
Hi, On 2017-11-06 10:56:43 +0530, Amit Kapila wrote: > On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote > > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: > >> skip-gather-project-v1.patch does what it says on the tin. I still > >> don't have a test case for this, and I didn't find that it

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
Hi -- pgsql-hackers-ow...@postgresql.org wrote on 11/06/2017 09:47:22 AM: > From: Andres Freund > > Hi, > > Please don't top-quote on postgresql lists. Sorry > > On 2017-11-06 09:44:24 -0600, Jim Van Fleet wrote: > > > >hammerdb, in this configuration, runs a variant of tpcc > > > > > > Ha

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Andres Freund
Hi, Please don't top-quote on postgresql lists. On 2017-11-06 09:44:24 -0600, Jim Van Fleet wrote: > > >hammerdb, in this configuration, runs a variant of tpcc > > > > Hard to believe that any of the changes here are relevant in that > > case - this is parallelism specific stuff. Whereas tpcc i

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
correct > >hammerdb, in this configuration, runs a variant of tpcc > > Hard to believe that any of the changes here are relevant in that > case - this is parallelism specific stuff. Whereas tpcc is oltp, right? > > Andres > -- > Sent from my Android device with K-9 Mail. Please excuse my brevi

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Andres Freund
On November 6, 2017 7:30:49 AM PST, Jim Van Fleet wrote: >Andres Freund wrote on 11/05/2017 03:40:15 PM: > >hammerdb, in this configuration, runs a variant of tpcc Hard to believe that any of the changes here are relevant in that case - this is parallelism specific stuff. Whereas tpcc is oltp

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-06 Thread Jim Van Fleet
Andres Freund wrote on 11/05/2017 03:40:15 PM: hammerdb, in this configuration, runs a variant of tpcc > > What query(s) did you measure? > > Andres

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Amit Kapila
On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund wrote > On 2017-11-05 01:05:59 +0100, Robert Haas wrote: >> skip-gather-project-v1.patch does what it says on the tin. I still >> don't have a test case for this, and I didn't find that it helped very >> much, I am also wondering in which case it can

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Andres Freund
Hi, On November 5, 2017 1:33:24 PM PST, Jim Van Fleet wrote: >Ran this change with hammerdb on a power 8 firestone > >with 2 socket, 20 core >9.6 base-- 451991 NOPM >0926_master -- 464385 NOPM >11_04master -- 449177 NOPM >11_04_patch -- 431423 NOPM >-- two socket patch is a little down

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Jim Van Fleet
Ran this change with hammerdb on a power 8 firestone

with 2 socket, 20 core
9.6 base    -- 451991 NOPM
0926_master -- 464385 NOPM
11_04master -- 449177 NOPM
11_04_patch -- 431423 NOPM
-- two socket patch is a little down from previous base runs

With one socket
9.6 base    -- 393727 NO

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-05 Thread Robert Haas
On Sun, Nov 5, 2017 at 2:24 AM, Andres Freund wrote: >> shm-mq-reduce-receiver-latch-set-v1.patch causes the receiver to only >> consume input from the shared queue when the amount of unconsumed >> input exceeds 1/4 of the queue size. This caused a large performance >> improvement in my testing b
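The 1/4-of-queue-size threshold described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL's shm_mq code: it assumes the threshold applies to bytes the receiver has read but not yet reported back to the sender, and the function name, the 64-byte queue and the 8-byte messages are all invented for illustration.

```python
def count_notifications(message_sizes, queue_size, deferred=True):
    """Count how often a receiver reports consumed bytes back to the sender.

    Toy model only. With deferred=True the receiver reports once its
    unreported bytes exceed queue_size / 4 (the threshold named in the
    message above); with deferred=False it reports after every message.
    """
    threshold = queue_size / 4
    pending = 0        # bytes read but not yet reported to the sender
    notifications = 0
    for size in message_sizes:
        pending += size
        if not deferred or pending > threshold:
            notifications += 1  # stands in for one shared update plus latch set
            pending = 0
    return notifications


sizes = [8] * 1000  # hypothetical workload: 1000 messages of 8 bytes each
eager = count_notifications(sizes, queue_size=64, deferred=False)
lazy = count_notifications(sizes, queue_size=64, deferred=True)
print(eager, lazy)  # prints: 1000 333
```

In this model the deferred receiver touches shared state about a third as often. A real queue would presumably also have to report any leftover bytes before the receiver goes to sleep, which the sketch ignores.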

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Andres Freund
On 2017-11-05 01:05:59 +0100, Robert Haas wrote: > skip-gather-project-v1.patch does what it says on the tin. I still > don't have a test case for this, and I didn't find that it helped very > much, but it would probably help more in a test case with more > columns, and you said this looked like a

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Robert Haas
On Sat, Nov 4, 2017 at 5:55 PM, Andres Freund wrote: >> master: 21436.745, 20978.355, 19918.617 >> patch: 15896.573, 15880.652, 15967.176 >> >> Median-to-median, that's about a 24% improvement. > > Neat! With the attached stack of 4 patches, I get: 10811.768 ms, 10743.424 ms, 10632.006 ms, about
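The "median-to-median" figure quoted above can be checked from the three timings per build. A minimal sketch, assuming the timings are in milliseconds and that "improvement" means the relative drop in median runtime; the function name is illustrative and not from the thread.

```python
from statistics import median


def median_improvement(baseline_ms, patched_ms):
    """Relative reduction in median runtime, as a fraction of the baseline median."""
    base = median(baseline_ms)
    new = median(patched_ms)
    return (base - new) / base


# Timings as quoted in the message above.
master = [21436.745, 20978.355, 19918.617]
patch = [15896.573, 15880.652, 15967.176]

improvement = median_improvement(master, patch)
print(f"{improvement:.1%}")  # prints: 24.2%
```

With only three runs per build, the median is simply the middle value, so the result is just (20978.355 - 15896.573) / 20978.355.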

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Andres Freund
Hi, On 2017-11-04 16:38:31 +0530, Robert Haas wrote: > On hydra (PPC), these changes didn't help much. Timings: > > master: 29605.582, 29753.417, 30160.485 > patch: 28218.396, 27986.951, 26465.584 > > That's about a 5-6% improvement. On my MacBook, though, the > improvement was quite a bit more:

Re: [HACKERS] [POC] Faster processing at Gather node

2017-11-04 Thread Robert Haas
On Wed, Oct 18, 2017 at 3:09 AM, Andres Freund wrote: > 2) The spinlocks both on the sending and receiving side are quite hot: > >rafia query leader: > + 36.16% postgres postgres[.] shm_mq_receive > + 19.49% postgres postgres[.] s_lock > + 13.24% postgres

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-23 Thread Amit Kapila
On Thu, Oct 19, 2017 at 1:16 AM, Robert Haas wrote: > On Tue, Oct 17, 2017 at 5:39 PM, Andres Freund wrote: > >>b) Use shm_mq_sendv in tqueue.c by batching up insertions into the >> queue whenever it's not empty when a tuple is ready. > > Batching them with what? I hate to postpone sen

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-23 Thread Amit Kapila
On Wed, Oct 18, 2017 at 3:09 AM, Andres Freund wrote: > Hi Everyone, > > On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: >> While analysing the performance of TPC-H queries for the newly developed >> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed >> that the time taken

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-20 Thread Robert Haas
On Wed, Oct 18, 2017 at 4:30 PM, Andres Freund wrote: > Hm. I'm a bit confused/surprised by that. What'd be the worst that can > happen if we don't immediately detect that the other side is detached? > At least if we only do so in the non-blocking paths, the worst that > seems that could happen is

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-18 Thread Andres Freund
Hi, On 2017-10-18 15:46:39 -0400, Robert Haas wrote: > > 2) The spinlocks both on the sending and receiving side are quite hot: > > > >rafia query leader: > > + 36.16% postgres postgres[.] shm_mq_receive > > + 19.49% postgres postgres[.] s_lock > > + 13.24%

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-18 Thread Robert Haas
On Tue, Oct 17, 2017 at 5:39 PM, Andres Freund wrote: > The precise query doesn't really matter, the observations here are more > general, I hope. > > 1) nodeGather.c re-projects every row from workers. As far as I can tell >that's now always exactly the same targetlist as it's coming from the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi, On 2017-10-17 14:39:57 -0700, Andres Freund wrote: > I've spent some time looking into this, and I'm not quite convinced this > is the right approach. Let me start by describing where I see the > current performance problems around gather stemming from. One further approach to several of the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi Everyone, On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: > While analysing the performance of TPC-H queries for the newly developed > parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed > that the time taken by gather node is significant. On investigation, as per > the

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-16 Thread Rafia Sabih
On Tue, Oct 17, 2017 at 3:22 AM, Andres Freund wrote: > Hi Rafia, > > On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: >> head: >> explain analyse select * from t where i < 3000; >> QUERY PLAN > > Could you share how exactly you generat

Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-16 Thread Andres Freund
Hi Rafia, On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote: > head: > explain analyse select * from t where i < 3000; > QUERY PLAN Could you share how exactly you generated the data? Just so others can compare a bit better with your res

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-11 Thread Alexander Kuzmenkov
Thanks Rafia, Amit, now I understand the ideas behind the patch better. I'll see if I can look at the new one. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-11 Thread Rafia Sabih
On Sat, Sep 9, 2017 at 8:14 AM, Amit Kapila wrote: > On Fri, Sep 8, 2017 at 11:07 PM, Alexander Kuzmenkov > wrote: >> Hi Rafia, >> >> I like the idea of reducing locking overhead by sending tuples in bulk. The >> implementation could probably be simpler: you could extend the API of shm_mq >> to d

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-08 Thread Amit Kapila
On Fri, Sep 8, 2017 at 11:07 PM, Alexander Kuzmenkov wrote: > Hi Rafia, > > I like the idea of reducing locking overhead by sending tuples in bulk. The > implementation could probably be simpler: you could extend the API of shm_mq > to decouple notifying the sender from actually putting data into

Re: [HACKERS] [POC] Faster processing at Gather node

2017-09-08 Thread Alexander Kuzmenkov
Hi Rafia, I like the idea of reducing locking overhead by sending tuples in bulk. The implementation could probably be simpler: you could extend the API of shm_mq to decouple notifying the sender from actually putting data into the queue (i.e., make shm_mq_notify_receiver public and make a va
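The idea of decoupling notification from enqueueing can be sketched with a toy queue. This is an illustration under stated assumptions, not the shm_mq API: `ToyQueue`, its `notify` flag and `send_batched` are hypothetical names invented here, and the truncated part of the message above may propose something different.

```python
class ToyQueue:
    """Stand-in for a shared queue; not PostgreSQL's shm_mq."""

    def __init__(self):
        self.items = []
        self.notifications = 0

    def put(self, item, notify=True):
        """Enqueue one item, optionally waking the receiver right away."""
        self.items.append(item)
        if notify:
            self.notify_receiver()

    def notify_receiver(self):
        """Stands in for waking the receiving process."""
        self.notifications += 1


def send_batched(queue, tuples, batch_size):
    """Enqueue without notifying; notify once per full batch and once at the end."""
    unannounced = 0
    for t in tuples:
        queue.put(t, notify=False)
        unannounced += 1
        if unannounced == batch_size:
            queue.notify_receiver()
            unannounced = 0
    if unannounced:
        queue.notify_receiver()  # do not leave a partial batch unannounced


q = ToyQueue()
send_batched(q, range(10), batch_size=4)
print(len(q.items), q.notifications)  # prints: 10 3
```

Ten tuples produce three wake-ups instead of ten. The trade-off is latency: the receiver does not learn about a tuple until its batch is announced.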

Re: [HACKERS] [POC] Faster processing at Gather node

2017-05-19 Thread Amit Kapila
On Fri, May 19, 2017 at 5:58 PM, Robert Haas wrote: > On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih > wrote: >> While analysing the performance of TPC-H queries for the newly developed >> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed >> that the time taken by gather no

Re: [HACKERS] [POC] Faster processing at Gather node

2017-05-19 Thread Robert Haas
On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih wrote: > While analysing the performance of TPC-H queries for the newly developed > parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed > that the time taken by gather node is significant. On investigation, as per > the current me

[HACKERS] [POC] Faster processing at Gather node

2017-05-19 Thread Rafia Sabih
Hello everybody, While analysing the performance of TPC-H queries for the newly developed parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed that the time taken by gather node is significant. On investigation, as per the current method it copies each tuple to the shared que