Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Claudio Freire
On Thu, Dec 5, 2013 at 1:03 PM, Metin Doslu wrote:
>> From what I've seen so far the bigger problem than contention in the
>> lwlocks itself, is the spinlock protecting the lwlocks...
>
> Postgres 9.3.1 also reports spindelay, it seems that there is no contention
> on spinlocks.

Did you check hu

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Metin Doslu
> From what I've seen so far the bigger problem than contention in the
> lwlocks itself, is the spinlock protecting the lwlocks...

Postgres 9.3.1 also reports spindelay, it seems that there is no contention on spinlocks.

PID 21121 lwlock 0: shacq 0 exacq 33 blk 1 spindelay 0
PID 21121 lwlock 33:
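A minimal sketch of how per-lock statistics in this shape could be summed across backends to see which lwlocks block most. The line format is taken from the single complete sample line in this message. Treating `spindelay` as optional is an assumption, and the function names `summarize` and `most_blocked` are hypothetical helpers, not part of PostgreSQL.

```python
import re
from collections import defaultdict

# Line shape taken from the sample line in the message; "spindelay" is
# treated as optional on the assumption that some builds do not print it.
LINE_RE = re.compile(
    r"PID (?P<pid>\d+) lwlock (?P<lock>\d+): "
    r"shacq (?P<shacq>\d+) exacq (?P<exacq>\d+) blk (?P<blk>\d+)"
    r"(?: spindelay (?P<spindelay>\d+))?"
)

FIELDS = ("shacq", "exacq", "blk", "spindelay")


def summarize(lines):
    """Sum acquisitions, blocks and spin delays per lwlock id across all PIDs."""
    totals = defaultdict(lambda: dict.fromkeys(FIELDS, 0))
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip unrelated log output
        entry = totals[int(m.group("lock"))]
        for field in FIELDS:
            # A missing optional group is None and counts as 0.
            entry[field] += int(m.group(field) or 0)
    return dict(totals)


def most_blocked(totals, n=5):
    """Return up to n lock ids, ordered by total blk count, highest first."""
    return sorted(totals, key=lambda lock: totals[lock]["blk"], reverse=True)[:n]


if __name__ == "__main__":
    sample = [
        "PID 21121 lwlock 0: shacq 0 exacq 33 blk 1 spindelay 0",  # from the thread
        "PID 21122 lwlock 0: shacq 0 exacq 10 blk 4 spindelay 0",  # made-up line
        "PID 21122 lwlock 40: shacq 500 exacq 2 blk 0",  # made-up line, no spindelay
    ]
    totals = summarize(sample)
    for lock in most_blocked(totals):
        print(lock, totals[lock])
```

Sorting by `blk` rather than by acquisition count is a design choice: a lock that is taken often but rarely blocks is less likely to explain poor scale-out than one with few acquisitions and many blocks.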

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Metin Doslu
> You tested the correct branch, right? Which commit does "git rev-parse
> HEAD" show?

I applied the last two patches manually on PostgreSQL 9.2 Stable.

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Andres Freund
On 2013-12-05 17:46:44 +0200, Metin Doslu wrote:
> I tried your patches on next link. As you suspect I didn't see any
> improvements. I tested it on PostgreSQL 9.2 Stable.

You tested the correct branch, right? Which commit does "git rev-parse HEAD" show?

But generally, as long as your profile hid

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Metin Doslu
> You could try my lwlock-scalability improvement patches - for some
> workloads here, the improvements have been rather noticeable. Which
> version are you testing?

I tried your patches on next link. As you suspect I didn't see any improvements. I tested it on PostgreSQL 9.2 Stable.

http://git.p

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Andres Freund
On 2013-12-05 11:33:29 +0200, Metin Doslu wrote:
> > Is your workload bigger than RAM?
>
> RAM is bigger than workload (more than a couple of times).
> > I think a good bit of the contention
> > you're seeing in that listing is populating shared_buffers - and might
> > actually vanish once you're

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Metin Doslu
> Is your workload bigger than RAM?

RAM is bigger than workload (more than a couple of times).

> I think a good bit of the contention
> you're seeing in that listing is populating shared_buffers - and might
> actually vanish once you're halfway cached.
> From what I've seen so far the bigger prob

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Andres Freund
On 2013-12-05 11:15:20 +0200, Metin Doslu wrote:
> > - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem
> > disappeared for 8 core machines and came back with 16 core machines on
> > Amazon EC2. Could it be related to the PostgreSQL locking mechanism?
>
> If we build with -DLWLOCK_ST

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Metin Doslu
> - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem
> disappeared for 8 core machines and came back with 16 core machines on
> Amazon EC2. Could it be related to the PostgreSQL locking mechanism?

If we build with -DLWLOCK_STATS to print locking stats from PostgreSQL, we see tons of

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Amit Kapila
On Wed, Dec 4, 2013 at 11:49 PM, Metin Doslu wrote:
> Here is some extra information:
>
> - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem
> disappeared for 8 core machines and came back with 16 core machines on
> Amazon EC2. Could it be related to the PostgreSQL locking mechanism

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Amit Kapila
On Wed, Dec 4, 2013 at 10:40 AM, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 12:57 AM, Amit Kapila wrote:
>>> As a quick side, we also repeated the same experiment on an EC2 instance
>>> with 16 CPU cores, and found that the scale out behavior became worse there.
>>> (We also tried increasing

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> You could try my lwlock-scalability improvement patches - for some
> workloads here, the improvements have been rather noticeable. Which
> version are you testing?

I'm testing with PostgreSQL 9.3.1.

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Andres Freund
On 2013-12-04 20:19:55 +0200, Metin Doslu wrote:
> - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem
> disappeared for 8 core machines and came back with 16 core machines on
> Amazon EC2. Could it be related to the PostgreSQL locking mechanism?

You could try my lwlock-scalability im

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
Here is some extra information:

- When we increased NUM_BUFFER_PARTITIONS to 1024, this problem disappeared for 8 core machines and came back with 16 core machines on Amazon EC2. Could it be related to the PostgreSQL locking mechanism?
- I tried this test with 4 core machines including my perso

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> Didn't follow the thread from the start. So, this is EC2? Have you
> checked, with a recent enough version of top or whatever, how much time
> is reported as "stolen"?

Yes, this is EC2. "stolen" is randomly reported as 1, mostly as 0.

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Andres Freund
On 2013-12-04 16:00:40 -0200, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote:
> > All that time is spent in your virtualization solution. One thing to try
> > is to look on the host system, sometimes profiles there can be more
> > meaningful.
>
> You cannot profile th

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> You could try HVM. I've noticed it fare better under heavy CPU load,
> and it's not fully-HVM (it still uses paravirtualized network and
> I/O).

I already tried with HVM (cc2.8xlarge instance on Amazon EC2) and observed the same problem.

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Claudio Freire
On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote:
> On 2013-12-04 18:43:35 +0200, Metin Doslu wrote:
>> > I'd strongly suggest doing a "perf record -g -a ;
>> > perf report" run to check what's eating up the time.
>>
>> Here is one example:
>>
>> + 38.87% swapper [kernel.kallsyms] [k] hyp

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Andres Freund
On 2013-12-04 18:43:35 +0200, Metin Doslu wrote:
> > I'd strongly suggest doing a "perf record -g -a ;
> > perf report" run to check what's eating up the time.
>
> Here is one example:
>
> + 38.87% swapper [kernel.kallsyms] [k] hypercall_page
> + 9.32% postgres [kernel.kallsyms] [k] h

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> I'd strongly suggest doing a "perf record -g -a ;
> perf report" run to check what's eating up the time.

Here is one example:

+ 38.87% swapper [kernel.kallsyms] [k] hypercall_page
+ 9.32% postgres [kernel.kallsyms] [k] hypercall_page
+ 6.80% postgres [kernel.kallsyms] [k] xen_
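A rough sketch of how rows like the ones in this excerpt could be totalled per symbol, to see how much of the profile lands in a single entry such as hypercall_page. The row layout is assumed from the excerpt alone (percentage, command, shared object, bracketed kind marker, symbol). `share_by_symbol` is a hypothetical helper, not part of perf.

```python
import re
from collections import defaultdict

# Row shape assumed from the excerpt in the message:
#   + <percent>% <command> <shared object> [<k or .>] <symbol>
ROW_RE = re.compile(
    r"^\+?\s*(?P<pct>\d+(?:\.\d+)?)%\s+(?P<comm>\S+)\s+(?P<dso>\S+)\s+"
    r"\[(?P<kind>.)\]\s+(?P<symbol>\S+)"
)


def share_by_symbol(rows):
    """Add up the reported percentage per symbol across all commands."""
    totals = defaultdict(float)
    for row in rows:
        m = ROW_RE.match(row.strip())
        if m is None:
            continue  # headers, call-graph lines, truncated rows
        totals[m.group("symbol")] += float(m.group("pct"))
    return dict(totals)


if __name__ == "__main__":
    rows = [
        "+ 38.87% swapper [kernel.kallsyms] [k] hypercall_page",   # from the thread
        "+ 9.32% postgres [kernel.kallsyms] [k] hypercall_page",  # from the thread
        "+ 1.00% postgres postgres [.] some_function",            # made-up row
    ]
    ranked = sorted(share_by_symbol(rows).items(), key=lambda kv: kv[1], reverse=True)
    for symbol, pct in ranked:
        print(f"{pct:6.2f}%  {symbol}")
```

Grouping by symbol instead of by command adds the swapper and postgres rows for the same kernel entry point together, which is the figure that matters when asking how much time goes to the hypervisor interface.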

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> Notice the huge %sy
> What kind of VM are you using? HVM or paravirtual?

This instance is paravirtual.

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Andres Freund
On 2013-12-04 14:27:10 -0200, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote:
> >
> > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0
> > and 1.
> >
> > procs ---

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Claudio Freire
On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote:
>
> Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0
> and 1.
>
> procs ---memory-- ---swap-- -io --system--
> -cpu--

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Metin Doslu
> I think all of this data cannot fit in shared_buffers, you might want to increase shared_buffers
> to larger size (not 30GB but close to your data size) to see how it behaves.

When I use shared_buffers larger than my data size such as 10 GB, results scale nearly as expected at least for this

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-03 Thread Claudio Freire
On Wed, Dec 4, 2013 at 12:57 AM, Amit Kapila wrote:
>> As a quick side, we also repeated the same experiment on an EC2 instance
>> with 16 CPU cores, and found that the scale out behavior became worse there.
>> (We also tried increasing the shared_buffers to 30 GB. This change
>> completely solved

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-03 Thread Amit Kapila
On Tue, Dec 3, 2013 at 7:11 PM, Metin Doslu wrote:
> We have several independent tables on a multi-core machine serving Select
> queries. These tables fit into memory; and each Select query goes over one
> table's pages sequentially. In this experiment, there are no indexes or
> table joins.
>
>

[HACKERS] Parallel Select query performance and shared buffers

2013-12-03 Thread Metin Doslu
We have several independent tables on a multi-core machine serving Select queries. These tables fit into memory; and each Select query goes over one table's pages sequentially. In this experiment, there are no indexes or table joins.

When we send concurrent Select queries to these tables, query