Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-24 Thread Andreas Pflug
On 23.01.14 02:14, Jim Nasby wrote: > On 1/19/14, 5:51 PM, Dave Chinner wrote: >>> Postgres is far from being the only application that wants this; many >>> >people resort to tmpfs because of this: >>> >https://lwn.net/Articles/499410/ >> Yes, we covered the possibility of using tmpfs much earlie

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-23 Thread Gregory Smith
On 1/20/14 9:46 AM, Mel Gorman wrote: They could potentially be used to evaluate any IO scheduler changes. For example -- deadline scheduler with these parameters has X transactions/sec throughput with average latency of Y milliseconds and a maximum fsync latency of Z seconds. Evaluate how well

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Claudio Freire
On Wed, Jan 22, 2014 at 10:08 PM, Jim Nasby wrote: > > Probably more useful is the case of index scans; if we pre-read more data > from the index we could hand the kernel a list of base relation blocks that > we know we'll need. Actually, I've already tried this. The most important part is fetch

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Jim Nasby
On 1/19/14, 5:51 PM, Dave Chinner wrote: Postgres is far from being the only application that wants this; many >people resort to tmpfs because of this: >https://lwn.net/Articles/499410/ Yes, we covered the possibility of using tmpfs much earlier in the thread, and came to the conclusion that tem

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Jim Nasby
On 1/17/14, 2:24 PM, Gregory Smith wrote: I am skeptical that the database will take over very much of this work and perform better than the Linux kernel does. My take is that our most useful role would be providing test cases kernel developers can add to a performance regression suite. Ugly

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Jim Nasby
On 1/17/14, 7:57 AM, Robert Haas wrote: - WAL files are written (and sometimes read) sequentially and fsync'd very frequently and it's always good to write the data out to disk as soon as possible - Temp files are written and read sequentially and never fsync'd. They should only be written to dis

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Bruce Momjian
On Tue, Jan 21, 2014 at 09:20:52PM +0100, Jan Kara wrote: > > If we're forcing the WAL out to disk because of transaction commit or > > because we need to write the buffer protected by a certain WAL record > > only after the WAL hits the platter, then it's fine. But sometimes > > we're writing WAL

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Jan Kara
On Fri 17-01-14 08:57:25, Robert Haas wrote: > On Fri, Jan 17, 2014 at 7:34 AM, Jeff Layton wrote: > > So this says to me that the WAL is a place where DIO should really be > > reconsidered. It's mostly sequential writes that need to hit the disk > > ASAP, and you need to know that they have hit t

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Jan Kara
On Wed 22-01-14 09:07:19, Dave Chinner wrote: > On Tue, Jan 21, 2014 at 09:20:52PM +0100, Jan Kara wrote: > > > If we're forcing the WAL out to disk because of transaction commit or > > > because we need to write the buffer protected by a certain WAL record > > > only after the WAL hits the platter

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Dave Chinner
On Tue, Jan 21, 2014 at 09:20:52PM +0100, Jan Kara wrote: > On Fri 17-01-14 08:57:25, Robert Haas wrote: > > On Fri, Jan 17, 2014 at 7:34 AM, Jeff Layton wrote: > > > So this says to me that the WAL is a place where DIO should really be > > > reconsidered. It's mostly sequential writes that need t

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-22 Thread Robert Haas
On Tue, Jan 21, 2014 at 3:20 PM, Jan Kara wrote: >> But that still doesn't work out very well, because now the guy who >> does the write() has to wait for it to finish before he can do >> anything else. That's not always what we want, because WAL gets >> written out from our internal buffers for

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Bruce Momjian
On Wed, Jan 15, 2014 at 11:49:09AM +, Mel Gorman wrote: > It may be the case that mmap/madvise is still required to handle a double > buffering problem but it's far from being a free lunch and it has costs > that read/write does not have to deal with. Maybe some of these problems > can be fixed

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Jeff Layton
On Mon, 20 Jan 2014 10:51:41 +1100 Dave Chinner wrote: > On Sun, Jan 19, 2014 at 03:37:37AM +0200, Marti Raudsepp wrote: > > On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby wrote: > > > it's very common to create temporary file data that will never, ever, ever > > > actually NEED to hit disk. Where I

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Mel Gorman
On Fri, Jan 17, 2014 at 03:24:01PM -0500, Gregory Smith wrote: > On 1/17/14 10:37 AM, Mel Gorman wrote: > >There is not an easy way to tell. To be 100%, it would require an > >instrumentation patch or a systemtap script to detect when a > >particular page is being written back and track the context

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Mel Gorman
On Mon, Jan 20, 2014 at 10:51:41AM +1100, Dave Chinner wrote: > On Sun, Jan 19, 2014 at 03:37:37AM +0200, Marti Raudsepp wrote: > > On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby wrote: > > > it's very common to create temporary file data that will never, ever, ever > > > actually NEED to hit disk. Wh

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Dave Chinner
On Sun, Jan 19, 2014 at 03:37:37AM +0200, Marti Raudsepp wrote: > On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby wrote: > > it's very common to create temporary file data that will never, ever, ever > > actually NEED to hit disk. Where I work being able to tell the kernel to > > avoid flushing those f

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Marti Raudsepp
On Mon, Jan 20, 2014 at 1:51 AM, Dave Chinner wrote: >> Postgres is far from being the only application that wants this; many >> people resort to tmpfs because of this: >> https://lwn.net/Articles/499410/ > > Yes, we covered the possibility of using tmpfs much earlier in the > thread, and came to

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Gregory Smith
On 1/17/14 10:37 AM, Mel Gorman wrote: There is not an easy way to tell. To be 100%, it would require an instrumentation patch or a systemtap script to detect when a particular page is being written back and track the context. There are approximations though. Monitor nr_dirty pages over time.
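The "monitor nr_dirty pages over time" approximation mentioned above can be tried from userspace. The following is a minimal sketch, assuming a Linux system whose /proc/vmstat exposes an nr_dirty counter (a page count); the helper names, the sample count and the one-second interval are illustrative choices, not something taken from the thread.

```python
import time


def parse_nr_dirty(vmstat_text):
    """Return the nr_dirty value (a page count) from /proc/vmstat-style text."""
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Match the key exactly so nr_dirty_threshold and friends are skipped.
        if len(fields) == 2 and fields[0] == "nr_dirty":
            return int(fields[1])
    raise ValueError("nr_dirty not found")


def sample_nr_dirty(samples=5, interval=1.0, path="/proc/vmstat"):
    """Collect nr_dirty readings at a fixed interval (Linux only)."""
    readings = []
    for _ in range(samples):
        with open(path) as f:
            readings.append(parse_nr_dirty(f.read()))
        time.sleep(interval)
    return readings


if __name__ == "__main__":
    # Hypothetical usage: print a short series of dirty-page counts.
    print(sample_nr_dirty())
```

A rising series followed by a sharp drop would be consistent with a burst of writeback, though as the message notes this is only an approximation and does not identify which context triggered the writes.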

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Hannu Krosing
On 01/17/2014 06:40 AM, Dave Chinner wrote: > On Thu, Jan 16, 2014 at 08:48:24PM -0500, Robert Haas wrote: >> On Thu, Jan 16, 2014 at 7:31 PM, Dave Chinner wrote: >>> But there's something here that I'm not getting - you're talking >>> about a data set that you want to keep cache resident that is

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Mel Gorman
On Thu, Jan 16, 2014 at 04:30:59PM -0800, Jeff Janes wrote: > On Wed, Jan 15, 2014 at 2:08 AM, Mel Gorman wrote: > > > On Tue, Jan 14, 2014 at 09:30:19AM -0800, Jeff Janes wrote: > > > > > > > > That could be something we look at. There are cases buried deep in the > > > > VM where pages get shuf

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Robert Haas
On Fri, Jan 17, 2014 at 7:34 AM, Jeff Layton wrote: > So this says to me that the WAL is a place where DIO should really be > reconsidered. It's mostly sequential writes that need to hit the disk > ASAP, and you need to know that they have hit the disk before you can > proceed with other operation

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Jeff Layton
On Thu, 16 Jan 2014 20:48:24 -0500 Robert Haas wrote: > On Thu, Jan 16, 2014 at 7:31 PM, Dave Chinner wrote: > > But there's something here that I'm not getting - you're talking > > about a data set that you want to keep cache resident that is at > > least an order of magnitude larger than the c

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Dave Chinner
On Thu, Jan 16, 2014 at 08:48:24PM -0500, Robert Haas wrote: > On Thu, Jan 16, 2014 at 7:31 PM, Dave Chinner wrote: > > But there's something here that I'm not getting - you're talking > > about a data set that you want to keep cache resident that is at > > least an order of magnitude larger than

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Dave Chinner
On Wed, Jan 15, 2014 at 06:14:18PM -0600, Jim Nasby wrote: > On 1/15/14, 12:00 AM, Claudio Freire wrote: > >My completely unproven theory is that swapping is overwhelmed by > >near-misses. Ie: a process touches a page, and before it's > >actually swapped in, another process touches it too, blocking

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Dave Chinner
On Thu, Jan 16, 2014 at 03:58:56PM -0800, Jeff Janes wrote: > On Thu, Jan 16, 2014 at 3:23 PM, Dave Chinner wrote: > > > On Wed, Jan 15, 2014 at 06:14:18PM -0600, Jim Nasby wrote: > > > On 1/15/14, 12:00 AM, Claudio Freire wrote: > > > >My completely unproven theory is that swapping is overwhelme

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Robert Haas
On Thu, Jan 16, 2014 at 7:31 PM, Dave Chinner wrote: > But there's something here that I'm not getting - you're talking > about a data set that you want to keep cache resident that is at > least an order of magnitude larger than the cyclic 5-15 minute WAL > dataset that ongoing operations need to

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jeff Janes
On Thu, Jan 16, 2014 at 3:23 PM, Dave Chinner wrote: > On Wed, Jan 15, 2014 at 06:14:18PM -0600, Jim Nasby wrote: > > On 1/15/14, 12:00 AM, Claudio Freire wrote: > > >My completely unproven theory is that swapping is overwhelmed by > > >near-misses. Ie: a process touches a page, and before it's >

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Theodore Ts'o
On Wed, Jan 15, 2014 at 10:35:44AM +0100, Jan Kara wrote: > Filesystems could in theory provide facility like atomic write (at least up > to a certain size say in MB range) but it's not so easy and when there are > no strong usecases fs people are reluctant to make their code more complex > unneces

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jeff Layton
On Wed, 15 Jan 2014 21:37:16 -0500 Robert Haas wrote: > On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara wrote: > > On Wed 15-01-14 10:12:38, Robert Haas wrote: > >> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: > >> > Filesystems could in theory provide facility like atomic write (at least > >> >

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Dave Chinner
On Wed, Jan 15, 2014 at 07:31:15PM -0500, Tom Lane wrote: > Dave Chinner writes: > > On Wed, Jan 15, 2014 at 07:08:18PM -0500, Tom Lane wrote: > >> No, we'd be happy to re-request it during each checkpoint cycle, as > >> long as that wasn't an unduly expensive call to make. I'm not quite > >> sur

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jan Kara
On Wed 15-01-14 10:12:38, Robert Haas wrote: > On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: > > Filesystems could in theory provide facility like atomic write (at least up > > to a certain size say in MB range) but it's not so easy and when there are > > no strong usecases fs people are reluct

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jan Kara
On Wed 15-01-14 21:37:16, Robert Haas wrote: > On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara wrote: > > On Wed 15-01-14 10:12:38, Robert Haas wrote: > >> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: > >> > Filesystems could in theory provide facility like atomic write (at least > >> > up > >> >

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jeremy Harris
On 14/01/14 22:23, Dave Chinner wrote: On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote: To quantify that, in a production setting we were seeing pauses of up to two minutes with shared_buffers set to 8GB and default dirty ^^

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread knizhnik
I wonder if the kernel could provide a weaker version of fsync() which does not force all pending data to be written immediately but just serves as a write barrier, guaranteeing that all write operations preceding fsync() will be completed before any of the subsequent operations. It will allow

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Robert Haas
On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara wrote: > On Wed 15-01-14 10:12:38, Robert Haas wrote: >> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: >> > Filesystems could in theory provide facility like atomic write (at least up >> > to a certain size say in MB range) but it's not so easy and whe

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Wed, Jan 15, 2014 at 07:08:18PM -0500, Tom Lane wrote: > Dave Chinner writes: > > On Wed, Jan 15, 2014 at 10:12:38AM -0500, Tom Lane wrote: > >> What we'd really like for checkpointing is to hand the kernel a boatload > >> (several GB) of dirty pages and say "how about you push all this to disk

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Wed, Jan 15, 2014 at 10:12:38AM -0500, Robert Haas wrote: > On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: > > Filesystems could in theory provide facility like atomic write (at least up > > to a certain size say in MB range) but it's not so easy and when there are > > no strong usecases fs p

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Wed, Jan 15, 2014 at 07:13:27PM -0500, Tom Lane wrote: > Dave Chinner writes: > > On Wed, Jan 15, 2014 at 02:29:40PM -0800, Jeff Janes wrote: > >> And most importantly, "Also, please don't freeze up everything else in the > >> process" > > > If you hand writeback off to the kernel, then writeb

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Wed, Jan 15, 2014 at 02:29:40PM -0800, Jeff Janes wrote: > On Wed, Jan 15, 2014 at 7:12 AM, Tom Lane wrote: > > > Heikki Linnakangas writes: > > > On 01/15/2014 07:50 AM, Dave Chinner wrote: > > >> FWIW [and I know you're probably sick of hearing this by now], but > > >> the blk-io throttling

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Wed, Jan 15, 2014 at 10:12:38AM -0500, Tom Lane wrote: > Heikki Linnakangas writes: > > On 01/15/2014 07:50 AM, Dave Chinner wrote: > >> FWIW [and I know you're probably sick of hearing this by now], but > >> the blk-io throttling works almost perfectly with applications that > >> use direct IO

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Tom Lane
Robert Haas writes: > I don't see that as a problem. What we're struggling with today is > that, until we fsync(), the system is too lazy about writing back > dirty pages. And then when we fsync(), it becomes very aggressive and > system-wide throughput goes into the tank. What we're aiming to

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Tom Lane
Dave Chinner writes: > On Wed, Jan 15, 2014 at 07:08:18PM -0500, Tom Lane wrote: >> No, we'd be happy to re-request it during each checkpoint cycle, as >> long as that wasn't an unduly expensive call to make. I'm not quite >> sure where such requests ought to "live" though. One idea is to tie >>

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Robert Haas
On Wed, Jan 15, 2014 at 7:22 PM, Dave Chinner wrote: > No, I meant the opposite - in low memory situations, the system is > going to go to hell in a handbasket because we are going to cause a > writeback IO storm cleaning memory regardless of these IO > priorities. i.e. there is no way we'll let "

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Tom Lane
Dave Chinner writes: > On Wed, Jan 15, 2014 at 02:29:40PM -0800, Jeff Janes wrote: >> And most importantly, "Also, please don't freeze up everything else in the >> process" > If you hand writeback off to the kernel, then writeback for memory > reclaim needs to take precedence over "metered writeb

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Tom Lane
Dave Chinner writes: > On Wed, Jan 15, 2014 at 10:12:38AM -0500, Tom Lane wrote: >> What we'd really like for checkpointing is to hand the kernel a boatload >> (several GB) of dirty pages and say "how about you push all this to disk >> over the next few minutes, in whatever way seems optimal given

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Jeff Janes
On Wed, Jan 15, 2014 at 7:12 AM, Tom Lane wrote: > Heikki Linnakangas writes: > > On 01/15/2014 07:50 AM, Dave Chinner wrote: > >> FWIW [and I know you're probably sick of hearing this by now], but > >> the blk-io throttling works almost perfectly with applications that > >> use direct IO. >

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Jan Kara
On Wed 15-01-14 14:38:44, Hannu Krosing wrote: > On 01/15/2014 02:01 PM, Jan Kara wrote: > > On Wed 15-01-14 12:16:50, Hannu Krosing wrote: > >> On 01/14/2014 06:12 PM, Robert Haas wrote: > >>> This would be pretty similar to copy-on-write, except > >>> without the copying. It would just be > >>> f

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Claudio Freire
On Wed, Jan 15, 2014 at 3:41 PM, Stephen Frost wrote: > * Claudio Freire (klaussfre...@gmail.com) wrote: >> But, still, the implementation is very similar to what postgres needs: >> sharing a physical page for two distinct logical pages, efficiently, >> with efficient copy-on-write. > > Agreed, ex

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Stephen Frost
* Claudio Freire (klaussfre...@gmail.com) wrote: > But, still, the implementation is very similar to what postgres needs: > sharing a physical page for two distinct logical pages, efficiently, > with efficient copy-on-write. Agreed, except that KSM seems like it'd be slow/lazy about it and I'm gue

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Claudio Freire
On Wed, Jan 15, 2014 at 1:35 PM, Stephen Frost wrote: >> And there's a nice bingo. Had forgotten about KSM. KSM could help lots. >> >> I could try to see if madvising shared_buffers as mergeable helps. But >> this should be an automatic case of KSM - ie, when reading into a >> page-aligned address

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Stephen Frost
* Claudio Freire (klaussfre...@gmail.com) wrote: > Yes, that's basically zero-copy reads. > > It could be done. The kernel can remap the page to the physical page > holding the shared buffer and mark it read-only, then expire the > buffer and transfer ownership of the page if any page fault happen

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Robert Haas
On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara wrote: > Filesystems could in theory provide facility like atomic write (at least up > to a certain size say in MB range) but it's not so easy and when there are > no strong usecases fs people are reluctant to make their code more complex > unnecessarily.

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Tom Lane
Heikki Linnakangas writes: > On 01/15/2014 07:50 AM, Dave Chinner wrote: >> FWIW [and I know you're probably sick of hearing this by now], but >> the blk-io throttling works almost perfectly with applications that >> use direct IO. > For checkpoint writes, direct I/O actually would be reasona

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Heikki Linnakangas
On 01/15/2014 07:50 AM, Dave Chinner wrote: However, the first problem is dealing with the IO storm problem on fsync. Then we can measure the effect of spreading those writes out in time and determine what triggers read starvations (if they are apparent). Then we can look at whether IO scheduling

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Robert Haas
On Tue, Jan 14, 2014 at 5:23 PM, Dave Chinner wrote: > By default, background writeback doesn't start until 10% of memory > is dirtied, and on your machine that's 25GB of RAM. That's way too > high for your workload. > > It appears to me that we are seeing large memory machines much more > commonly
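The arithmetic behind the "10% of memory ... 25GB" figure quoted above can be sketched as follows. This is an illustration only: the 250 GB machine size is inferred from the quoted numbers rather than stated in the excerpt, the function name is hypothetical, and the kernel is understood to apply the ratio to dirtyable memory rather than to total RAM, so real thresholds will differ somewhat.

```python
GB = 10**9


def background_dirty_threshold(memory_bytes, ratio_percent=10):
    """Bytes of dirty data at which background writeback would start,
    treating the ratio as a simple percentage of the given memory size."""
    return memory_bytes * ratio_percent // 100


if __name__ == "__main__":
    assumed_ram = 250 * GB  # assumption: inferred from "10% ... is 25GB"
    print(background_dirty_threshold(assumed_ram) // GB)  # prints 25
```

The same function shows why a fixed percentage scales poorly: the threshold grows linearly with installed memory, which is the concern raised in the quoted text about large-memory machines.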

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Robert Haas
On Tue, Jan 14, 2014 at 4:23 PM, James Bottomley wrote: > Yes, that's what I was thinking: it's a cache. About how many files > comprise this cache? Are you thinking it's too difficult for every > process to map the files? No, I'm thinking that would throw cache coherency out the window. Separa

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Hannu Krosing
On 01/15/2014 02:01 PM, Jan Kara wrote: > On Wed 15-01-14 12:16:50, Hannu Krosing wrote: >> On 01/14/2014 06:12 PM, Robert Haas wrote: >>> This would be pretty similar to copy-on-write, except >>> without the copying. It would just be >>> forget-from-the-buffer-pool-on-write. >> +1 >> >> A version

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Jan Kara
On Wed 15-01-14 12:16:50, Hannu Krosing wrote: > On 01/14/2014 06:12 PM, Robert Haas wrote: > > This would be pretty similar to copy-on-write, except > > without the copying. It would just be > > forget-from-the-buffer-pool-on-write. > > +1 > > A version of this could probably already be impleme

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Jan Kara
On Wed 15-01-14 10:27:26, Heikki Linnakangas wrote: > On 01/15/2014 06:01 AM, Jim Nasby wrote: > >For the sake of completeness... it's theoretically silly that Postgres > >is doing all this stuff with WAL when the filesystem is doing something > >very similar with its journal. And an SSD drive (an

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Dave Chinner
On Tue, Jan 14, 2014 at 09:54:20PM -0600, Jim Nasby wrote: > On 1/14/14, 3:41 PM, Dave Chinner wrote: > >On Tue, Jan 14, 2014 at 09:40:48AM -0500, Robert Haas wrote: > >>On Mon, Jan 13, 2014 at 5:26 PM, Mel Gorman > >>wrote: Whether the problem is with the system call or the > >>programmer is hard

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Mel Gorman
On Mon, Jan 13, 2014 at 02:19:56PM -0800, James Bottomley wrote: > On Mon, 2014-01-13 at 22:12 +0100, Andres Freund wrote: > > On 2014-01-13 12:34:35 -0800, James Bottomley wrote: > > > On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote: > > > > Well, if we were to collaborate with the kernel commu

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Hannu Krosing
On 01/15/2014 12:16 PM, Hannu Krosing wrote: > On 01/14/2014 06:12 PM, Robert Haas wrote: >> This would be pretty similar to copy-on-write, except >> without the copying. It would just be >> forget-from-the-buffer-pool-on-write. > +1 > > A version of this could probably already be implemented using

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Hannu Krosing
On 01/14/2014 06:12 PM, Robert Haas wrote: > This would be pretty similar to copy-on-write, except > without the copying. It would just be > forget-from-the-buffer-pool-on-write. +1 A version of this could probably already be implemented using MADV_DONTNEED and MADV_WILLNEED. That is, just after r

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-15 Thread Heikki Linnakangas
On 01/15/2014 06:01 AM, Jim Nasby wrote: For the sake of completeness... it's theoretically silly that Postgres is doing all this stuff with WAL when the filesystem is doing something very similar with its journal. And an SSD drive (and next generation spinning rust) is doing the same thing *aga

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Jim Nasby
On 1/14/14, 10:08 AM, Tom Lane wrote: Trond Myklebust writes: On Jan 14, 2014, at 10:39, Tom Lane wrote: "Don't be aggressive" isn't good enough. The prohibition on early write has to be absolute, because writing a dirty page before we've done whatever else we need to do results in a corrupt

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Jim Nasby
On 1/14/14, 3:41 PM, Dave Chinner wrote: On Tue, Jan 14, 2014 at 09:40:48AM -0500, Robert Haas wrote: On Mon, Jan 13, 2014 at 5:26 PM, Mel Gorman wrote: IOWs, using sync_file_range() does not avoid the need to fsync() a file for data integrity purposes... I believe the PG community understand

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Dave Chinner
On Tue, Jan 14, 2014 at 03:03:39PM -0800, Kevin Grittner wrote: > Dave Chinner wrote: > > > Essentially, changing dirty_background_bytes, dirty_bytes and > > dirty_expire_centiseconds to be much smaller should make the > > kernel start writeback much sooner and so you shouldn't have to > > limit

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread James Bottomley
On Tue, 2014-01-14 at 15:09 -0500, Robert Haas wrote: > On Tue, Jan 14, 2014 at 3:00 PM, James Bottomley > wrote: > >> Doesn't sound exactly like what I had in mind. What I was suggesting > >> is an analogue of read() that, if it reads full pages of data to a > >> page-aligned address, shares the

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Dave Chinner
On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote: > Robert Haas wrote: > > Jan Kara wrote: > > > >> Just to get some idea about the sizes - how large are the > >> checkpoints we are talking about that cause IO stalls? > > > > Big. > > To quantify that, in a production setting we we

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Dave Chinner
On Tue, Jan 14, 2014 at 09:40:48AM -0500, Robert Haas wrote: > On Mon, Jan 13, 2014 at 5:26 PM, Mel Gorman wrote: > >> Amen to that. Actually, I think NUMA can be (mostly?) fixed by > >> setting zone_reclaim_mode; is there some other problem besides that? > > > > Really? > > > > zone_reclaim_mode

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread James Bottomley
On Tue, 2014-01-14 at 12:39 -0500, Robert Haas wrote: > On Tue, Jan 14, 2014 at 12:20 PM, James Bottomley > wrote: > > On Tue, 2014-01-14 at 15:15 -0200, Claudio Freire wrote: > >> On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: > >> > In terms of avoiding double-buffering, here's my thought

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
Dave Chinner wrote: > Essentially, changing dirty_background_bytes, dirty_bytes and > dirty_expire_centiseconds to be much smaller should make the > kernel start writeback much sooner and so you shouldn't have to > limit the amount of buffers the application has to prevent major > fsync triggered

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
James Bottomley wrote: >> We start by creating a chunk of shared memory that all processes >> (we do not use threads) will have mapped at a common address, >> and we read() and write() into that chunk. > > Yes, that's what I was thinking: it's a cache.  About how many > files comprise this cache?

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
I wrote: > to avoid write gluts it must often be limited to 1GB to 1GB. That should have been "1GB to 2GB."

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
James Bottomley wrote: > About how many files comprise this cache?  Are you thinking it's > too difficult for every process to map the files? The shared_buffers area can be mapping anywhere from about 200 files to millions of files, representing a total space of about 6MB on the low end to over

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 3:00 PM, James Bottomley wrote: >> Doesn't sound exactly like what I had in mind. What I was suggesting >> is an analogue of read() that, if it reads full pages of data to a >> page-aligned address, shares the data with the buffer cache until it's >> first written instead

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
Robert Haas wrote: > Jan Kara wrote: > >> Just to get some idea about the sizes - how large are the >> checkpoints we are talking about that cause IO stalls? > > Big. To quantify that, in a production setting we were seeing pauses of up to two minutes with shared_buffers set to 8GB and default d

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: > I dunno what a typical checkpoint size is but I don't think you'll be > exaggerating much if you imagine that everything that could possibly > be dirty is. This is not uncommon for us, at least: checkpoint complete: wrote 425844 buffers (20.3%); 0 tr
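To put the quoted log line in terms of data volume, the buffer count can be converted to bytes. A rough sketch, assuming PostgreSQL's default 8192-byte block size (the excerpt does not state the block size in use); the function names are hypothetical.

```python
BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size in bytes
GIB = 2**30


def buffers_to_bytes(buffers, block_size=BLOCK_SIZE):
    """Convert a count of shared buffers into bytes."""
    return buffers * block_size


def implied_total_buffers(buffers_written, percent_of_total):
    """Estimate the total buffer count from a 'wrote N buffers (P%)' line."""
    return round(buffers_written * 100 / percent_of_total)


if __name__ == "__main__":
    written = buffers_to_bytes(425844)
    total = buffers_to_bytes(implied_total_buffers(425844, 20.3))
    print(f"written: {written / GIB:.2f} GiB, implied pool: {total / GIB:.1f} GiB")
```

Under that block-size assumption the checkpoint wrote a little over 3 GiB, and the 20.3% figure suggests a shared_buffers setting of roughly 16 GiB; the latter is an estimate derived from the rounded percentage, not a value given in the message.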

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Jan Kara
On Tue 14-01-14 10:04:16, Robert Haas wrote: > On Tue, Jan 14, 2014 at 5:00 AM, Jan Kara wrote: > > I thought that instead of injecting pages into pagecache for aging as you > > describe in 3), you would mark pages as volatile (i.e. for reclaim by > > kernel) through vrange() syscall. Next time yo

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 1:37 PM, Jan Kara wrote: > Just to get some idea about the sizes - how large are the checkpoints we > are talking about that cause IO stalls? Big. Potentially, we might have dirtied all of shared_buffers and then started evicting pages from there to the OS buffer pool and

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Jan Kara
On Tue 14-01-14 06:42:43, Kevin Grittner wrote: > First off, I want to give a +1 on everything in the recent posts > from Heikki and Hannu. > > Jan Kara wrote: > > > Now the aging of pages marked as volatile as it is currently > > implemented needn't be perfect for your needs but you still have

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Stephen Frost
* Claudio Freire (klaussfre...@gmail.com) wrote: > On Tue, Jan 14, 2014 at 2:17 PM, Robert Haas wrote: > > I don't know either. I wasn't thinking so much that it would save CPU > > time as that it would save memory. Consider a system with 32GB of > > RAM. If you set shared_buffers=8GB, then in

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Claudio Freire
On Tue, Jan 14, 2014 at 2:39 PM, Robert Haas wrote: > On Tue, Jan 14, 2014 at 12:20 PM, James Bottomley > wrote: >> On Tue, 2014-01-14 at 15:15 -0200, Claudio Freire wrote: >>> On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: >>> > In terms of avoiding double-buffering, here's my thought afte

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Claudio Freire
On Tue, Jan 14, 2014 at 2:17 PM, Robert Haas wrote: > On Tue, Jan 14, 2014 at 12:15 PM, Claudio Freire > wrote: >> On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: >>> In terms of avoiding double-buffering, here's my thought after reading >>> what's been written so far. Suppose we read a pa

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Jeff Janes
On Mon, Jan 13, 2014 at 6:44 PM, Dave Chinner wrote: > On Tue, Jan 14, 2014 at 02:26:25AM +0100, Andres Freund wrote: > > On 2014-01-13 17:13:51 -0800, James Bottomley wrote: > > > a file into a user provided buffer, thus obtaining a page cache entry > > > and a copy in their userspace buffer, th

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread James Bottomley
On Tue, 2014-01-14 at 10:39 -0500, Tom Lane wrote: > James Bottomley writes: > > The current mechanism for coherency between a userspace cache and the > > in-kernel page cache is mmap ... that's the only way you get the same > > page in both currently. > > Right. > > > glibc used to have an impl

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread James Bottomley
On Tue, 2014-01-14 at 15:15 -0200, Claudio Freire wrote: > On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: > > > > In terms of avoiding double-buffering, here's my thought after reading > > what's been written so far. Suppose we read a page into our buffer > > pool. Until the page is clean,

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread James Bottomley
On Tue, 2014-01-14 at 11:48 -0500, Robert Haas wrote: > On Tue, Jan 14, 2014 at 11:44 AM, James Bottomley > wrote: > > No, I'm sorry, that's never going to be possible. No user space > > application has all the facts. If we give you an interface to force > > unconditional holding of dirty pages

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Tom Lane
Robert Haas writes: > On Tue, Jan 14, 2014 at 11:57 AM, James Bottomley > wrote: >> No, I do ... you mean the order of write out, if we have to do it, is >> important. In the rest of the kernel, we do this with barriers which >> causes ordered grouping of I/O chunks. If we could force a similar

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 12:20 PM, James Bottomley wrote: > On Tue, 2014-01-14 at 15:15 -0200, Claudio Freire wrote: >> On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: >> > In terms of avoiding double-buffering, here's my thought after reading >> > what's been written so far. Suppose we read

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
James Bottomley wrote: > you mean the order of write out, if we have to do it, is > important.  In the rest of the kernel, we do this with barriers > which causes ordered grouping of I/O chunks.  If we could force a > similar ordering in the writeout code, is that enough? Unless it can be betwee

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Hannu Krosing
On 01/14/2014 05:44 PM, James Bottomley wrote: > On Tue, 2014-01-14 at 10:39 -0500, Tom Lane wrote: >> James Bottomley writes: >>> The current mechanism for coherency between a userspace cache and the >>> in-kernel page cache is mmap ... that's the only way you get the same >>> page in both curren

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Kevin Grittner
Claudio Freire wrote: > Robert Haas wrote: >> James Bottomley wrote: >>> I don't understand why this has to be absolute: if you advise >>> us to hold the pages dirty and we do up until it becomes a >>> choice to hold on to the pages or to thrash the system into a >>> livelock, why would you eve

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 12:15 PM, Claudio Freire wrote: > On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: >> In terms of avoiding double-buffering, here's my thought after reading >> what's been written so far. Suppose we read a page into our buffer >> pool. Until the page is clean, it woul

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Claudio Freire
On Tue, Jan 14, 2014 at 2:12 PM, Robert Haas wrote: > > In terms of avoiding double-buffering, here's my thought after reading > what's been written so far. Suppose we read a page into our buffer > pool. Until the page is clean, it would be ideal for the mapping to > be shared between the buffer

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 12:12 PM, Robert Haas wrote: > In terms of avoiding double-buffering, here's my thought after reading > what's been written so far. Suppose we read a page into our buffer > pool. Until the page is clean, it would be ideal for the mapping to Correction: "For so long as th

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 11:57 AM, James Bottomley wrote: > On Tue, 2014-01-14 at 11:48 -0500, Robert Haas wrote: >> On Tue, Jan 14, 2014 at 11:44 AM, James Bottomley >> wrote: >> > No, I'm sorry, that's never going to be possible. No user space >> > application has all the facts. If we give you

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Heikki Linnakangas
On 01/14/2014 06:08 PM, Tom Lane wrote: Trond Myklebust writes: On Jan 14, 2014, at 10:39, Tom Lane wrote: "Don't be aggressive" isn't good enough. The prohibition on early write has to be absolute, because writing a dirty page before we've done whatever else we need to do results in a corru

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Claudio Freire
On Tue, Jan 14, 2014 at 1:48 PM, Robert Haas wrote: > On Tue, Jan 14, 2014 at 11:44 AM, James Bottomley > wrote: >> No, I'm sorry, that's never going to be possible. No user space >> application has all the facts. If we give you an interface to force >> unconditional holding of dirty pages in c

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 11:44 AM, James Bottomley wrote: > No, I'm sorry, that's never going to be possible. No user space > application has all the facts. If we give you an interface to force > unconditional holding of dirty pages in core you'll livelock the system > eventually because you made
