Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-09 Thread Amit Kapila
-Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Tuesday, April 09, 2013 9:28 AM To: Amit Kapila Cc: Greg Smith; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Page replacement algorithm in buffer cache On Fri, Apr 5, 2013 at 11:08 PM, Amit Kapila

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-08 Thread Robert Haas
On Fri, Apr 5, 2013 at 11:08 PM, Amit Kapila amit.kap...@huawei.com wrote: I still have one more doubt, consider the below scenario for cases when we Invalidate buffers during moving to freelist v/s just move to freelist Backend got the buffer from freelist for a request of page-9 (number 9

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-05 Thread Robert Haas
On Fri, Apr 5, 2013 at 1:12 AM, Amit Kapila amit.kap...@huawei.com wrote: If we just put it to freelist, then next time if it get allocated directly from bufhash table, then who will remove it from freelist or do you think that, in BufferAlloc, if it gets from bufhash table, then it should

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-05 Thread Amit Kapila
On Saturday, April 06, 2013 12:38 AM Robert Haas wrote: On Fri, Apr 5, 2013 at 1:12 AM, Amit Kapila amit.kap...@huawei.com wrote: If we just put it to freelist, then next time if it get allocated directly from bufhash table, then who will remove it from freelist or do you think that, in

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-04 Thread Amit Kapila
On Thursday, April 04, 2013 7:19 AM Greg Smith wrote: On 4/2/13 11:54 AM, Robert Haas wrote: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the bgwriter needs to populate the free list, so that buffer allocations don't

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-04 Thread Robert Haas
On Wed, Apr 3, 2013 at 9:49 PM, Greg Smith g...@2ndquadrant.com wrote: On 4/2/13 11:54 AM, Robert Haas wrote: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the bgwriter needs to populate the free list, so that buffer

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-04 Thread Amit Kapila
On Thursday, April 04, 2013 6:12 PM Robert Haas wrote: On Wed, Apr 3, 2013 at 9:49 PM, Greg Smith g...@2ndquadrant.com wrote: On 4/2/13 11:54 AM, Robert Haas wrote: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Robert Haas
On Tue, Apr 2, 2013 at 1:20 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-04-02 12:56:56 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-04-02 12:22:03 -0400, Tom Lane wrote: I agree in general, though I'm not sure the bgwriter process can reasonably

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Greg Stark
On Wed, Apr 3, 2013 at 3:00 PM, Robert Haas robertmh...@gmail.com wrote: The main hesitation I've had about actually implementing such a scheme is that I find it a bit unappealing to have a background process dedicated to just this. But maybe it could be combined with some of the other ideas

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Tom Lane
Greg Stark st...@mit.edu writes: I'm still skeptical about the idea of a freelist. That just seems like a terrible point of contention. However perhaps that's because I'm picturing an LRU linked list. Perhaps the right thing is to maintain a pool of buffers in some less contention-prone data

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Ants Aasma
On Thu, Apr 4, 2013 at 3:41 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think the original vision of the clock sweep algorithm included the idea that different backends could be running the sweep over different parts of the buffer ring concurrently. If we could get rid of the need to serialize

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Greg Smith
On 4/2/13 11:54 AM, Robert Haas wrote: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the bgwriter needs to populate the free list, so that buffer allocations don't have to wait for linear scans of the buffer array. I was

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Andres Freund
On 2013-04-01 17:56:19 -0500, Jim Nasby wrote: On 3/23/13 7:41 AM, Ants Aasma wrote: Yes, having bgwriter do the actual cleaning up seems like a good idea. The whole bgwriter infrastructure will need some serious tuning. There are many things that could be shifted to background if we knew it

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Robert Haas
On Tue, Apr 2, 2013 at 6:32 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-04-01 17:56:19 -0500, Jim Nasby wrote: On 3/23/13 7:41 AM, Ants Aasma wrote: Yes, having bgwriter do the actual cleaning up seems like a good idea. The whole bgwriter infrastructure will need some serious

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Robert Haas
On Tue, Apr 2, 2013 at 1:53 AM, Merlin Moncure mmonc...@gmail.com wrote: That seems pretty unlikely because of A sheer luck of hitting that page for the dropout (if your buffer count is N the chances of losing it would seem to be 1/N at most) and B highly used pages are much more likely to be

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Merlin Moncure
On Tue, Apr 2, 2013 at 9:55 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Apr 2, 2013 at 1:53 AM, Merlin Moncure mmonc...@gmail.com wrote: That seems pretty unlikely because of A sheer luck of hitting that page for the dropout (if your buffer count is N the chances of losing it would

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Robert Haas
On Tue, Apr 2, 2013 at 11:32 AM, Merlin Moncure mmonc...@gmail.com wrote: That's a very fair point, although not being able to evict pinned buffers is a highly mitigating aspect. Also CLOG is a different beast entirely -- it's much more dense (2 bits!) vs a tuple so a single page can a lot of

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the bgwriter needs to populate the free list, so that buffer allocations don't have to wait for linear scans of the buffer array.

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Andres Freund
On 2013-04-02 11:54:32 -0400, Robert Haas wrote: On Tue, Apr 2, 2013 at 11:32 AM, Merlin Moncure mmonc...@gmail.com wrote: That's a very fair point, although not being able to evict pinned buffers is a highly mitigating aspect. Also CLOG is a different beast entirely -- it's much more

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Andres Freund
On 2013-04-02 12:22:03 -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: But, having said that, I still think the best idea is what Andres proposed, which pretty much matches my own thoughts: the bgwriter needs to populate the free list, so that buffer allocations don't have

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2013-04-02 12:22:03 -0400, Tom Lane wrote: I agree in general, though I'm not sure the bgwriter process can reasonably handle this need along with what it's already supposed to be doing. We may need another background process that is just

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Andres Freund
On 2013-04-02 12:56:56 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-04-02 12:22:03 -0400, Tom Lane wrote: I agree in general, though I'm not sure the bgwriter process can reasonably handle this need along with what it's already supposed to be doing. We may

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Greg Stark
I'm confused by this thread. We *used* to maintain an LRU. The whole reason for the clock-sweep algorithm is precisely to avoid maintaining a linked list of least recently used buffers since the head of that list is a point of contention. -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Andres Freund
On 2013-04-02 18:26:23 +0100, Greg Stark wrote: I'm confused by this thread. We *used* to maintain an LRU. The whole reason for the clock-sweep algorithm is precisely to avoid maintaining a linked list of least recently used buffers since the head of that list is a point of contention. I

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Atri Sharma
On Tue, Apr 2, 2013 at 9:24 PM, Robert Haas robertmh...@gmail.com wrote: One thought I had for fiddling with usage_count is to make it grow additively (x = x + 1) and decay exponentially (x = x 1). I'm not sure the idea is any good, but one problem with the current system is that it's

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Merlin Moncure
On Tue, Apr 2, 2013 at 12:50 PM, Atri Sharma atri.j...@gmail.com wrote: On Tue, Apr 2, 2013 at 9:24 PM, Robert Haas robertmh...@gmail.com wrote: One thought I had for fiddling with usage_count is to make it grow additively (x = x + 1) and decay exponentially (x = x 1). I'm not sure the idea

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-02 Thread Atri Sharma
Sent from my iPad On 02-Apr-2013, at 23:41, Merlin Moncure mmonc...@gmail.com wrote: On Tue, Apr 2, 2013 at 12:50 PM, Atri Sharma atri.j...@gmail.com wrote: On Tue, Apr 2, 2013 at 9:24 PM, Robert Haas robertmh...@gmail.com wrote: One thought I had for fiddling with usage_count is to make

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Merlin Moncure
On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Friday, March 22, 2013, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: well if you do a non-locking test first you could at least avoid some cases (and, if you get the

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Robert Haas
On Mon, Apr 1, 2013 at 9:28 AM, Merlin Moncure mmonc...@gmail.com wrote: That is one of multiple issues. Contention on the BufFreelistLock is another one. I agree that usage_count maintenance is unlikely to become a bottleneck unless one or both of those is fixed first (and maybe not even

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Merlin Moncure
On Mon, Apr 1, 2013 at 10:03 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 1, 2013 at 9:28 AM, Merlin Moncure mmonc...@gmail.com wrote: That is one of multiple issues. Contention on the BufFreelistLock is another one. I agree that usage_count maintenance is unlikely to become a

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Bruce Momjian
On Mon, Apr 1, 2013 at 11:55:07AM -0500, Merlin Moncure wrote: In fact, BufFreelistLock is really misnamed, because for the most part, the free list as we implement is going to be empty. What the BufFreelistLock is really doing is serializing the process of scanning for a free buffer. I

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Merlin Moncure
On Mon, Apr 1, 2013 at 3:32 PM, Bruce Momjian br...@momjian.us wrote: On Mon, Apr 1, 2013 at 11:55:07AM -0500, Merlin Moncure wrote: In fact, BufFreelistLock is really misnamed, because for the most part, the free list as we implement is going to be empty. What the BufFreelistLock is

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Andres Freund
On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote: On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Friday, March 22, 2013, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: well if you do a non-locking test first

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Merlin Moncure
On Mon, Apr 1, 2013 at 4:09 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote: On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Friday, March 22, 2013, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Jim Nasby
On 3/23/13 7:41 AM, Ants Aasma wrote: On Sat, Mar 23, 2013 at 6:04 AM, Jim Nasby j...@nasby.net wrote: Partitioned clock sweep strikes me as a bad idea... you could certainly get unlucky and end up with a lot of hot stuff in one partition. Surely that is not worse than having everything in a

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Jim Nasby
On 4/1/13 4:55 PM, Merlin Moncure wrote: On Mon, Apr 1, 2013 at 4:09 PM, Andres Freundand...@2ndquadrant.com wrote: On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote: On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janesjeff.ja...@gmail.com wrote: On Friday, March 22, 2013, Ants Aasma wrote: On

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Jim Nasby
On 3/23/13 4:43 AM, Amit Kapila wrote: I have tried one of the idea's : Adding the buffers background writer finds reusable to freelist. http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C382852FF97@szxeml509-mbs This can reduce the clock swipe as it can find buffers from

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Atri Sharma
I don't be believe there is any reasonable argument that sitting and spinning while holding the BufFreelistLock is a good idea. I completely agree. The idea of spinning for a lock while already inside a lock seems like a source of a hit to performance. Regards, Atri -- Regards, Atri

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Merlin Moncure
On Mon, Apr 1, 2013 at 6:09 PM, Jim Nasby j...@nasby.net wrote: On 4/1/13 4:55 PM, Merlin Moncure wrote: On Mon, Apr 1, 2013 at 4:09 PM, Andres Freundand...@2ndquadrant.com wrote: On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote: On Sun, Mar 31, 2013 at 1:27 PM, Jeff

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-01 Thread Amit Kapila
-Original Message- From: Jim Nasby [mailto:j...@nasby.net] Sent: Tuesday, April 02, 2013 4:43 AM To: Amit Kapila Cc: 'Ants Aasma'; 'Merlin Moncure'; 'Tom Lane'; 'Atri Sharma'; 'Greg Stark'; 'PostgreSQL-development' Subject: Re: [HACKERS] Page replacement algorithm in buffer cache

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-31 Thread Jeff Janes
On Friday, March 22, 2013, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.comjavascript:; wrote: well if you do a non-locking test first you could at least avoid some cases (and, if you get the answer wrong, so what?) by jumping to the next buffer

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-26 Thread Bruce Momjian
On Fri, Mar 22, 2013 at 06:06:18PM +, Greg Stark wrote: On Fri, Mar 22, 2013 at 2:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: And we definitely looked at ARC We didn't just look at it. At least one release used it. Then patent issues were raised (and I think the implementation had some

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-26 Thread Bruce Momjian
On Fri, Mar 22, 2013 at 04:16:18PM -0400, Tom Lane wrote: Merlin Moncure mmonc...@gmail.com writes: I think there is some very low hanging optimization fruit in the clock sweep loop. first and foremost, I see no good reason why when scanning pages we have to spin and wait on a buffer in

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-26 Thread Merlin Moncure
On Tue, Mar 26, 2013 at 11:40 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Mar 22, 2013 at 04:16:18PM -0400, Tom Lane wrote: Merlin Moncure mmonc...@gmail.com writes: I think there is some very low hanging optimization fruit in the clock sweep loop. first and foremost, I see no good

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Atri Sharma
Perhaps this isn't the help you were looking for, but I spent a long time looking into this a few years ago. Then I stopped and decided to work on other things. I would recommend you do so too. Agreed. It seems that my concerns were not valid, and since you have already done some testing

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Atri Sharma
I'll have to take a look. Removing *all spinning* from from page allocation though feels like it might be worthwhile to test (got to give some bonus points for being a very local change and simple to implement). I wonder if with more shared buffers you tend to sweep more buffers per

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Ants Aasma
On Sun, Mar 24, 2013 at 9:29 AM, Atri Sharma atri.j...@gmail.com wrote: I'll have to take a look. Removing *all spinning* from from page allocation though feels like it might be worthwhile to test (got to give some bonus points for being a very local change and simple to implement). I wonder

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Greg Smith
On 3/22/13 8:45 AM, Ants Aasma wrote: However, I think the main issue isn't finding new algorithms that are better in some specific circumstances. The hard part is figuring out whether their performance is better in general. Right. The current page replacement method works as expected. Many

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Atri Sharma
On Sun, Mar 24, 2013 at 6:11 AM, Greg Smith g...@2ndquadrant.com wrote: On 3/22/13 8:45 AM, Ants Aasma wrote: However, I think the main issue isn't finding new algorithms that are better in some specific circumstances. The hard part is figuring out whether their performance is better in

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Atri Sharma
If we use the value calculated locally to decide on eviction, the highly used buffers where this is likely will get at least one clock sweep cycle of grace time. If they are indeed highly used it's likely that someone will manage to bump the usage_count in the meanwhile. If they are not hot,

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-24 Thread Ants Aasma
On Sun, Mar 24, 2013 at 7:03 PM, Atri Sharma atri.j...@gmail.com wrote: So, essentially, we decide locally which page to evict, then try to get a lock on the header only when we want to evict the victim page? Sounds like the contention for the header should go down considerably. Not exactly

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Jim Nasby
On 3/22/13 7:27 PM, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: well if you do a non-locking test first you could at least avoid some cases (and, if you get the answer wrong, so what?) by jumping to the next buffer immediately. if the non

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Amit Kapila
On Saturday, March 23, 2013 9:34 AM Jim Nasby wrote: On 3/22/13 7:27 PM, Ants Aasma wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: One other interesting idea I have seen is closeable scalable nonzero indication (C-SNZI) from scalable rw-locks [1]. The

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Ants Aasma
On Sat, Mar 23, 2013 at 6:29 AM, Atri Sharma atri.j...@gmail.com wrote: One way to distribute memory contention in case of spinlocks could be to utilize the fundamentals of NUMA architecture. Specifically, we can let the contending backends spin on local flags instead on the buffer header

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Ants Aasma
On Sat, Mar 23, 2013 at 6:04 AM, Jim Nasby j...@nasby.net wrote: Partitioned clock sweep strikes me as a bad idea... you could certainly get unlucky and end up with a lot of hot stuff in one partition. Surely that is not worse than having everything in a single partition. Given a decent

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Atri Sharma
Partitioned clock sweep strikes me as a bad idea... you could certainly get unlucky and end up with a lot of hot stuff in one partition. Another idea that'sbeen broughht up inthe past is to have something in the background keep a minimum number of buffers on the free list. That's how OS VM

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Jeff Janes
On Thu, Mar 21, 2013 at 9:51 PM, Atri Sharma atri.j...@gmail.com wrote: Hello all, Sorry if this is a naive question. I was going through Greg Smith's slides on buffer cache( http://www.westnet.com/~gsmith/content/postgresql/InsideBufferCache.pdf). When going through the page replacement

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Merlin Moncure
On Fri, Mar 22, 2013 at 7:27 PM, Ants Aasma a...@cybertec.at wrote: On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: well if you do a non-locking test first you could at least avoid some cases (and, if you get the answer wrong, so what?) by jumping to the next buffer

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes: I'm more convinced in the other direction, new pages should enter with 0 rather than with 1. I think that the argument that a new buffer needs to be given more of an opportunity to get used again is mostly bogus. IIRC, the argument for starting at 1 not

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-23 Thread Jeff Janes
On Fri, Mar 22, 2013 at 4:06 AM, Atri Sharma atri.j...@gmail.com wrote: Not yet, I figured this might be a problem and am designing test cases for the same. I would be glad for some help there please. Perhaps this isn't the help you were looking for, but I spent a long time looking into

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
Sent from my iPad On 22-Mar-2013, at 11:28, Amit Kapila amit.kap...@huawei.com wrote: On Friday, March 22, 2013 10:22 AM Atri Sharma wrote: Hello all, Sorry if this is a naive question. I was going through Greg Smith's slides on buffer

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Amit Kapila
On Friday, March 22, 2013 12:00 PM Atri Sharma wrote: Sent from my iPad On 22-Mar-2013, at 11:28, Amit Kapila amit.kap...@huawei.com wrote: On Friday, March 22, 2013 10:22 AM Atri Sharma wrote: Hello all, Sorry if this is a naive question. I was going through Greg Smith's

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
I think that if the initialization of USAGE_COUNT starts at the maximum allowed value instead of one, we can have a better solution to this problem. So what is your idea, if you start at maximum, what we will do for further accesses to it? I havent chalked out a detailed plan yet, but I

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Amit Kapila
On Friday, March 22, 2013 4:16 PM Atri Sharma wrote: I think that if the initialization of USAGE_COUNT starts at the maximum allowed value instead of one, we can have a better solution to this problem. So what is your idea, if you start at maximum, what we will do for further

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
What would you do if the only young page has usage count zero during second sweep. UmmThe same approach we take when there is no page with usage count zero in a sweep in the current algorithm? I don't think introducing another factor along with usage count would do any much help.

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Amit Kapila
On Friday, March 22, 2013 4:36 PM Atri Sharma wrote: What would you do if the only young page has usage count zero during second sweep. UmmThe same approach we take when there is no page with usage count zero in a sweep in the current algorithm? It would give more priority to young

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
On Fri, Mar 22, 2013 at 4:53 PM, Amit Kapila amit.kap...@huawei.com wrote: On Friday, March 22, 2013 4:36 PM Atri Sharma wrote: What would you do if the only young page has usage count zero during second sweep. UmmThe same approach we take when there is no page with usage count zero

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Ants Aasma
On Mar 22, 2013 12:46 PM, Atri Sharma atri.j...@gmail.com wrote: This is the one I think would work out best, add an age factor as to the time duration which an entry has spent in the cache along with its usage count. You might want to check out the LIRS cache replacement algorithm [1]. That

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
However, I think the main issue isn't finding new algorithms that are better in some specific circumstances. The hard part is figuring out whether their performance is better in general. My idea was to create a patch to capture page pinning traffic from PostgreSQL (maybe stream out into a

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Tom Lane
Ants Aasma a...@cybertec.at writes: You might want to check out the LIRS cache replacement algorithm [1]. That algorithm tries to estimate least frequently used instead of least recently used. Mysql uses it for their buffer replacement policy. There is also a clock sweep based approximation

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Greg Stark
On Fri, Mar 22, 2013 at 2:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: And we definitely looked at ARC We didn't just look at it. At least one release used it. Then patent issues were raised (and I think the implementation had some contention problems). -- greg -- Sent via pgsql-hackers

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
On Fri, Mar 22, 2013 at 11:36 PM, Greg Stark st...@mit.edu wrote: On Fri, Mar 22, 2013 at 2:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: And we definitely looked at ARC We didn't just look at it. At least one release used it. Then patent issues were raised (and I think the implementation had

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Merlin Moncure
On Fri, Mar 22, 2013 at 1:13 PM, Atri Sharma atri.j...@gmail.com wrote: On Fri, Mar 22, 2013 at 11:36 PM, Greg Stark st...@mit.edu wrote: On Fri, Mar 22, 2013 at 2:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: And we definitely looked at ARC We didn't just look at it. At least one release used

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes: On Fri, Mar 22, 2013 at 1:13 PM, Atri Sharma atri.j...@gmail.com wrote: What is the general thinking? Is it time to start testing again and thinking about improvements to the current algorithm? well, what problem are you trying to solve exactly? the

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Merlin Moncure
On Fri, Mar 22, 2013 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: On Fri, Mar 22, 2013 at 1:13 PM, Atri Sharma atri.j...@gmail.com wrote: What is the general thinking? Is it time to start testing again and thinking about improvements to the current

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes: I think there is some very low hanging optimization fruit in the clock sweep loop. first and foremost, I see no good reason why when scanning pages we have to spin and wait on a buffer in order to pedantically adjust usage_count. some simple

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Merlin Moncure
On Fri, Mar 22, 2013 at 3:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: I think there is some very low hanging optimization fruit in the clock sweep loop. first and foremost, I see no good reason why when scanning pages we have to spin and wait on a

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Ants Aasma
On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure mmonc...@gmail.com wrote: well if you do a non-locking test first you could at least avoid some cases (and, if you get the answer wrong, so what?) by jumping to the next buffer immediately. if the non locking test comes good, only then do you

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-22 Thread Atri Sharma
Moreover, if the buffer happens to miss a decrement due to a data race, there's a good chance that the buffer is heavily used and wouldn't need to be evicted soon anyway. (if you arrange it to be a read-test-inc/dec-store operation then you will never go out of bounds) However, clocksweep

Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-21 Thread Amit Kapila
On Friday, March 22, 2013 10:22 AM Atri Sharma wrote: Hello all, Sorry if this is a naive question. I was going through Greg Smith's slides on buffer cache(http://www.westnet.com/~gsmith/content/postgresql/InsideBufferCac he.pdf). When going through the page replacement algorithm that we