Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-21 Thread Robert Haas
On Mon, Apr 20, 2015 at 2:53 PM, Jim Nasby jim.na...@bluetreble.com wrote: I think that would help, but it still leaves user backends trying to advance the clock, which is quite painful. Has anyone tested running the clock in the background? We need a wiki page with all the ideas that have been

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Peter Geoghegan
On Tue, Apr 14, 2015 at 7:02 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Apr 14, 2015 at 6:22 PM, Peter Geoghegan p...@heroku.com wrote: Why is that good? We did discuss this before. I've recapped some of what I believe to be the most salient points below. I think that people were

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Merlin Moncure
On Mon, Apr 20, 2015 at 9:56 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Apr 15, 2015 at 5:00 PM, Martijn van Oosterhout klep...@svana.org wrote: I've been following this thread from the side with interest and got twigged by the point about loss of information. If you'd like better

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Robert Haas
On Wed, Apr 15, 2015 at 5:00 PM, Martijn van Oosterhout klep...@svana.org wrote: I've been following this thread from the side with interest and got twigged by the point about loss of information. If you'd like better information about relative ages, you can achieve this by raising the cap on

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Robert Haas
On Wed, Apr 15, 2015 at 5:06 PM, Greg Stark st...@mit.edu wrote: This is my point though (you're right that flushed isn't always the same as eviction but that's not the important point here). Right now we only demote when we consider buffers for eviction. But we promote when we pin buffers.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Robert Haas
On Mon, Apr 20, 2015 at 11:00 AM, Merlin Moncure mmonc...@gmail.com wrote: Hmm, interesting point. It's possible that we'd still have problems with everything maxing out at 32 on some workloads, but at least it'd be a little harder to max out at 32 than at 5. Do we have any reproducible test

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-20 Thread Jim Nasby
On 4/20/15 11:11 AM, Robert Haas wrote: On Wed, Apr 15, 2015 at 5:06 PM, Greg Stark st...@mit.edu wrote: This is my point though (you're right that flushed isn't always the same as eviction but that's not the important point here). Right now we only demote when we consider buffers for eviction.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-15 Thread Amit Kapila
On Wed, Apr 15, 2015 at 10:07 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Apr 15, 2015 at 12:15 AM, Amit Kapila amit.kapil...@gmail.com wrote: IIUC, this will allow us to increase usage count only when the buffer is touched by clocksweep to decrement the usage count. Yes. I

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-15 Thread Martijn van Oosterhout
On Wed, Apr 15, 2015 at 12:37:44AM -0400, Robert Haas wrote: I think such a solution will be good for the cases when many evictions need to be performed to satisfy the workload, OTOH when there are not too many evictions that need to be done, in such a case some of the buffers that are

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-15 Thread Greg Stark
On Wed, Apr 15, 2015 at 5:26 AM, Robert Haas robertmh...@gmail.com wrote: The way our cache works we promote when a buffer is accessed but we only demote when a buffer is flushed. We flush a lot less often than we touch buffers so it's not surprising that the cache ends up full of buffers that

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Robert Haas
On Tue, Apr 14, 2015 at 6:22 PM, Peter Geoghegan p...@heroku.com wrote: Why is that good? We did discuss this before. I've recapped some of what I believe to be the most salient points below. I think that people were all too quick to dismiss the idea of a wall time interval playing some role

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Amit Kapila
On Wed, Apr 15, 2015 at 2:55 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Apr 16, 2014 at 2:44 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: Anyways, I'm still curious if you can post similar numbers basing the throttling on gross allocation

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Robert Haas
On Tue, Apr 14, 2015 at 7:10 PM, Greg Stark st...@mit.edu wrote: The way the clock sweep algorithm is meant to be thought about is that it's an approximate lru. Each usage count corresponds to an ntile of the lru. So we don't know which buffer is least recently used but it must be in the set

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Robert Haas
On Wed, Apr 15, 2015 at 12:15 AM, Amit Kapila amit.kapil...@gmail.com wrote: IIUC, this will allow us to increase usage count only when the buffer is touched by clocksweep to decrement the usage count. Yes. I think such a solution will be good for the cases when many evictions need to be

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Jim Nasby
On 4/14/15 5:22 PM, Peter Geoghegan wrote: As long as we're doing random brainstorming, I'd suggest looking at making clocksweep actually approximate LRU-K/LRU-2 (which, again, to be clear, my prototype did not do). The clocksweep could maintain statistics about the recency of the second-to-last

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Peter Geoghegan
On Tue, Apr 14, 2015 at 2:25 PM, Robert Haas robertmh...@gmail.com wrote: So, I was thinking about this a little bit more today, prodded by my coworker John Gorman. I'm wondering if we could drive this off of the clock sweep; that is, every time the clock sweep touches a buffer, its usage

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Robert Haas
On Wed, Apr 16, 2014 at 2:44 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: Anyways, I'm still curious if you can post similar numbers basing the throttling on gross allocation counts instead of time. Meaning: some number of buffer allocations has to have

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2015-04-14 Thread Greg Stark
I've been meaning to write this since PGConf and now isn't a great time since I'm on my phone but I think it's time. The way the clock sweep algorithm is meant to be thought about is that it's an approximate lru. Each usage count corresponds to an ntile of the lru. So we don't know which buffer

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-05-01 Thread Kevin Grittner
Jim Nasby j...@nasby.net wrote: In our case this could maybe be handled by simply not incrementing counts when there's no eviction... but I'm more a fan of separate polls/clocks, because that means you can do things like a LFU for active and an LRU for inactive. I have hesitated to mention

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-05-01 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote: each connection caused to be held in cache the last page at each level of the index. Apologies for ambiguous terminology there. To be clear: the most recently accessed page at each level of the index. -- Kevin Grittner EDB: http://www.enterprisedb.com

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-28 Thread Robert Haas
On Fri, Apr 18, 2014 at 11:46 AM, Greg Stark st...@mit.edu wrote: On Fri, Apr 18, 2014 at 4:14 PM, Robert Haas robertmh...@gmail.com wrote: I am a bit confused by this remark. In *any* circumstance when you evict you're incurring precisely one page fault I/O when the page is read back in.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-28 Thread Robert Haas
On Mon, Apr 21, 2014 at 6:38 PM, Jim Nasby j...@nasby.net wrote: I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot i.e. high memory pressure workloads.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-28 Thread Peter Geoghegan
On Mon, Apr 28, 2014 at 6:02 AM, Robert Haas robertmh...@gmail.com wrote: Also true. But the problem is that it is very rarely, if ever, the case that all pages are *equally* hot. On a pgbench workload, for example, I'm very confident that while there's not really any cold data, the btree

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-28 Thread Peter Geoghegan
On Fri, Apr 25, 2014 at 10:45 AM, Peter Geoghegan p...@heroku.com wrote: I've now done a non-limited comparative benchmark of master against the patch (once again, with usage_count starting at 6, and BM_MAX_USAGE_COUNT at 30) with a Gaussian distribution. Once again, the distribution threshold

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-28 Thread Jim Nasby
On 4/28/14, 8:04 AM, Robert Haas wrote: On Mon, Apr 21, 2014 at 6:38 PM, Jim Nasby j...@nasby.net wrote: I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-25 Thread Peter Geoghegan
I've now done a non-limited comparative benchmark of master against the patch (once again, with usage_count starting at 6, and BM_MAX_USAGE_COUNT at 30) with a Gaussian distribution. Once again, the distribution threshold used was consistently 5.0, causing the patched pgbench to report for each

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-24 Thread Peter Geoghegan
On Mon, Apr 21, 2014 at 11:57 PM, Peter Geoghegan p...@heroku.com wrote: Here is a benchmark that is similar to my earlier one, but with a rate limit of 125 tps, to help us better characterize how the prototype patch helps performance:

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Peter Geoghegan
Here is a benchmark that is similar to my earlier one, but with a rate limit of 125 tps, to help us better characterize how the prototype patch helps performance: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/3-sec-delay-limit/ Again, these are 15 minute runs with unlogged tables

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Albe Laurenz
Jason Petersen wrote: Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday seems silly to me: it was something quick for the prototype. The problem with the clocksweeps is they don’t actually track the progression of “time” within the PostgreSQL system. Would it

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Atri Sharma
On Tue, Apr 22, 2014 at 12:59 PM, Albe Laurenz laurenz.a...@wien.gv.at wrote: Jason Petersen wrote: Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday seems silly to me: it was something quick for the prototype. The problem with the clocksweeps is they don’t

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Hannu Krosing
On 04/17/2014 10:39 PM, Andres Freund wrote: On 2014-04-17 13:33:27 -0700, Peter Geoghegan wrote: Just over 99.6% of pages (leaving aside the meta page) in the big 10 GB pgbench_accounts_pkey index are leaf pages. What is the depth of b-tree at this percentage ? Cheers Hannu -- Sent via

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Jim Nasby
On 4/21/14, 6:07 PM, David G Johnston wrote: Jim Nasby-2 wrote I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot i.e. high memory pressure workloads. Of

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Peter Geoghegan
On Tue, Apr 22, 2014 at 2:03 AM, Hannu Krosing ha...@krosing.net wrote: What is the depth of b-tree at this percentage ? Well, this percentage of B-Tree pages that are leaf pages doesn't have much to do with the depth. The percentage seems very consistent for each B-Tree, irrespective of the

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Jim Nasby
On 4/15/14, 1:15 PM, Peter Geoghegan wrote: On Tue, Apr 15, 2014 at 9:30 AM, Merlin Moncure mmonc...@gmail.com wrote: There are many reports of improvement from lowering shared_buffers. The problem is that it tends to show up on complex production workloads and that there is no clear evidence

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Jim Nasby
On 4/18/14, 2:51 PM, Atri Sharma wrote: I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot i.e. high memory pressure workloads. Of course, I may be

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Jim Nasby
On 4/16/14, 10:28 AM, Robert Haas wrote: Also, I think the scalability problems around buffer eviction are eminently solvable, and in particular I'm hopeful that Amit is going to succeed in solving them. Suppose we have a background process (whether the background writer or some other) that

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread David G Johnston
Jim Nasby-2 wrote I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot i.e. high memory pressure workloads. Of course, I may be totally wrong here.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Claudio Freire
On Mon, Apr 21, 2014 at 8:07 PM, David G Johnston david.g.johns...@gmail.com wrote: Jim Nasby-2 wrote I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Tom Lane
Jim Nasby j...@nasby.net writes: How *certain* are we that a single freelist lock (that actually ONLY protects the freelist) would be that big a deal? We used to have one. It was a big bottleneck --- and this was years ago, when the buffer manager was much less scalable than it is today.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Peter Geoghegan
On Mon, Apr 21, 2014 at 5:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: We used to have one. It was a big bottleneck --- and this was years ago, when the buffer manager was much less scalable than it is today. (IIRC, getting rid of a central lock was one of the main advantages of the current

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes: On Mon, Apr 21, 2014 at 5:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: We used to have one. It was a big bottleneck --- and this was years ago, when the buffer manager was much less scalable than it is today. (IIRC, getting rid of a central lock was one of

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Peter Geoghegan
On Mon, Apr 21, 2014 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: ARC *was* the predecessor algorithm. See commit 5d5087363. I believe that the main impetus for replacing ARC with clock sweep came from patent issues, though. It was a happy coincidence that clock sweep happened to be better

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes: On Mon, Apr 21, 2014 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: ARC *was* the predecessor algorithm. See commit 5d5087363. I believe that the main impetus for replacing ARC with clock sweep came from patent issues, though. That was one issue, but

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Peter Geoghegan
On Mon, Apr 21, 2014 at 6:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Did you read the commit message I pointed to? Yes. (See also 4e8af8d27.) Oh, I wasn't actually aware of the fact that 2Q made it into the tree. I thought that the first commit message you referred to just referenced on-list

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Peter Geoghegan
On Mon, Apr 21, 2014 at 5:59 PM, Peter Geoghegan p...@heroku.com wrote: LRU-K, and 2Q have roughly the same advantages. I'm reasonably confident you can have the best of both worlds, or something closer to it. Having said that, a big part of what I'd like to accomplish here is to address the

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-19 Thread Atri Sharma
On Sat, Apr 19, 2014 at 3:37 AM, Bruce Momjian br...@momjian.us wrote: One thing that I discussed with Merlin offline and am now concerned about is how will the actual eviction work. We cannot traverse the entire list and then find all the buffers with refcount 0 and then do another

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Atri Sharma
On Fri, Apr 18, 2014 at 7:27 AM, Peter Geoghegan p...@heroku.com wrote: A way I have in mind about eviction policy is to introduce a way to have an ageing factor in each buffer and take the ageing factor into consideration when evicting a buffer. Consider a case where a table is pretty huge and

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Robert Haas
On Thu, Apr 17, 2014 at 5:00 PM, Greg Stark st...@mit.edu wrote: On Thu, Apr 17, 2014 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote: Because all the usage counts are the same, the eviction at this point is completely indiscriminate. We're just as likely to kick out a btree root page or

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Greg Stark
On Fri, Apr 18, 2014 at 4:14 PM, Robert Haas robertmh...@gmail.com wrote: I am a bit confused by this remark. In *any* circumstance when you evict you're incurring precisely one page fault I/O when the page is read back in. That doesn't mean that the choice of which page to evict is

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Bruce Momjian
On Fri, Apr 18, 2014 at 04:46:31PM +0530, Atri Sharma wrote: This can be changed by introducing an ageing factor that sees how much time the current buffer has spent in shared buffers. If the time that the buffer has spent is large enough (relatively) and it is not hot currently, that means

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Atri Sharma
On Sat, Apr 19, 2014 at 1:07 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Apr 18, 2014 at 04:46:31PM +0530, Atri Sharma wrote: This can be changed by introducing an ageing factor that sees how much time the current buffer has spent in shared buffers. If the time that the buffer has

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Jason Petersen
On Apr 18, 2014, at 1:51 PM, Atri Sharma atri.j...@gmail.com wrote: Counting clock sweeps is an interesting idea. I think one concern was tracking hot buffers in cases where there is no memory pressure, and hence the clock sweep isn't running --- I am not sure how this would help in that

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Atri Sharma
Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday seems silly to me: it was something quick for the prototype. The problem with the clocksweeps is they don’t actually track the progression of “time” within the PostgreSQL system. What’s wrong with using a transaction

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Peter Geoghegan
On Fri, Apr 18, 2014 at 1:11 PM, Jason Petersen ja...@citusdata.com wrote: Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday seems silly to me: it was something quick for the prototype. The gettimeofday() call doesn't need to happen in a tight loop. It can be

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-18 Thread Bruce Momjian
On Sat, Apr 19, 2014 at 01:21:29AM +0530, Atri Sharma wrote: I feel that if there is no memory pressure, frankly it doesn't matter much about what gets out and what not. The case I am specifically targeting is when the clocksweep gets to move about a lot i.e. high memory pressure workloads. Of

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Greg Stark
On Wed, Apr 16, 2014 at 12:44 AM, Robert Haas robertmh...@gmail.com wrote: This isn't a fundamental property of the usage-count idea; it's an artifact of the fact that usage count decreases are tied to eviction pressure rather than access pressure. For example, suppose we made a rule that if

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Robert Haas
On Thu, Apr 17, 2014 at 9:40 AM, Greg Stark st...@mit.edu wrote: On Wed, Apr 16, 2014 at 12:44 AM, Robert Haas robertmh...@gmail.com wrote: This isn't a fundamental property of the usage-count idea; it's an artifact of the fact that usage count decreases are tied to eviction pressure rather

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Bruce Momjian
On Thu, Apr 17, 2014 at 10:18:43AM -0400, Robert Haas wrote: I also believe this to be the case on first principles and my own experiments. Suppose you have a workload that fits inside shared_buffers. All of the usage counts will converge to 5. Then, somebody accesses a table that is not

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Robert Haas
On Thu, Apr 17, 2014 at 10:32 AM, Bruce Momjian br...@momjian.us wrote: On Thu, Apr 17, 2014 at 10:18:43AM -0400, Robert Haas wrote: I also believe this to be the case on first principles and my own experiments. Suppose you have a workload that fits inside shared_buffers. All of the usage

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Bruce Momjian
On Thu, Apr 17, 2014 at 10:40:40AM -0400, Robert Haas wrote: On Thu, Apr 17, 2014 at 10:32 AM, Bruce Momjian br...@momjian.us wrote: On Thu, Apr 17, 2014 at 10:18:43AM -0400, Robert Haas wrote: I also believe this to be the case on first principles and my own experiments. Suppose you have

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Robert Haas
On Thu, Apr 17, 2014 at 10:48 AM, Bruce Momjian br...@momjian.us wrote: I understand now. If there is no memory pressure, every buffer gets the max usage count, and when a new buffer comes in, it isn't the max so it is swiftly removed until the clock sweep has time to decrement the old

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Andres Freund
On 2014-04-17 10:48:15 -0400, Bruce Momjian wrote: On Thu, Apr 17, 2014 at 10:40:40AM -0400, Robert Haas wrote: That can happen, but the real problem I was trying to get at is that when all the buffers get up to max usage count, they all appear equally important. But in reality they're

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Greg Stark
On Thu, Apr 17, 2014 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote: Because all the usage counts are the same, the eviction at this point is completely indiscriminate. We're just as likely to kick out a btree root page or a visibility map page as we are to kick out a random heap page,

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Greg Stark
On Tue, Apr 15, 2014 at 7:30 PM, Peter Geoghegan p...@heroku.com wrote: Frankly, there doesn't need to be any research on this, because it's just common sense that probabilistically, leaf pages are much more useful than heap pages in servicing index scan queries if we assume a uniform

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: several orders of magnitude more often. That's clearly bad. On systems that are not too heavily loaded it doesn't matter too much because we just fault the page right back in from the OS pagecache. Ehhh. No. If it's a hot page that we've been

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Greg Stark
On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost sfr...@snowman.net wrote: Ehhh. No. If it's a hot page that we've been holding in *our* cache long enough, the kernel will happily evict it as 'cold' from *its* cache, leading to... This is a whole nother problem. It is worrisome that we

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 9:21 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: several orders of magnitude more often. That's clearly bad. On systems that are not too heavily loaded it doesn't matter too much because we just fault the page right back in

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Greg Stark (st...@mit.edu) wrote: On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost sfr...@snowman.net wrote: Ehhh. No. If it's a hot page that we've been holding in *our* cache long enough, the kernel will happily evict it as 'cold' from *its* cache, leading to... This is a whole

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Andres Freund
On 2014-04-17 21:44:47 +0300, Heikki Linnakangas wrote: On 04/17/2014 09:38 PM, Stephen Frost wrote: * Greg Stark (st...@mit.edu) wrote: On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost sfr...@snowman.net wrote: Ehhh. No. If it's a hot page that we've been holding in *our* cache long enough,

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote: Note that if we somehow come up with a page replacement algorithm that tends to evict pages that are in the OS cache, we have effectively solved the double buffering problem. When a page is cached in both caches, evicting it from one of them

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Heikki Linnakangas
On 04/17/2014 09:38 PM, Stephen Frost wrote: * Greg Stark (st...@mit.edu) wrote: On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost sfr...@snowman.net wrote: Ehhh. No. If it's a hot page that we've been holding in *our* cache long enough, the kernel will happily evict it as 'cold' from *its*

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Merlin Moncure
On Thu, Apr 17, 2014 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-04-17 21:44:47 +0300, Heikki Linnakangas wrote: On 04/17/2014 09:38 PM, Stephen Frost wrote: * Greg Stark (st...@mit.edu) wrote: On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost sfr...@snowman.net wrote: Ehhh.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 11:53 AM, Merlin Moncure mmonc...@gmail.com wrote: No. but if you were very judicious, maybe you could hint the o/s (posix_fadvise) about pages that are likely to stay hot that you don't need them. Mitsumasa KONDO wrote a patch like that. I don't think the results were

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Merlin Moncure (mmonc...@gmail.com) wrote: I doubt that's necessary though -- if the postgres caching algorithm improves such that there is a better tendency for hot pages to stay in s_b, Eventually the O/S will deschedule the page for something else that needs it. In other words,

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Merlin Moncure
On Thu, Apr 17, 2014 at 2:00 PM, Stephen Frost sfr...@snowman.net wrote: * Merlin Moncure (mmonc...@gmail.com) wrote: I doubt that's necessary though -- if the postgres caching algorithm improves such that there is a better tendency for hot pages to stay in s_b, Eventually the O/S will

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes: I wonder if it would help to actually tell the OS to read in buffers that we're *evicting*... On the general notion that if the OS already has them buffered then it's almost a no-op, and if it doesn't and it's actually a 'hot' buffer that we're gonna

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Merlin Moncure (mmonc...@gmail.com) wrote: I don't think this would work unless we would keep some kind of tracking information on the page itself which seems not worth a write operation to do (maybe if the page is dirtied it could be snuck in there though...). IOW, it would only make

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I wonder if it would help to actually tell the OS to read in buffers that we're *evicting*... On the general notion that if the OS already has them buffered then it's almost a no-op, and if it doesn't and it's

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Merlin Moncure
On Thu, Apr 17, 2014 at 2:16 PM, Stephen Frost sfr...@snowman.net wrote: * Merlin Moncure (mmonc...@gmail.com) wrote: I don't think this would work unless we would keep some kind of tracking information on the page itself which seems not worth a write operation to do (maybe if the page is

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
On Thursday, April 17, 2014, Merlin Moncure mmonc...@gmail.com wrote: yeah -- the thing is, we are already too spendy already on supplemental write i/o (hint bits, visible bits, freezing, etc) and likely not worth it to throw something else on the pile unless the page is already dirty; the

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Merlin Moncure
On Thu, Apr 17, 2014 at 2:28 PM, Stephen Frost sfr...@snowman.net wrote: On Thursday, April 17, 2014, Merlin Moncure mmonc...@gmail.com wrote: yeah -- the thing is, we are already too spendy already on supplemental write i/o (hint bits, visible bits, freezing, etc) and likely not worth it

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Stephen Frost
On Thursday, April 17, 2014, Merlin Moncure mmonc...@gmail.com wrote: no -- I got you. My point was, that's a pure guess unless you base it on evidence recorded on the page itself. Without that evidence, (which requires writing) the operating system is in a better place to make that guess so it's

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 8:10 AM, Greg Stark st...@mit.edu wrote: I don't think common sense is compelling. I think you need to pin down exactly what it is about btree intermediate pages that the LRU isn't capturing and not just argue they're more useful. The LRU is already capturing which

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Andres Freund
On 2014-04-17 13:33:27 -0700, Peter Geoghegan wrote: Just over 99.6% of pages (leaving aside the meta page) in the big 10 GB pgbench_accounts_pkey index are leaf pages. That's a rather nice number. I knew it was big, but I'd have guessed it'd be a percent lower. Do you happen to have the same

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 1:39 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-04-17 13:33:27 -0700, Peter Geoghegan wrote: Just over 99.6% of pages (leaving aside the meta page) in the big 10 GB pgbench_accounts_pkey index are leaf pages. That's a rather nice number. I knew it was big,

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 1:33 PM, Peter Geoghegan p...@heroku.com wrote: I can't imagine that this is much of a problem in practice. Although I will add that not caching highly useful inner pages for the medium term, because that index isn't being used at all for 5 minutes probably is very bad.

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Greg Stark
On Thu, Apr 17, 2014 at 4:48 PM, Peter Geoghegan p...@heroku.com wrote: Although I will add that not caching highly useful inner pages for the medium term, because that index isn't being used at all for 5 minutes probably is very bad. Using the 4,828 buffers that it would take to store all the

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-17 Thread Peter Geoghegan
On Thu, Apr 17, 2014 at 6:50 PM, Greg Stark st...@mit.edu wrote: On Thu, Apr 17, 2014 at 4:48 PM, Peter Geoghegan p...@heroku.com wrote: Although I will add that not caching highly useful inner pages for the medium term, because that index isn't being used at all for 5 minutes probably is very

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Peter Geoghegan
On Tue, Apr 15, 2014 at 9:44 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 14, 2014 at 1:11 PM, Peter Geoghegan p...@heroku.com wrote: In the past, various hackers have noted problems they've observed with this scheme. A common pathology is to see frantic searching for a victim

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Andres Freund
Hi, It's good to see focus on this - some improvements around s_b are sorely needed. On 2014-04-14 10:11:53 -0700, Peter Geoghegan wrote: 1) Throttles incrementation of usage_count temporally. It becomes impossible to increment usage_count for any given buffer more frequently than every 3

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Peter Geoghegan
On Wed, Apr 16, 2014 at 12:53 AM, Andres Freund and...@2ndquadrant.com wrote: I think this is unfortunately completely out of the question. For one, a gettimeofday() for every buffer pin will become a significant performance problem. Even the computation of the xact/stm start/stop timestamps shows

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Andres Freund
On 2014-04-16 01:58:23 -0700, Peter Geoghegan wrote: On Wed, Apr 16, 2014 at 12:53 AM, Andres Freund and...@2ndquadrant.com wrote: I think this is unfortunately completely out of the question. For one, a gettimeofday() for every buffer pin will become a significant performance problem. Even the

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Peter Geoghegan
On Wed, Apr 16, 2014 at 2:18 AM, Andres Freund and...@2ndquadrant.com wrote: *I* don't think any scheme that involves measuring the time around buffer pins is going to be acceptable. It's better that I say that now rather than when you've invested significant time into the approach, no? Well,

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Andres Freund
Hi, On 2014-04-16 02:57:54 -0700, Peter Geoghegan wrote: Why should I be the one with all the answers? Who said you need to be? The only thing I am saying is that I don't agree with some of your suggestions? I only responded to the thread now because downthread (in

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Robert Haas
On Wed, Apr 16, 2014 at 3:22 AM, Peter Geoghegan p...@heroku.com wrote: It's possible that I've misunderstood what you mean here, but do you really think it's likely that everything will be hot, in the event of using something like what I've sketched here? I think it's an important measure

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Ants Aasma
On Wed, Apr 16, 2014 at 7:44 AM, Robert Haas robertmh...@gmail.com wrote: I think that the basic problem here is that usage counts increase when buffers are referenced, but they decrease when buffers are evicted, and those two things are not in any necessary way connected to each other. In
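For readers skimming the thread, the mechanism under discussion can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not PostgreSQL's actual buffer manager: the names `Buffer`, `ClockSweepPool`, `reference`, and `find_victim` are hypothetical, and the cap of 5 on the usage count is an assumption chosen for the illustration. It shows the asymmetry noted above: counts go up whenever a buffer is referenced, but they only come down when some caller sweeps the clock hand looking for a victim.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_USAGE_COUNT = 5  # assumed cap for this sketch


@dataclass
class Buffer:
    page: Optional[int] = None   # page currently held, if any
    usage_count: int = 0         # bumped on reference, decayed by the sweep
    pinned: bool = False         # pinned buffers are never chosen as victims


class ClockSweepPool:
    """Toy clock-sweep replacement over a fixed array of buffers."""

    def __init__(self, nbuffers: int) -> None:
        self.buffers: List[Buffer] = [Buffer() for _ in range(nbuffers)]
        self.hand = 0  # index the clock hand will examine next

    def reference(self, idx: int) -> None:
        # Promotion: happens on every access, independent of any eviction.
        buf = self.buffers[idx]
        buf.usage_count = min(buf.usage_count + 1, MAX_USAGE_COUNT)

    def find_victim(self) -> int:
        # Demotion: happens only while hunting for a buffer to evict.
        n = len(self.buffers)
        # After MAX_USAGE_COUNT full passes every unpinned buffer has
        # decayed to zero, so one further pass must find a victim.
        for _ in range(n * (MAX_USAGE_COUNT + 1)):
            idx = self.hand
            self.hand = (self.hand + 1) % n
            buf = self.buffers[idx]
            if buf.pinned:
                continue
            if buf.usage_count == 0:
                return idx
            buf.usage_count -= 1
        raise RuntimeError("no unpinned buffer available")
```

In this model, a workload that references buffers often but rarely needs a victim drives every count to the cap, at which point the counts no longer distinguish hot pages from merely warm ones.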

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Merlin Moncure
On Tue, Apr 15, 2014 at 11:27 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Wed, Apr 16, 2014 at 5:00 AM, Peter Geoghegan p...@heroku.com wrote: On Tue, Apr 15, 2014 at 3:59 PM, Ants Aasma a...@cybertec.at wrote: There's a paper on a non blocking GCLOCK algorithm, that does lock free clock

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Andres Freund
On 2014-04-16 07:55:44 -0500, Merlin Moncure wrote: 1. Bgwriter needs to be improved so that it can help in reducing usage count and finding next victim buffer (run the clock sweep and add buffers to the free list). 2. SetLatch for bgwriter (wakeup bgwriter) when elements in

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Merlin Moncure
On Tue, Apr 15, 2014 at 11:44 PM, Robert Haas robertmh...@gmail.com wrote: I think that the basic problem here is that usage counts increase when buffers are referenced, but they decrease when buffers are evicted, and those two things are not in any necessary way connected to each other. In

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-16 Thread Andres Freund
On 2014-04-16 08:25:23 -0500, Merlin Moncure wrote: The downside of this approach was complexity and difficult to test for edge case complexity. I would like to point out though that while i/o efficiency gains are nice, I think contention issues are the bigger fish to fry. That's my feeling
