Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-14 Thread Gregory Stark
PFC [EMAIL PROTECTED] writes: Anyway, seq-scan on InnoDB is very slow because, as the btree grows (just like postgres indexes) pages are split and scanning the pages in btree order becomes a mess of seeks. So, seq scan in InnoDB is very very slow unless periodic OPTIMIZE TABLE is applied.

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-13 Thread Jim C. Nasby
On Sun, Jun 10, 2007 at 08:49:24PM +0100, Heikki Linnakangas wrote: Jim C. Nasby wrote: On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all,

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-13 Thread Florian G. Pflug
Heikki Linnakangas wrote: Jim C. Nasby wrote: On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-13 Thread PFC
If we extended relations by more than 8k at a time, we would know a lot more about disk layout, at least on filesystems with a decent amount of free space. I doubt it makes that much difference. If there was a significant amount of fragmentation, we'd hear more complaints about seq scan

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread ITAGAKI Takahiro
Heikki Linnakangas [EMAIL PROTECTED] wrote: True. On the other hand, if we issue writes in essentially random order, we might fill the kernel buffers with random blocks and the kernel needs to flush them to disk as almost random I/O. If we did the writes in groups, the kernel has better

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread Greg Smith
On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote: If the kernel can treat sequential writes better than random writes, is it worth sorting dirty buffers in block order per file at the start of checkpoints? I think it has the potential to improve things. There are three obvious and one subtle
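
The idea raised above, sorting dirty buffers in block order per file before the checkpoint writes them, can be sketched roughly as follows. This is only an illustration, not code from the patch or from PostgreSQL: `DirtyBuffer`, `checkpoint_write_order`, and `group_by_file` are hypothetical names, and a real buffer would be identified by more than a file id and a block number.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for a dirty shared buffer: which file it belongs to
# and which block of that file it holds.
DirtyBuffer = namedtuple("DirtyBuffer", ["file_id", "block_no"])


def checkpoint_write_order(dirty_buffers):
    """Return the buffers sorted by file, then by block number within each file."""
    return sorted(dirty_buffers, key=lambda b: (b.file_id, b.block_no))


def group_by_file(ordered):
    """Collapse an already-sorted buffer list into (file_id, [block_no, ...]) runs."""
    return [
        (file_id, [b.block_no for b in bufs])
        for file_id, bufs in groupby(ordered, key=lambda b: b.file_id)
    ]


# Buffers as they might be found in the pool: essentially random order.
dirty = [DirtyBuffer(2, 7), DirtyBuffer(1, 42), DirtyBuffer(2, 3), DirtyBuffer(1, 5)]
ordered = checkpoint_write_order(dirty)
runs = group_by_file(ordered)
```

With the writes issued in this order, the kernel sees ascending block numbers for one file at a time. Whether that actually helps depends on the kernel, filesystem, and storage, which is the open question in the thread.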

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread Heikki Linnakangas
ITAGAKI Takahiro wrote: Heikki Linnakangas [EMAIL PROTECTED] wrote: True. On the other hand, if we issue writes in essentially random order, we might fill the kernel buffers with random blocks and the kernel needs to flush them to disk as almost random I/O. If we did the writes in groups,

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-10 Thread Heikki Linnakangas
Jim C. Nasby wrote: On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical artifact of the fact that we

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-09 Thread Jim C. Nasby
On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical artifact of the fact that we used to use the

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Heikki Linnakangas
Greg Smith wrote: On Thu, 7 Jun 2007, Heikki Linnakangas wrote: So there's two extreme ways you can use LDC: 1. Finish the checkpoint as soon as possible, without disturbing other activity too much 2. Disturb other activity as little as possible, as long as the checkpoint finishes in a

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Andrew Sullivan
On Fri, Jun 08, 2007 at 09:50:49AM +0100, Heikki Linnakangas wrote: dynamics change. But we must also keep in mind that the average DBA doesn't change any settings, and might not even be able or allowed to. That means the defaults should work reasonably well without tweaking the OS settings.

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Greg Smith
On Fri, 8 Jun 2007, Andrew Sullivan wrote: Do you mean change the OS settings or something else? (I'm not sure it's true in any case, because shared memory kernel settings have to be fiddled with in many instances, but I thought I'd ask for clarification.) In a situation where a hosting

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Heikki Linnakangas
Andrew Sullivan wrote: On Fri, Jun 08, 2007 at 09:50:49AM +0100, Heikki Linnakangas wrote: dynamics change. But we must also keep in mind that the average DBA doesn't change any settings, and might not even be able or allowed to. That means the defaults should work reasonably well without

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Andrew Sullivan
On Fri, Jun 08, 2007 at 10:33:50AM -0400, Greg Smith wrote: they'd take care of that as part of routine server setup. What wouldn't be reasonable is to expect them to tune obscure parts of the kernel just for your application. Well, I suppose it'd depend on what kind of hosting environment

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Bruce Momjian
Andrew Sullivan wrote: On Fri, Jun 08, 2007 at 10:33:50AM -0400, Greg Smith wrote: they'd take care of that as part of routine server setup. What wouldn't be reasonable is to expect them to tune obscure parts of the kernel just for your application. Well, I suppose it'd depend on what

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Heikki Linnakangas
Greg Smith wrote: On Wed, 6 Jun 2007, Heikki Linnakangas wrote: The original patch uses bgwriter_all_max_pages to set the minimum rate. I think we should have a separate variable, checkpoint_write_min_rate, in KB/s, instead. Completely agreed. There shouldn't be any coupling with the

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Hannu Krosing
One fine day, Wed, 2007-06-06 at 11:03, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: GUC summary and suggested default values checkpoint_write_percent = 50 # % of checkpoint interval to spread out writes

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Heikki Linnakangas
Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical artifact of the fact that we used to use the system-wide sync call instead of fsyncs to flush the pages to disk. That might not be the best way to do things
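
To make the contrast concrete, here is a minimal sketch of the "write all, then fsync all" ordering next to one possible alternative that flushes each file right after writing it. The snippet above is cut off before any alternative is described, so the second function is an assumption made for illustration, not the proposal from the thread. Both function names are hypothetical, and the payload is assumed small enough that a single `os.write` call writes all of it.

```python
import os
import tempfile


def _open_for_write(path):
    """Open (or create and truncate) a file and return its descriptor."""
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)


def write_all_then_fsync_all(paths, payload):
    """Issue every write first, then fsync every file in a second pass."""
    fds = [_open_for_write(p) for p in paths]
    try:
        for fd in fds:
            os.write(fd, payload)   # data may still sit in the OS cache here
        for fd in fds:
            os.fsync(fd)            # force each file's data to stable storage
    finally:
        for fd in fds:
            os.close(fd)


def write_and_fsync_per_file(paths, payload):
    """Assumed alternative: flush each file immediately after writing it."""
    for p in paths:
        fd = _open_for_write(p)
        try:
            os.write(fd, payload)
            os.fsync(fd)
        finally:
            os.close(fd)


# Usage: both orderings leave identical file contents; only the timing of
# the flushes differs.
with tempfile.TemporaryDirectory() as tmp:
    batch = [os.path.join(tmp, "a%d" % i) for i in range(3)]
    single = [os.path.join(tmp, "b%d" % i) for i in range(3)]
    write_all_then_fsync_all(batch, b"page")
    write_and_fsync_per_file(single, b"page")
    batch_contents = [open(p, "rb").read() for p in batch]
    single_contents = [open(p, "rb").read() for p in single]
```

The end state on disk is the same either way. What changes is how much dirty data has accumulated in the OS cache when each fsync arrives, which is what the replies in this thread argue about.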

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical artifact of the fact that we used to use the system-wide sync call instead of fsyncs to flush the pages to

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Thinking about this whole idea a bit more, it occurred to me that the current approach to write all, then fsync all is really a historical artifact of the fact that we used to use the system-wide sync call instead of fsyncs to flush

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: I don't think it's a historical artifact at all: it's a valid reflection of the fact that we don't know enough about disk layout to do low-level I/O scheduling. Issuing more fsyncs than necessary will do little except guarantee a

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: I don't think it's a historical artifact at all: it's a valid reflection of the fact that we don't know enough about disk layout to do low-level I/O scheduling. Issuing more fsyncs than necessary will do little

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Greg Smith
On Thu, 7 Jun 2007, Heikki Linnakangas wrote: So there's two extreme ways you can use LDC: 1. Finish the checkpoint as soon as possible, without disturbing other activity too much 2. Disturb other activity as little as possible, as long as the checkpoint finishes in a reasonable time. Are

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Gregory Stark
Greg Smith [EMAIL PROTECTED] writes: I'm completely biased because of the workloads I've been dealing with recently, but I consider (2) so much easier to tune for that it's barely worth worrying about. If your system is so underloaded that you can let the checkpoints take their own sweet

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Greg Smith
On Thu, 7 Jun 2007, Gregory Stark wrote: You seem to have imagined that letting the checkpoint take longer will slow down transactions. And you seem to have imagined that I have so much spare time that I'm just making stuff up to entertain myself and sow confusion. I observed some

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-07 Thread Joshua D. Drake
This is really a serious issue with the current design of the database, one that merely changes instead of going away completely if you throw more hardware at it. I'm perversely glad to hear this is torturing more people than just me as it improves the odds the situation will improve. It

[HACKERS] Controlling Load Distributed Checkpoints

2007-06-06 Thread Heikki Linnakangas
I'm again looking at the way the GUC variables work in the load distributed checkpoints patch. We've discussed them a lot already, but I still don't think they're quite right. Write-phase --- I like the way the write-phase is controlled in general. Writes are throttled so that we spend the

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-06 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes: GUC summary and suggested default values checkpoint_write_percent = 50 # % of checkpoint interval to spread out writes checkpoint_write_min_rate = 1000 # minimum I/O rate to write
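
As a rough worked example of how the two suggested settings might interact, the sketch below spreads the checkpoint's writes over `checkpoint_write_percent` of the checkpoint interval, but never writes slower than `checkpoint_write_min_rate`. This reading is an assumption: the formula, the function name `checkpoint_write_rate_kb_s`, and the sample figures are illustrative only, and the KB/s unit for the minimum rate is taken from other messages in this thread.

```python
def checkpoint_write_rate_kb_s(dirty_kb, interval_s,
                               write_percent=50, min_rate_kb_s=1000):
    """Estimate the write rate (KB/s) needed to pace a checkpoint.

    dirty_kb       amount of dirty data the checkpoint must write, in KB
    interval_s     time between checkpoint starts, in seconds
    write_percent  share of the interval over which to spread the writes
    min_rate_kb_s  floor on the write rate, in KB/s
    """
    if not 0 < write_percent <= 100:
        raise ValueError("write_percent must be in (0, 100]")
    window_s = interval_s * write_percent / 100.0   # time budget for writes
    paced_rate = dirty_kb / window_s                # rate that just fits the budget
    return max(paced_rate, min_rate_kb_s)           # never go below the floor


# Hypothetical numbers: a 300 s interval with the suggested defaults.
busy = checkpoint_write_rate_kb_s(dirty_kb=600000, interval_s=300)
idle = checkpoint_write_rate_kb_s(dirty_kb=30000, interval_s=300)
```

In the first case the writes are paced to fill half of the interval. In the second the floor takes over, so the write phase finishes well before the 50% mark.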

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-06 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: GUC summary and suggested default values checkpoint_write_percent = 50 # % of checkpoint interval to spread out writes checkpoint_write_min_rate = 1000 # minimum I/O rate to write

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-06 Thread Greg Smith
On Wed, 6 Jun 2007, Tom Lane wrote: If we don't know how to tune them, how will the users know? I can tell you a good starting set for them on a Linux system, but you first have to let me know how much memory is in the OS buffer cache, the typical I/O rate the disks can support, how many

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-06 Thread Greg Smith
On Wed, 6 Jun 2007, Heikki Linnakangas wrote: The original patch uses bgwriter_all_max_pages to set the minimum rate. I think we should have a separate variable, checkpoint_write_min_rate, in KB/s, instead. Completely agreed. There shouldn't be any coupling with the background writer