Re: [HACKERS] Spreading full-page writes

2014-06-02 Thread Amit Kapila
On Mon, Jun 2, 2014 at 6:04 PM, Fujii Masao wrote:
> On Wed, May 28, 2014 at 1:10 PM, Amit Kapila wrote:
>
> > IIUC in DBW mechanism, we need to have a temporary sequential
> > log file of fixed size which will be used to write data before the data
> > gets written to its actual location in table

Re: [HACKERS] Spreading full-page writes

2014-06-02 Thread Fujii Masao
On Wed, May 28, 2014 at 1:10 PM, Amit Kapila wrote:
> On Tue, May 27, 2014 at 1:19 PM, Fujii Masao wrote:
>> On Tue, May 27, 2014 at 3:57 PM, Simon Riggs wrote:
>> > The requirements we were discussing were around
>> >
>> > A) reducing WAL volume
>> > B) reducing foreground overhead of writin

Re: [HACKERS] Spreading full-page writes

2014-05-29 Thread Robert Haas
On Tue, May 27, 2014 at 8:15 AM, Heikki Linnakangas wrote:
> Since you will be flushing the buffers one "redo partition" at a time, you
> would want to allow the OS to merge the writes within a partition as much
> as possible. So my even-odd split would in fact be pretty bad. Some sort of
> str

Re: [HACKERS] Spreading full-page writes

2014-05-28 Thread Heikki Linnakangas
On 05/28/2014 09:41 AM, Simon Riggs wrote:
> On 27 May 2014 13:20, Heikki Linnakangas wrote:
>> On 05/27/2014 03:18 PM, Simon Riggs wrote:
>>> IIRC Koichi had a patch for prefetch during recovery. Heikki, is that
>>> the reason you also discussed changing the WAL record format to allow
>>> us to identify the

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Simon Riggs
On 27 May 2014 18:18, Jeff Janes wrote:
> On Mon, May 26, 2014 at 8:15 PM, Robert Haas wrote:
>>
>> On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
>> >>I don't think we know that. The server might have crashed before that
>> >>second record got generated. (This appears to be an u

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Simon Riggs
On 27 May 2014 13:20, Heikki Linnakangas wrote:
> On 05/27/2014 03:18 PM, Simon Riggs wrote:
>>
>> IIRC Koichi had a patch for prefetch during recovery. Heikki, is that
>> the reason you also discussed changing the WAL record format to allow
>> us to identify the blocks touched by recovery more ea

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Amit Kapila
On Tue, May 27, 2014 at 1:19 PM, Fujii Masao wrote:
> On Tue, May 27, 2014 at 3:57 PM, Simon Riggs wrote:
> > The requirements we were discussing were around
> >
> > A) reducing WAL volume
> > B) reducing foreground overhead of writing FPWs - which spikes badly
> > after checkpoint and the overhe

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Jeff Janes
On Mon, May 26, 2014 at 8:15 PM, Robert Haas wrote:
> On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
> >>I don't think we know that. The server might have crashed before that
> >>second record got generated. (This appears to be an unfixable flaw in
> >>this proposal.)
> >
> > The

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Simon Riggs
On 27 May 2014 07:42, Greg Stark wrote:
> On Tue, May 27, 2014 at 10:07 AM, Heikki Linnakangas wrote:
>>
>> On 05/26/2014 02:26 PM, Greg Stark wrote:
>>> Another idea would be to have separate checkpoints for each buffer
>>> partition. You would have to start recovery from the oldest check

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Heikki Linnakangas
On 05/27/2014 03:18 PM, Simon Riggs wrote:
> IIRC Koichi had a patch for prefetch during recovery. Heikki, is that
> the reason you also discussed changing the WAL record format to allow
> us to identify the blocks touched by recovery more easily?
Yeah, that was one use case I had in mind for the WAL

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Simon Riggs
On 27 May 2014 03:49, Fujii Masao wrote:
>> So that gives us a few approaches
>>
>> * Compressing FPWs gives A
>> * Background FPWs gives us B
>>   which look like we can combine both ideas
>>
>> * Double-buffering would give us A and B, but not C
>>   and would be incompatible with other two i

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Heikki Linnakangas
On 05/27/2014 02:42 PM, Greg Stark wrote:
> On Tue, May 27, 2014 at 10:07 AM, Heikki Linnakangas wrote:
>> On 05/26/2014 02:26 PM, Greg Stark wrote:
>>> Another idea would be to have separate checkpoints for each buffer
>>> partition. You would have to start recovery from the oldest checkpoint of any o

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Greg Stark
On Tue, May 27, 2014 at 10:07 AM, Heikki Linnakangas wrote:
>
> On 05/26/2014 02:26 PM, Greg Stark wrote:
>>
>> Another idea would be to have separate checkpoints for each buffer
>> partition. You would have to start recovery from the oldest checkpoint of
>> any of the partitions.
>
> Yeah. Simon

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Heikki Linnakangas
On 05/26/2014 02:26 PM, Greg Stark wrote:
> On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
>> The second record is generated before the checkpoint is finished and the
>> checkpoint record is written. So it will be there.
>>
>> (if you crash before the checkpoint is finished, the in-progress c

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Heikki Linnakangas
On 05/26/2014 11:15 PM, Robert Haas wrote:
> On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
>>> I don't think we know that. The server might have crashed before that
>>> second record got generated. (This appears to be an unfixable flaw in
>>> this proposal.)
>>
>> The second record is generated bef

Re: [HACKERS] Spreading full-page writes

2014-05-27 Thread Fujii Masao
On Tue, May 27, 2014 at 3:57 PM, Simon Riggs wrote:
> On 25 May 2014 17:52, Heikki Linnakangas wrote:
>
>> Here's an idea I tried to explain to Andres and Simon at the pub last night,
>> on how to reduce the spikes in the amount of WAL written at beginning of a
>> checkpoint that full-page writes

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Simon Riggs
On 25 May 2014 17:52, Heikki Linnakangas wrote:
> Here's an idea I tried to explain to Andres and Simon at the pub last night,
> on how to reduce the spikes in the amount of WAL written at beginning of a
> checkpoint that full-page writes cause. I'm just writing this down for the
> sake of the ar

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Robert Haas
On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
>>I don't think we know that. The server might have crashed before that
>>second record got generated. (This appears to be an unfixable flaw in
>>this proposal.)
>
> The second record is generated before the checkpoint is finished and the

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Greg Stark
On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas wrote:
> The second record is generated before the checkpoint is finished and the
> checkpoint record is written. So it will be there.
>
> (if you crash before the checkpoint is finished, the in-progress
> checkpoint is no good for recovery any

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Heikki Linnakangas
On 26 May 2014 20:16:33 EEST, Robert Haas wrote:
>On May 25, 2014, at 5:52 PM, Heikki Linnakangas wrote:
>> Here's how this works out during replay:
>>
>> a) You start WAL replay from the latest checkpoint's Redo-pointer.
>>
>> When you see a WAL record that's been marked with XLR_FPW_SKIPPED,

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Robert Haas
On May 25, 2014, at 5:52 PM, Heikki Linnakangas wrote:
> Here's how this works out during replay:
>
> a) You start WAL replay from the latest checkpoint's Redo-pointer.
>
> When you see a WAL record that's been marked with XLR_FPW_SKIPPED, don't
> replay that record at all. It's OK because we k

Re: [HACKERS] Spreading full-page writes

2014-05-26 Thread Fujii Masao
On Mon, May 26, 2014 at 6:52 AM, Heikki Linnakangas wrote:
> Here's an idea I tried to explain to Andres and Simon at the pub last night,
> on how to reduce the spikes in the amount of WAL written at beginning of a
> checkpoint that full-page writes cause. I'm just writing this down for the
> sake