On Wed, Apr 17, 2013 at 10:11 PM, Amit Kapila amit.kap...@huawei.com wrote:
On Wednesday, April 17, 2013 4:19 PM Florian Pflug wrote:
On Apr17, 2013, at 12:22 , Amit Kapila amit.kap...@huawei.com wrote:
Do you mean to say that as an error has occurred, so it would not be able to
flush
On Wed, Apr 17, 2013 at 7:49 PM, Florian Pflug f...@phlo.org wrote:
On Apr17, 2013, at 12:22 , Amit Kapila amit.kap...@huawei.com wrote:
Do you mean to say that as an error has occurred, so it would not be able to
flush received WAL, which could result in loss of WAL?
I think even if error
On Wed, Apr 17, 2013 at 12:49:10PM +0200, Florian Pflug wrote:
Fixing this on the receive side alone seems quite messy and fragile.
So instead, I think we should let the master send a shutdown message
after it has sent everything it wants to send, and wait for the client
to acknowledge it
On Apr19, 2013, at 14:46 , Martijn van Oosterhout klep...@svana.org wrote:
On Wed, Apr 17, 2013 at 12:49:10PM +0200, Florian Pflug wrote:
Fixing this on the receive side alone seems quite messy and fragile.
So instead, I think we should let the master send a shutdown message
after it has sent
On Monday, April 15, 2013 1:02 PM Florian Pflug wrote:
On Apr14, 2013, at 17:56 , Fujii Masao masao.fu...@gmail.com wrote:
At fast shutdown, after walsender sends the checkpoint record and
closes the replication connection, walreceiver can detect the close
of connection before receiving all
On Apr17, 2013, at 12:22 , Amit Kapila amit.kap...@huawei.com wrote:
Do you mean to say that as an error has occurred, so it would not be able to
flush received WAL, which could result in loss of WAL?
I think even if error occurs, it will call flush in WalRcvDie(), before
terminating
On Wednesday, April 17, 2013 4:19 PM Florian Pflug wrote:
On Apr17, 2013, at 12:22 , Amit Kapila amit.kap...@huawei.com wrote:
Do you mean to say that as an error has occurred, so it would not be able to
flush received WAL, which could result in loss of WAL?
I think even if error occurs,
On Apr14, 2013, at 17:56 , Fujii Masao masao.fu...@gmail.com wrote:
At fast shutdown, after walsender sends the checkpoint record and
closes the replication connection, walreceiver can detect the close
of connection before receiving all WAL records. This means that,
even if walsender sends all
On Fri, Apr 12, 2013 at 5:53 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
On 04/11/2013 07:29 PM, Fujii Masao wrote:
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
You just shut down the old master and let the standby catch
up (takes a few microseconds ;) )
On Fri, Apr 12, 2013 at 7:57 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2013-04-12 02:29:01 +0900, Fujii Masao wrote:
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
You just shut down the old master and let the standby catch
up (takes a few
On 04/14/2013 05:56 PM, Fujii Masao wrote:
On Fri, Apr 12, 2013 at 7:57 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2013-04-12 02:29:01 +0900, Fujii Masao wrote:
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
You just shut down the old master and let the
On 04/11/2013 07:29 PM, Fujii Masao wrote:
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
You just shut down the old master and let the standby catch
up (takes a few microseconds ;) ) before you promote it.
After this you can start up the former master with
On 2013-04-12 02:29:01 +0900, Fujii Masao wrote:
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
You just shut down the old master and let the standby catch
up (takes a few microseconds ;) ) before you promote it.
After this you can start up the former
On 2013-04-12 11:18:01 +0530, Pavan Deolasee wrote:
On Thu, Apr 11, 2013 at 8:39 PM, Ants Aasma a...@cybertec.at wrote:
On Thu, Apr 11, 2013 at 5:33 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
On 04/11/2013 03:52 PM, Ants Aasma wrote:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu
On Fri, Apr 12, 2013 at 4:29 PM, Andres Freund and...@2ndquadrant.com wrote:
I don't think that holds true at all. If you look at pg_stat_bgwriter in
any remotely busy cluster with a hot data set over shared_buffers you'll
notice that a large percentage of writes will have been done by
On 2013-04-12 16:58:44 +0530, Pavan Deolasee wrote:
On Fri, Apr 12, 2013 at 4:29 PM, Andres Freund and...@2ndquadrant.com wrote:
I don't think that holds true at all. If you look at pg_stat_bgwriter in
any remotely busy cluster with a hot data set over shared_buffers you'll
notice that
On Wednesday, April 10, 2013 10:31 PM Fujii Masao wrote:
On Thu, Apr 11, 2013 at 1:44 AM, Shaun Thomas
stho...@optionshouse.com wrote:
On 04/10/2013 11:40 AM, Fujii Masao wrote:
Strange. If this is really true, shared disk failover solution is
fundamentally broken because the standby
On Thu, Apr 11, 2013 at 10:09 AM, Amit Kapila amit.kap...@huawei.com wrote:
Consider the case where old-master crashed while flushing the data page; now you
would need a full page image from new-master.
It might so happen that in new-master a checkpoint would have purged (reused)
the log files from
Hello,
The only potential use case for this that I can see, would be for system
maintenance and a controlled failover. I agree: that's a major PITA when
doing DR testing, but I personally don't think this is the way to fix that
particular edge case.
This is the use case we are trying to address
On 04/11/2013 01:26 PM, Sameer Thakur wrote:
Hello,
The only potential use case for this that I can see, would be for
system maintenance and a controlled failover. I agree: that's a major
PITA when doing DR testing, but I personally don't think this is the
way to fix that particular edge
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
The proposed fix - halting all writes of data pages to disk and
to WAL files while waiting ACK from standby - will tremendously
slow down all parallel work on master.
This is not what is being proposed. The proposed
Ants Aasma a...@cybertec.at writes:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
The proposed fix - halting all writes of data pages to disk and
to WAL files while waiting ACK from standby - will tremendously
slow down all parallel work on master.
This is not
On 04/11/2013 03:52 PM, Ants Aasma wrote:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
The proposed fix - halting all writes of data pages to disk and
to WAL files while waiting ACK from standby - will tremendously
slow down all parallel work on master.
This is
On Thu, Apr 11, 2013 at 5:33 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
On 04/11/2013 03:52 PM, Ants Aasma wrote:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
The proposed fix - halting all writes of data pages to disk and
to WAL files while waiting ACK
On Thu, Apr 11, 2013 at 2:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Ants Aasma a...@cybertec.at writes:
We already rely on WAL-before-data to ensure correct recovery. What is
proposed here is to slightly redefine it to require WAL to be
replicated before it is considered to be flushed. This
On Thu, Apr 11, 2013 at 10:25 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
You just shut down the old master and let the standby catch
up (takes a few microseconds ;) ) before you promote it.
After this you can start up the former master with recovery.conf
and it will follow nicely.
No.
On Fri, Apr 12, 2013 at 12:09 AM, Ants Aasma a...@cybertec.at wrote:
On Thu, Apr 11, 2013 at 5:33 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
On 04/11/2013 03:52 PM, Ants Aasma wrote:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
The proposed fix - halting
On Thu, Apr 11, 2013 at 8:39 PM, Ants Aasma a...@cybertec.at wrote:
On Thu, Apr 11, 2013 at 5:33 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
On 04/11/2013 03:52 PM, Ants Aasma wrote:
On Thu, Apr 11, 2013 at 4:25 PM, Hannu Krosing ha...@2ndquadrant.com
wrote:
The proposed fix -
it's one of the reasons why a fresh base backup is required when starting
the old master as a new standby? If yes, I agree with you. I've often heard
complaints about taking a backup when restarting a new standby. That's a really
big problem.
I think Fujii Masao is on the same page.
In case of syncrep the
(5) The master then forces a write of the data page related to this
transaction.
Sorry, this is incorrect. Whenever the master writes the data page it
checks that the WAL record is written in standby till that LSN.
While master is waiting to force a write (point 5) for this data page,
On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
(5) The master then forces a write of the data page related to this
transaction.
Sorry, this is incorrect. Whenever the master writes the data page it
checks that the WAL record is written in standby till that LSN.
While master is
Amit Kapila amit.kap...@huawei.com writes:
On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
Sorry, this is incorrect. Streaming replication continues, master is not
waiting, whenever the master writes the data page it checks that the WAL
record is written in standby till that LSN.
On 2013-04-10 10:10:31 -0400, Tom Lane wrote:
Amit Kapila amit.kap...@huawei.com writes:
On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
Sorry, this is incorrect. Streaming replication continues, master is not
waiting, whenever the master writes the data page it checks that the
On 04/10/2013 09:10 AM, Tom Lane wrote:
IOW, I wouldn't consider skipping the rsync even if I had a feature
like this.
Totally. Out in the field, we consider the old database corrupt the
moment we fail over. There is literally no way to verify the safety of
any data along the broken chain,
On Wed, Apr 10, 2013 at 11:26 PM, Shaun Thomas stho...@optionshouse.com wrote:
On 04/10/2013 09:10 AM, Tom Lane wrote:
IOW, I wouldn't consider skipping the rsync even if I had a feature
like this.
Totally. Out in the field, we consider the old database corrupt the moment
we fail over.
On 04/10/2013 11:40 AM, Fujii Masao wrote:
Strange. If this is really true, shared disk failover solution is
fundamentally broken because the standby needs to start up with the
shared corrupted database at the failover.
How so? Shared disk doesn't use replication. The point I was trying to
On Wed, Apr 10, 2013 at 11:16 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2013-04-10 10:10:31 -0400, Tom Lane wrote:
Amit Kapila amit.kap...@huawei.com writes:
On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
Sorry, this is incorrect. Streaming replication continues,
On Thu, Apr 11, 2013 at 1:44 AM, Shaun Thomas stho...@optionshouse.com wrote:
On 04/10/2013 11:40 AM, Fujii Masao wrote:
Strange. If this is really true, shared disk failover solution is
fundamentally broken because the standby needs to start up with the
shared corrupted database at the
On Wed, Apr 10, 2013 at 7:44 PM, Shaun Thomas stho...@optionshouse.com wrote:
On 04/10/2013 11:40 AM, Fujii Masao wrote:
Strange. If this is really true, shared disk failover solution is
fundamentally broken because the standby needs to start up with the
shared corrupted database at the
Ants Aasma a...@cybertec.at writes:
We already rely on WAL-before-data to ensure correct recovery. What is
proposed here is to slightly redefine it to require WAL to be
replicated before it is considered to be flushed. This ensures that no
data page on disk differs from the WAL that the slave
On 2013-04-10 18:46, Fujii Masao wrote:
On Wed, Apr 10, 2013 at 11:16 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2013-04-10 10:10:31 -0400, Tom Lane wrote:
Amit Kapila amit.kap...@huawei.com writes:
On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
Sorry, this is
On 2013-04-10 20:39:25 +0200, Boszormenyi Zoltan wrote:
On 2013-04-10 18:46, Fujii Masao wrote:
On Wed, Apr 10, 2013 at 11:16 PM, Andres Freund and...@2ndquadrant.com
wrote:
On 2013-04-10 10:10:31 -0400, Tom Lane wrote:
Amit Kapila amit.kap...@huawei.com writes:
On Wednesday, April
What Samrat is proposing here is that WAL is not flushed to the OS before
it is acked by a synchronous replica so recovery won't go past the
timeline change made in failover, making it necessary to take a new
base backup to resync with the new master.
Actually we are proposing that the data
On Tue, Apr 9, 2013 at 9:42 AM, Samrat Revagade
revagade.sam...@gmail.com wrote:
What Samrat is proposing here is that WAL is not flushed to the OS before
it is acked by a synchronous replica so recovery won't go past the
timeline change made in failover, making it necessary to take a new
base
On 04/08/2013 12:34 PM, Samrat Revagade wrote:
Hello,
We have been trying to figure out possible solutions to the following
problem in streaming replication. Consider the following scenario:
If master receives commit command, it writes and flushes commit WAL
records to the disk. It also writes
Hello,
We have been trying to figure out possible solutions to the following
problem in streaming replication. Consider the following scenario:
If master receives commit command, it writes and flushes commit WAL records
to the disk. It also writes and flushes data page related to this
transaction.
On 04/08/2013 05:34 AM, Samrat Revagade wrote:
One solution to avoid this situation is have the master send WAL
records to standby and wait for ACK from standby committing WAL files
to disk and only after that commit data page related to this
transaction on master.
Isn't this basically what
Samrat Revagade revagade.sam...@gmail.com writes:
We have been trying to figure out possible solutions to the following
problem in streaming replication. Consider the following scenario:
If master receives commit command, it writes and flushes commit WAL records
to the disk. It also writes and
On Mon, Apr 8, 2013 at 6:50 PM, Shaun Thomas stho...@optionshouse.com wrote:
On 04/08/2013 05:34 AM, Samrat Revagade wrote:
One solution to avoid this situation is have the master send WAL
records to standby and wait for ACK from standby committing WAL files
to disk and only after that commit
On 2013-04-08 19:26:33 +0300, Ants Aasma wrote:
On Mon, Apr 8, 2013 at 6:50 PM, Shaun Thomas stho...@optionshouse.com wrote:
On 04/08/2013 05:34 AM, Samrat Revagade wrote:
One solution to avoid this situation is have the master send WAL
records to standby and wait for ACK from standby
On Mon, Apr 8, 2013 at 7:38 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2013-04-08 19:26:33 +0300, Ants Aasma wrote:
Not exactly. Sync-rep ensures that commit success is not sent to the
client before a synchronous replica acks the commit record. What
Samrat is proposing here is that WAL
On Mon, Apr 8, 2013 at 7:34 PM, Samrat Revagade
revagade.sam...@gmail.com wrote:
Hello,
We have been trying to figure out possible solutions to the following problem
in streaming replication. Consider the following scenario:
If master receives commit command, it writes and flushes commit WAL