Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-18 Thread Fujii Masao
On Fri, May 18, 2012 at 3:57 AM, Joshua Berkus j...@agliodbs.com wrote: Yeah, I don't know how I produced the crash in the first place, because of course the self-replica should block all writes, and retesting it I can't get it to accept a write. Not sure how I did it in the first place.

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-18 Thread Joshua Berkus
It might be easy to detect the situation where the standby has connected to itself, e.g., by assigning ID for each instance and checking whether IDs of two servers are the same. But it seems not easy to detect the circularly-connected two or more standbys. Well, I think it would be fine
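A rough sketch of the ID-comparison idea above, extended to the circular case. This is not PostgreSQL code: the instance IDs, the `upstream_of` map, and the function name are all hypothetical, and it assumes some observer can see every server's upstream, which is exactly the part the message calls not easy.

```python
from typing import Dict, List, Optional


def find_replication_loop(
    upstream_of: Dict[str, Optional[str]], start: str
) -> Optional[List[str]]:
    """Follow upstream links from `start`.

    `upstream_of` maps a (hypothetical) instance ID to the ID of the server
    it replicates from, or None for a writable master.
    Returns the IDs forming a loop, or None if the chain ends at a master.
    """
    seen: List[str] = []
    node: Optional[str] = start
    while node is not None:
        if node in seen:
            # Reached an ID we already visited: everything from its first
            # appearance onward is the loop.
            return seen[seen.index(node):]
        seen.append(node)
        node = upstream_of.get(node)
    return None


# A standby connected to itself is a loop of length one.
self_loop = find_replication_loop({"a": "a"}, "a")

# Two standbys connected to each other.
pair_loop = find_replication_loop({"a": "b", "b": "a"}, "a")

# A normal cascade: replica-replica -> master-replica -> master-master.
no_loop = find_replication_loop({"mm": None, "mr": "mm", "rr": "mr"}, "rr")
```

The self-connection case only needs a comparison of two IDs at connect time; the longer loops need the whole chain, which a single standby does not normally know.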

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-18 Thread Joshua Berkus
Fujii, You mean that remaster is, after promoting one of standby servers, to make remaining standby servers reconnect to new master and resolve the timeline gap without the shared archive? Yep, that's one of my TODO items, but I'm not sure if I have enough time to implement that for

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-17 Thread Joshua Berkus
Jim, Fujii, Even more fun: 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on) 2) Connect the server to *itself* as a replica. 3) This will work and report success, up until you do your first write. 4) Then ... segfault! - Original Message -

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-17 Thread Ants Aasma
On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus j...@agliodbs.com wrote: Even more fun: 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on) 2) Connect the server to *itself* as a replica. 3) This will work and report success, up until you do your first

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-17 Thread Fujii Masao
On Thu, May 17, 2012 at 12:01 PM, Joshua Berkus j...@agliodbs.com wrote: And: if we still have to ship logs, what's the point in even having cascading replication? At least cascading replication (1) allows you to adopt more flexible configuration of servers, I'm just pretty shocked. The

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-17 Thread Fujii Masao
On Thu, May 17, 2012 at 10:42 PM, Ants Aasma a...@cybertec.at wrote: On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus j...@agliodbs.com wrote: Even more fun: 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on) 2) Connect the server to *itself* as a

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-17 Thread Joshua Berkus
Yeah, I don't know how I produced the crash in the first place, because of course the self-replica should block all writes, and retesting it I can't get it to accept a write. Not sure how I did it in the first place. So the bug is just that you can connect a server to itself as its own

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Fujii Masao
On Wed, May 16, 2012 at 2:29 AM, Thom Brown t...@linux.com wrote: On 15 May 2012 13:15, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 1:36 AM, Thom Brown t...@linux.com wrote: However, this isn't true when I restart the standby. I've been informed that this should work fine

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Fujii Masao
On Wed, May 16, 2012 at 3:42 AM, Joshua Berkus j...@agliodbs.com wrote: Fujii, Wait, are you telling me that we *still* can't remaster from streaming replication? What's the remaster? And: if we still have to ship logs, what's the point in even having cascading replication? At least

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Fujii Masao
On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus j...@agliodbs.com wrote: Before restarting it, you need to do pg_basebackup and make a base backup onto the standby again. Since you started the standby without recovery.conf, a series of WAL in the standby has gotten inconsistent with that in

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Thom Brown
On 16 May 2012 11:36, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 2:29 AM, Thom Brown t...@linux.com wrote: On 15 May 2012 13:15, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 1:36 AM, Thom Brown t...@linux.com wrote: However, this isn't true when I

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Jim Nasby
Well, that is a form of testing. :) My point was that we need some kind of regression tests around all the new replication stuff, and if you had some scripts that would be a useful starting point. But it sounds like you haven't gotten that far with it, so... On 5/15/12 10:12 AM, Joshua Berkus

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Jim Nasby
On 5/16/12 10:53 AM, Fujii Masao wrote: On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus j...@agliodbs.com wrote: Before restarting it, you need to do pg_basebackup and make a base backup onto the standby again. Since you started the standby without recovery.conf, a series of WAL in the standby

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Fujii Masao
On Thu, May 17, 2012 at 1:07 AM, Thom Brown t...@linux.com wrote: On 16 May 2012 11:36, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 2:29 AM, Thom Brown t...@linux.com wrote: On 15 May 2012 13:15, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 1:36 AM,

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-16 Thread Joshua Berkus
And: if we still have to ship logs, what's the point in even having cascading replication? At least cascading replication (1) allows you to adopt more flexible configuration of servers, I'm just pretty shocked. The last time we talked about this, at the end of the 9.1 development

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Joshua Berkus
Jim, I didn't get as far as running any tests, actually. All I did was try to set up 3 servers in cascading replication. Then I tried shutting down master-master and promoting master-replica. That's it. - Original Message - On May 13, 2012, at 3:08 PM, Josh Berkus wrote: More

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Thom Brown
On 13 May 2012 16:08, Josh Berkus j...@agliodbs.com wrote: More issues: promoting intermediate standby breaks replication. To be a bit blunt here, has anyone tested cascading replication *at all* before this? So, same setup as previous message. 1. Shut down master-master. 2. pg_ctl

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Fujii Masao
On Wed, May 16, 2012 at 1:36 AM, Thom Brown t...@linux.com wrote: However, this isn't true when I restart the standby. I've been informed that this should work fine if a WAL archive has been configured (which should be used anyway). The WAL archive should be shared by master-replica and

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Fujii Masao
On Mon, May 14, 2012 at 4:04 AM, Josh Berkus j...@agliodbs.com wrote: Doing some beta testing, managed to produce this issue using the daily snapshot from Tuesday: 1. Created master server, loaded it with a couple dummy databases. 2. Created standby server. 3. Did pg_basebackup -x stream

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Thom Brown
On 15 May 2012 13:15, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 16, 2012 at 1:36 AM, Thom Brown t...@linux.com wrote: However, this isn't true when I restart the standby. I've been informed that this should work fine if a WAL archive has been configured (which should be used

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Joshua Berkus
Fujii, Wait, are you telling me that we *still* can't remaster from streaming replication? Why wasn't that fixed in 9.2? And: if we still have to ship logs, what's the point in even having cascading replication? - Original Message - On Wed, May 16, 2012 at 1:36 AM, Thom Brown

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-15 Thread Joshua Berkus
Before restarting it, you need to do pg_basebackup and make a base backup onto the standby again. Since you started the standby without recovery.conf, a series of WAL in the standby has gotten inconsistent with that in the master. So you need a fresh backup to restart the standby. You're

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-14 Thread Jim Nasby
On May 13, 2012, at 3:08 PM, Josh Berkus wrote: More issues: promoting intermediate standby breaks replication. To be a bit blunt here, has anyone tested cascading replication *at all* before this? Josh, do you have scripts that you're using to do this testing? If so can you post them

[HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-13 Thread Josh Berkus
Doing some beta testing, managed to produce this issue using the daily snapshot from Tuesday: 1. Created master server, loaded it with a couple dummy databases. 2. Created standby server. 3. Did pg_basebackup -x stream on standby server 4. Started standby server. 5. Realized I'd forgotten to

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-13 Thread Josh Berkus
More issues: the pg_basebackup -x stream on the cascading replica won't complete until the xlog rotates on the master. (again, this is Tuesday's snapshot). Servers: .226 == master-master, the writeable master .227 == master-replica, a direct replica of master-master .228 == replica-replica, a

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-13 Thread Josh Berkus
More issues: promoting intermediate standby breaks replication. To be a bit blunt here, has anyone tested cascading replication *at all* before this? So, same setup as previous message. 1. Shut down master-master. 2. pg_ctl promote master-replica 3. replication breaks. error message on

Re: [HACKERS] Strange issues with 9.2 pg_basebackup replication

2012-05-13 Thread Thom Brown
On 13 May 2012 20:23, Josh Berkus j...@agliodbs.com wrote: More issues: the pg_basebackup -x stream on the cascading replica won't complete until the xlog rotates on the master. (again, this is Tuesday's snapshot). This is already on the open items list: