Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread fazool mein
On Tue, Oct 26, 2010 at 11:23 AM, Robert Haas wrote:

> On Tue, Oct 26, 2010 at 2:13 PM, Heikki Linnakangas wrote:
> >
> >> Can you please describe why
> >> walsender reading directly from the buffers was given up? To avoid a lot
> >> of locking?
> >
> > To avoid locking yes, and complexity in general.
>
> And the fact that it might allow the standby to get ahead of the
> master, leading to silent database corruption.
>
>
I agree that the standby might get ahead, but this doesn't necessarily lead
to database corruption. The interesting case is what happens when the
primary fails, which can lead to *either* of the following two cases:
1) The standby, due to some triggering mechanism, becomes the new primary.
In this case, even if the standby was ahead, it's fine.
2) The primary comes back as primary. In this case, the standby will connect
again to the primary. At this point, *if* we can somehow detect that
the standby is ahead, then we should abort the standby and create a new
standby from scratch.
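
A minimal sketch of the "detect that the standby is ahead" check in case 2,
assuming the standby reports the last WAL position it received when it
reconnects. The WalPos type and both function names are hypothetical and only
mirror the (log file id, offset) shape of a WAL position; this is not existing
PostgreSQL code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position: log file id plus byte offset. */
typedef struct WalPos
{
    uint32_t xlogid;    /* log file number */
    uint32_t xrecoff;   /* byte offset within that log file */
} WalPos;

/* Returns true when position a is strictly later in the WAL than b. */
bool
wal_pos_after(WalPos a, WalPos b)
{
    if (a.xlogid != b.xlogid)
        return a.xlogid > b.xlogid;
    return a.xrecoff > b.xrecoff;
}

/*
 * Case 2 check (assumed protocol): the standby reports the last position it
 * received, and the restarted primary compares it with the last position it
 * has durably written. A standby that is ahead has to be rebuilt.
 */
bool
standby_must_be_rebuilt(WalPos standby_received, WalPos primary_flushed)
{
    return wal_pos_after(standby_received, primary_flushed);
}
```

A standby at exactly the same position as the primary is not ahead, so it
would be allowed to continue streaming.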

I agree with Heikki that going through all this trouble only makes sense if
there is a huge performance boost.


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread fazool mein
> Might I suggest adopting the same technique walsender does, ie just read
> the data back from disk?  There's a reason why we gave up trying to have
> walsender read directly from the buffers.
>
>
That is exactly what I do not want to do, i.e. read from disk, as long as
the piece of WAL is available in the buffers. Can you please describe why
walsender reading directly from the buffers was given up? To avoid a lot of
locking?
The locking issue might not be a problem with synchronous
replication. In synchronous replication, the primary will wait for
the standby to send a confirmation before it can do more WAL inserts anyway.
Hence, reading from buffers might be better in this case.

So, as I understand from the emails, we need to lock both WALWriteLock and
WALInsertLock in exclusive mode for reading from buffers. Agreed?

Thanks.


[HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-22 Thread fazool mein
Hello guys,

I'm writing a function that will read data from the buffer in xlog (i.e.
from XLogCtl->pages and XLogCtl->xlblocks). I want to make sure that I am
doing it correctly.
For reading from the buffer, do I need to lock WALInsertLock or
WALWriteLock? Also, can you explain the usage of 'LW_SHARED' a bit? Can we
use it for read purposes?

Thanks a lot.


Re: [HACKERS] Timeline in the light of Synchronous replication

2010-10-18 Thread fazool mein
I believe we should come up with a universal solution that will solve
potential future problems as well (for example, if in sync replication, we
decide to send writes to standbys in parallel to writing on local disk).

The ideal thing would be to have an id that is incremented on every failure,
and is stored in the WAL. Whenever a standby connects to the primary, it
should send the point p in WAL where streaming should start, plus the id. If
the id is the same at the primary at point p, things are good. Else, we
should tell the standby to either create a new copy from scratch, or delete
some WALs.
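
The check described above could be sketched as follows. This is only an
illustration under stated assumptions: the primary keeps a history of
(id, starting WAL position) pairs, one per failure, and WAL positions are
flattened to a single 64-bit number. FailureEpoch, epoch_at and
standby_may_stream are hypothetical names, not existing PostgreSQL code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per failure: the id that became current at WAL position start. */
typedef struct FailureEpoch
{
    uint32_t id;      /* incremented on every failure */
    uint64_t start;   /* first WAL position written under this id */
} FailureEpoch;

/*
 * Look up the id that was current at WAL position p. The history must be
 * sorted by ascending start. Returns false if p precedes the first entry.
 * A request exactly at a switch point is assigned to the newer id here;
 * that is a policy choice of this sketch.
 */
bool
epoch_at(const FailureEpoch *history, size_t n, uint64_t p, uint32_t *id_out)
{
    bool found = false;

    for (size_t i = 0; i < n && history[i].start <= p; i++)
    {
        *id_out = history[i].id;
        found = true;
    }
    return found;
}

/* Primary-side check: may the standby resume streaming from point p? */
bool
standby_may_stream(const FailureEpoch *history, size_t n,
                   uint64_t p, uint32_t standby_id)
{
    uint32_t id_at_p;

    return epoch_at(history, n, p, &id_at_p) && id_at_p == standby_id;
}
```

When standby_may_stream() returns false, the primary would tell the standby
to either start from a fresh copy or delete the WAL it holds past the point
where the ids diverge.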

@David
> One way to get them in sync without starting from scratch is to use
> rsync from A to B.  This works in the asynchronous case, too. :)

The problem mainly is detecting when one can rsync/stream and when not.

Regards



On Mon, Oct 18, 2010 at 1:57 AM, Dimitri Fontaine wrote:

> Fujii Masao writes:
> > But, even though we will have done that, it should be noted that WAL in
> > A might be ahead of that in B. For example, A might crash right after
> > writing WAL to the disk and before sending it to B. So when we restart
> > the old master A as the standby after failover, we should need to delete
> > some WAL files (in A) which are inconsistent with the WAL sequence in B.
>
> The idea to send from master to slave the current last applied LSN has
> been talked about already. It would allow sending the WAL content in
> parallel with its local fsync() on the master; the standby would refrain
> from applying any WAL segment until it knows the master is past that.
>
> Now, given such a behavior, that would mean that when A joins again as a
> standby, it would have to ask B for the current last applied LSN too,
> and would notice the timeline change. Maybe by adding a facility to
> request the last LSN of the previous timeline, and with the behavior
> above applied there (skipping now-known-future-WALs in recovery), that
> would work automatically?
>
> There's still the problem of WALs that have been applied before
> recovery, I don't know that we can do anything here. But maybe we could
> also tweak the CHECKPOINT mechanism not to advance the restart point
> until we know the standbys have already replayed everything up to the
> restart point?
>
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>


[HACKERS] Timeline in the light of Synchronous replication

2010-10-16 Thread fazool mein
Hello guys,

The concept of a timeline makes sense to me in the case of asynchronous
replication. But in the case of synchronous replication, I am not so sure.

When a standby connects to the primary, it checks whether both have the same
timeline. If not, it doesn't start.

Now, consider the following scenario. The primary (call it A) fails, the
standby (call it B), via a trigger file, comes out of recovery mode
(incrementing its timeline ID to, say, 2), and becomes the primary. Now, let's
say we start the old primary A as a standby, to connect to the new primary B
(which previously was a standby). As the code is at the moment, the old
primary A will not be allowed to connect to the new primary B because A's
timeline ID (1) is not equal to that of the new primary B (2). Hence, we
need to create a backup again, and set up the standby from scratch.

In the above scenario, if the system was using asynchronous replication,
timelines would have saved us from doing something wrong. But if we are
using synchronous replication, we know that both A and B would have been in
sync before the failure. In this case, being forced to create a new standby
from scratch because of timelines might not be very desirable if the database
is huge.

Your comments on the above will be appreciated.

Regards


Re: [HACKERS] Heartbeat between Primary and Standby replicas

2010-09-26 Thread fazool mein
Ah, great. I missed looking there.
Thanks.

On Sun, Sep 26, 2010 at 4:19 PM, Fujii Masao wrote:

> On Mon, Sep 27, 2010 at 7:46 AM, fazool mein wrote:
> > I checked the code for the keepalive feature. It seems that the socket
> > options are only set on the primary's socket connection. The TCP connection
> > created on the secondary for walreceiver does not use the keepalive
> > parameters from the configuration.
>
> You can use libpq keepalive parameters for walreceiver.
>
> keepalives_idle
> keepalives_interval
> keepalives_count
> http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
>
> Those can be set in primary_conninfo in recovery.conf.
>
> Regards,
>
> --
> Fujii Masao
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>
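
For illustration, a recovery.conf fragment along the lines Fujii describes
might look like the one below. The host, port, and user are placeholders, and
the three timing values are arbitrary examples, not recommendations. With
these values a dead primary would be noticed after roughly
60 + 5 * 3 = 75 seconds of silence, assuming the operating system supports
these socket options.

```
standby_mode = 'on'
primary_conninfo = 'host=192.0.2.10 port=5432 user=replicator keepalives_idle=60 keepalives_interval=5 keepalives_count=3'
```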


Re: [HACKERS] Heartbeat between Primary and Standby replicas

2010-09-26 Thread fazool mein
Hello again,

I checked the code for the keepalive feature. It seems that the socket
options are only set on the primary's socket connection. The TCP connection
created on the secondary for walreceiver does not use the keepalive
parameters from the configuration.

Am I correct? Is this intended, or is it a bug?

Thanks.



On Fri, Sep 17, 2010 at 7:05 PM, fazool mein wrote:

> Apologies. I'm new to Postgres and I didn't see that feature. It satisfies
> what I want to do.
>
> Thanks.
>
>
> On Thu, Sep 16, 2010 at 7:34 PM, Fujii Masao wrote:
>
>> On Fri, Sep 17, 2010 at 6:49 AM, fazool mein wrote:
>> > I am designing a heartbeat system between replicas to know when a replica
>> > goes down so that necessary measures can be taken. As I see, there are two
>> > ways of doing it:
>> >
>> > 1) Creating a separate heartbeat process on replicas.
>> > 2) Creating a heartbeat message, and sending it over the connection that is
>> > already established between walsender and walreceiver.
>> >
>> > With 2, sending heartbeat from walsender to walreceiver seems trivial.
>> > Sending a heartbeat from walreceiver to walsender seems tricky. Going
>> > through the code, it seems that the walreceiver is always in the
>> > PGASYNC_COPY_OUT mode (except in the beginning when handshaking is done).
>> >
>> > Can you recommend the right way of doing this?
>>
>> The existing keepalive feature doesn't help?
>>
>> Regards,
>>
>> --
>> Fujii Masao
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>>
>
>


Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-22 Thread fazool mein
Thanks for the tips.

In our case, SIGINT makes more sense. I'll use that.

Regards


On Tue, Sep 21, 2010 at 7:50 PM, Fujii Masao wrote:

> On Wed, Sep 22, 2010 at 2:50 AM, fazool mein wrote:
> > Yes, I'll be modifying the code. In the walreceiver, I used the following to
> > send a shutdown to the postmaster:
> >
> > kill(getppid(), SIGTERM);
>
> You can use the global variable "PostmasterPid" instead of getppid.
> There are three types of shutdown. SIGTERM triggers smart shutdown.
> Is smart shutdown suitable for your case? If not, you might need to
> send SIGINT or SIGQUIT instead.
>
> Regards,
>
> --
> Fujii Masao
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>
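
The three shutdown types Fujii mentions map to signals as sketched below.
This is a standalone illustration: ShutdownMode, shutdown_signal and
request_shutdown are hypothetical names. Inside the server, the equivalent of
the call in this thread would presumably be kill(PostmasterPid, SIGINT).

```c
#define _POSIX_C_SOURCE 200809L  /* expose kill() under strict C modes */

#include <signal.h>
#include <sys/types.h>

/* The three postmaster shutdown modes (names here are illustrative). */
typedef enum ShutdownMode
{
    SHUTDOWN_SMART,     /* SIGTERM: wait for clients to disconnect */
    SHUTDOWN_FAST,      /* SIGINT: abort transactions, shut down cleanly */
    SHUTDOWN_IMMEDIATE  /* SIGQUIT: quit at once, recover on next start */
} ShutdownMode;

/* Map a shutdown mode to the signal the postmaster expects for it. */
int
shutdown_signal(ShutdownMode mode)
{
    switch (mode)
    {
        case SHUTDOWN_SMART:
            return SIGTERM;
        case SHUTDOWN_FAST:
            return SIGINT;
        case SHUTDOWN_IMMEDIATE:
            return SIGQUIT;
    }
    return SIGTERM;     /* fall back to the gentlest mode */
}

/* Returns 0 on success, -1 with errno set on failure, as kill() does. */
int
request_shutdown(pid_t postmaster_pid, ShutdownMode mode)
{
    return kill(postmaster_pid, shutdown_signal(mode));
}
```

A smart shutdown can wait indefinitely while clients stay connected, which is
one reason a fast shutdown (SIGINT) may fit the failover case better.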


Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-21 Thread fazool mein
On Tue, Sep 21, 2010 at 8:32 AM, David Fetter wrote:

> On Mon, Sep 20, 2010 at 05:48:40PM -0700, fazool mein wrote:
> > Hi,
> >
> > I want to shut down the server under certain conditions that can be
> > checked inside a backend process.  For instance, while running
> > symmetric
>
> Synchronous?
>

I meant streaming :), but the question is in general for any process forked
by the postmaster.


>
> > replication, if the primary dies, I want the walreceiver to
> > detect that and shut down the standby.  The reason for shutdown is
> > that I want to execute some other stuff before I start the standby
> > as a primary.  Creating a trigger file doesn't help as it converts
> > the standby into primary at run time.
> >
> > Using proc_exit() inside walreceiver only terminates the walreceiver
> > process, which postgres starts again.  The other way I see is using
> > ereport(PANIC, ...).  Is there some other way to shutdown the main
> > server from within a backend process?
>
> Perhaps I've misunderstood, but since there's already Something
> Else(TM) which takes actions, why not send a message to it so it can
> take appropriate action on the node, starting with shutting it down?
>
>
(wondering)

Thanks.


Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-21 Thread fazool mein
On Mon, Sep 20, 2010 at 9:44 PM, Fujii Masao wrote:

> On Tue, Sep 21, 2010 at 9:48 AM, fazool mein wrote:
> > Hi,
> >
> > I want to shut down the server under certain conditions that can be checked
> > inside a backend process. For instance, while running symmetric replication,
> > if the primary dies, I want the walreceiver to detect that and shut down
> > the standby. The reason for shutdown is that I want to execute some other
> > stuff before I start the standby as a primary. Creating a trigger file
> > doesn't help as it converts the standby into primary at run time.
> >
> > Using proc_exit() inside walreceiver only terminates the walreceiver
> > process, which postgres starts again. The other way I see is using
> > ereport(PANIC, ...). Is there some other way to shut down the main server
> > from within a backend process?
>
> Are you going to change the source code? If yes, you might be able to
> do that by making walreceiver send the shutdown signal to postmaster.
>
>
Yes, I'll be modifying the code. In the walreceiver, I used the following to
send a shutdown to the postmaster:

kill(getppid(), SIGTERM);


> If no, I think that a straightforward approach is to use a clusterware
> like pacemaker. That is, you need to make a clusterware periodically
> check the master and cause the standby to end when detecting the crash
> of the master.
>
>
This was another option, but I have to modify the code for this particular
case.

Thanks for your help.

Regards,


[HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-20 Thread fazool mein
Hi,

I want to shut down the server under certain conditions that can be checked
inside a backend process. For instance, while running symmetric replication,
if the primary dies, I want the walreceiver to detect that and shut down
the standby. The reason for shutdown is that I want to execute some other
stuff before I start the standby as a primary. Creating a trigger file
doesn't help as it converts the standby into primary at run time.

Using proc_exit() inside walreceiver only terminates the walreceiver
process, which postgres starts again. The other way I see is using
ereport(PANIC, ...). Is there some other way to shut down the main server
from within a backend process?

Thanks.


Re: [HACKERS] Heartbeat between Primary and Standby replicas

2010-09-17 Thread fazool mein
Apologies. I'm new to Postgres and I didn't see that feature. It satisfies
what I want to do.

Thanks.

On Thu, Sep 16, 2010 at 7:34 PM, Fujii Masao wrote:

> On Fri, Sep 17, 2010 at 6:49 AM, fazool mein wrote:
> > I am designing a heartbeat system between replicas to know when a replica
> > goes down so that necessary measures can be taken. As I see, there are two
> > ways of doing it:
> >
> > 1) Creating a separate heartbeat process on replicas.
> > 2) Creating a heartbeat message, and sending it over the connection that is
> > already established between walsender and walreceiver.
> >
> > With 2, sending heartbeat from walsender to walreceiver seems trivial.
> > Sending a heartbeat from walreceiver to walsender seems tricky. Going
> > through the code, it seems that the walreceiver is always in the
> > PGASYNC_COPY_OUT mode (except in the beginning when handshaking is done).
> >
> > Can you recommend the right way of doing this?
>
> The existing keepalive feature doesn't help?
>
> Regards,
>
> --
> Fujii Masao
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>


[HACKERS] Heartbeat between Primary and Standby replicas

2010-09-16 Thread fazool mein
Hello everyone,

I am designing a heartbeat system between replicas to know when a replica
goes down so that necessary measures can be taken. As I see, there are two
ways of doing it:

1) Creating a separate heartbeat process on replicas.
2) Creating a heartbeat message, and sending it over the connection that is
already established between walsender and walreceiver.
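
Whichever of the two options carries the heartbeat, the receiving side needs
some way to decide that a peer has gone silent. A minimal sketch of that
bookkeeping, with hypothetical names (PeerLiveness, peer_heard_from,
peer_presumed_down) and a timeout chosen by the caller:

```c
#include <stdbool.h>
#include <time.h>

/* Liveness bookkeeping for one peer replica. */
typedef struct PeerLiveness
{
    time_t last_heard;    /* when the last heartbeat or message arrived */
    int    timeout_secs;  /* silence longer than this means presumed down */
} PeerLiveness;

/* Call whenever a heartbeat (or any other message) arrives from the peer. */
void
peer_heard_from(PeerLiveness *peer, time_t now)
{
    peer->last_heard = now;
}

/* True once the peer has been silent for longer than its timeout. */
bool
peer_presumed_down(const PeerLiveness *peer, time_t now)
{
    return difftime(now, peer->last_heard) > (double) peer->timeout_secs;
}
```

A silent peer is only presumed down: a network partition looks the same as a
crash from this side, so the measures taken afterwards have to allow for that.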

With 2, sending heartbeat from walsender to walreceiver seems trivial.
Sending a heartbeat from walreceiver to walsender seems tricky. Going
through the code, it seems that the walreceiver is always in the
PGASYNC_COPY_OUT mode (except in the beginning when handshaking is done).

Can you recommend the right way of doing this?

Thank you.

Regards

---
Postgres version = 9.0 beta-4


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-08-31 Thread fazool mein
Thanks!

I'll wait for the merging then; there is no point in benchmarking otherwise.

Regards

On Tue, Aug 31, 2010 at 6:06 PM, Fujii Masao wrote:

> On Wed, Sep 1, 2010 at 9:34 AM, Robert Haas wrote:
> >> There are patches, and the latest from Fujii Masao is probably worth
> >> looking at :)
> >
> > I am pretty sure, however, that the performance will be terrible at
> > this point.  Heikki is working on fixing that, but it ain't done yet.
>
> Yep. The latest WIP code is available in my git repository, but it's
> not worth benchmarking yet. I'll need to merge Heikki's effort and
> the synchronous replication patch.
>
>    git://git.postgresql.org/git/users/fujii/postgres.git
>    branch: synchrep
>
> Regards,
>
> --
> Fujii Masao
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>


[HACKERS] Synchronous replication - patch status inquiry

2010-08-31 Thread fazool mein
Hello everyone,

I'm interested in benchmarking synchronous replication, to see how
performance degrades compared to asynchronous streaming replication.

I browsed through the archive of emails, but things still seem unclear. Do
we have a final, agreed-upon patch that I can use? Any links for that?

Thanks.

OS = SUSE Linux (SLES 11), 64-bit
Postgres version = 9.0 beta-4