Re: [GENERAL] Postgres HA

2017-02-24 Thread
Julyanto Sutandang  wrote:
> Talking about High Availability, we should understand the basic concept of 
> HA, it is avoiding SPOF (Single Point of Failure). When we use a Loadbalancer 
> (LTM) and that load balancer is single, then you may get HA only for the 
> PostgreSQL but there are another single point of failure, it is the LTM it 
> self. In overall that  topology is not HA. 
> 
> The best configuration for HA i know is using Linux-HA to watch between 2 
> servers and doing failover VIP (Virtual IP) when Master is down or out of 
> service. The best configuration for HA, Servers should be on the same site 
> and uses direct cable connection to ensure dedicated private bandwidth and 
> there are no Single Point of Failure. 
> LinuxHA or pacemaker or corosync will do the Virtual IP swing over from 
> master host to slave host and promote the replica database in slave host 
> become master. 
> 
> There is no single point of failure. 


I'll agree with most of this, especially that avoiding SPOF is the goal.

However, I will point out that if you put both servers in the same site that 
you've created a SPOF. If you lose power at that site, or construction workers 
takes out the network cable to the building, or any other thing that can happen 
to take the whole site down, any of that will cause total failure.

If you really want to avoid that, then the 2 servers need to sit multiple 
miles/kilometers apart and be served by a dedicated connection or else 2 
different network providers (I've heard differing opinions on the best way) 
that is 100Mb or higher to adequately deal with the various replication issues 
that must be addressed. If the load is high enough, you may need multiple lines 
to be bonded or else even Gb.

Of course, the OP may not need that level of HA, but it is something that 
should be asked and answered by him and his organization.

It's something we deal with for our larger customers. We currently solve the 
replication with DRBD (for the DB) and csync2 (for the application code and 
logs). We are slowly considering the question if we'd be better off abandoning 
DRBD and using Pg's builtin replication, but the jury is still out on that as 
we haven't had enough time to fully figure it out. We'd considered ZFS 
replication for a while, but found ZFS to be too slow for the DB despite it's 
other useful features like replication (at least for the hardware that we had).

Personally, I find HA easy to understand but hard to implement well -- 
especially without a large budget. :)

HTH,
Kevin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres HA

2017-02-23 Thread Julyanto Sutandang
Dear Dylan,

Talking about High Availability, we should understand the basic concept of
HA, it is avoiding SPOF (Single Point of Failure). When we use a
Loadbalancer (LTM) and that load balancer is single, then you may get HA
only for the PostgreSQL but there are another single point of failure, it
is the LTM it self. In overall that  topology is not HA.

The best configuration for HA i know is using Linux-HA to watch between 2
servers and doing failover VIP (Virtual IP) when Master is down or out of
service. The best configuration for HA, Servers should be on the same site
and uses direct cable connection to ensure dedicated private bandwidth and
there are no Single Point of Failure.
LinuxHA or pacemaker or corosync will do the Virtual IP swing over from
master host to slave host and promote the replica database in slave host
become master.

There is no single point of failure.


Julyanto SUTANDANG

Equnix Business Solutions, PT | www.equnix.id
(An Open Source and Open Mind Company)
Plaza Semanggi 9 Fl. Unit 9; Jl. Jend Sudirman Kav 50
Jakarta - Indonesia 12930
T: +6221 2282 <(021)%202282> M: +628164858028 <0816-4858-028>

Caution: The information enclosed in this email (and any attachments) may
be legally privileged and/or confidential and is intended only for the use
of the addressee(s). No addressee should forward, print, copy, or otherwise
reproduce this message in any manner that would allow it to be viewed by
any individual not originally listed as a recipient. If the reader of this
message is not the intended recipient, you are hereby notified that any
unauthorized disclosure, dissemination, distribution, copying or the taking
of any action in reliance on the information herein is strictly prohibited.
If you have received this communication in error, please immediately notify
the sender and delete this message.Unless it is made by the authorized
person, any views expressed in this message are those of the individual
sender and may not necessarily reflect the views of PT Equnix Business
Solutions.

On Thu, Feb 23, 2017 at 5:58 AM, Dylan Luong 
wrote:

> Hi
>
>
>
> I am a DBA at the University of South Australia. For PostgreSQL High
> Availability, we currently have setup a Master/Slave across two datacenters
> using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer)
> server that sits between the application servers and the PostgreSQL server
> that directs connections to the Master (and the Slave if failover occurs).
> We also have watchdog processes on the PostgreSQL servers that polls the
> LTM to determine who is Master and perform automatic failover if required.
> I am looking at options to improve our high availability.
>
> I would like to know how other organizations in different industries
> (other than education) setup High Availability on their PostgreSQL
> environments.
>
> What  tools do you use. Are they commercial licensed products? How is the
> architecture setup and how do you do recovery of new slave.
>
> Your information is greatly appreciated.
>
>
>
> Regards
>
> Dylan
>
>
>
> *Dylan Luong*
>
> *Information Strategy & Technology Services*
>
> University of South Australia
>
> A Building, Room E2-07, Mawson Lakes Campus
>
> MAWSON LAKES
>
> South Australia  5095
>
>
>
> Email:*dylan.lu...@unisa.edu.au *
>
> Phone:+61 8 83023629 <+61%208%208302%203629>
>
> Fax: +61 8 83023577 <+61%208%208302%203577>
>
> WWW:http://www.unisa.edu.au
>
>
>


Re: [GENERAL] Postgres HA

2017-02-23 Thread dinesh kumar
Hi Dylan,

On Thu, Feb 23, 2017 at 4:28 AM, Dylan Luong 
wrote:

> Hi
>
>
>
> I am a DBA at the University of South Australia. For PostgreSQL High
> Availability, we currently have setup a Master/Slave across two datacenters
> using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer)
> server that sits between the application servers and the PostgreSQL server
> that directs connections to the Master (and the Slave if failover occurs).
> We also have watchdog processes on the PostgreSQL servers that polls the
> LTM to determine who is Master and perform automatic failover if required.
> I am looking at options to improve our high availability.
>
> I would like to know how other organizations in different industries
> (other than education) setup High Availability on their PostgreSQL
> environments.
>

Below is the approach we have followed, to achieve the maximum HA with
async streaming replication.

1. Create an instance "I" with "N" number of nodes.

2. Set up the replication among the "N" nodes as "N-1" nodes points to 1
master.

3. Configured 2 physical replication slots to each "N-1" nodes.

4. One replication slot is to receive the archives, and one is for doing
the replication. (Suspected recovery process is slower than the receiver
process)

5. Configured a parallel WAL uploader (Customized program) on master to
wal-backup server. (We needed this for the PITR)

6. Implemented a quorum on "N-1" slaves as one should become as master in
worst cases. (Guaranteed data availability as per RTO settings)

7. Watchdog process which update the pgbouncer configuration from master to
the latest master.

8. Used consul service discovery for identifying the master, slave heart
beats.

9. Once Failover is completed, "N-1"(including old master) follows the new
master by doing a fresh refresh. (Planning to use pg_rewind)

10. Covered the split brain problems by removing the service discovery keys
from consul. (It's delivering the promising results, but need to spend more
time on this).

The above mentioned approach what we have done is similar to your's, but we
needed to handle with multiple slaves rather single one. In case, if you
are looking for any open source tools to implement in your production
servers, then prefer to add repmgr, pgHA, PAF tools into your list. These
open source tools are great and deliver the results as demonstrated.


> What  tools do you use. Are they commercial licensed products? How is the
> architecture setup and how do you do recovery of new slave.
>
> Your information is greatly appreciated.
>
>
>
> Regards
>
> Dylan
>
>
>
> *Dylan Luong*
>
> *Information Strategy & Technology Services*
>
> University of South Australia
>
> A Building, Room E2-07, Mawson Lakes Campus
>
> MAWSON LAKES
>
> South Australia  5095
>
>
>
> Email:*dylan.lu...@unisa.edu.au *
>
> Phone:+61 8 83023629 <+61%208%208302%203629>
>
> Fax: +61 8 83023577 <+61%208%208302%203577>
>
> WWW:http://www.unisa.edu.au
>
>
>



-- 

Regards,
Dinesh
manojadinesh.blogspot.com


Re: [GENERAL] Postgres HA

2017-02-23 Thread Jehan-Guillaume de Rorthais
On Wed, 22 Feb 2017 22:58:10 +
Dylan Luong  wrote:

> Hi
> 
> I am a DBA at the University of South Australia. For PostgreSQL High
> Availability, we currently have setup a Master/Slave across two datacenters
> using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer)
> server that sits between the application servers and the PostgreSQL server
> that directs connections to the Master (and the Slave if failover occurs). We
> also have watchdog processes on the PostgreSQL servers that polls the LTM to
> determine who is Master and perform automatic failover if required.

And how do you deal with split brain ? Fencing? Network partition? What if the
network fail on the master side for 5 minutes? Will the LTM go back to the old
master as soon as the watchdog pool it again?

> I am looking at options to improve our high availability. I would like to
> know how other organizations in different industries (other than education)
> setup High Availability on their PostgreSQL environments. What  tools do you
> use. Are they commercial licensed products? How is the architecture setup and
> how do you do recovery of new slave. Your information is greatly appreciated.

We use Pacemaker with the PAF[1] resource agent. Pacemaker takes great care to
avoid split brain using fencing. It mostly supports local cluster, but it
supports multi-site clusters as well thanks to a layer called "Cluster Ticket
Registry"[2].

HA is a complex subject, it requires some time to get familiar with it. Good
luck :)

[1] http://dalibo.github.io/PAF/
[2] 
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch15.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres HA

2017-02-22 Thread John R Pierce

On 2/22/2017 2:58 PM, Dylan Luong wrote:
For PostgreSQL High Availability, we currently have setup a 
Master/Slave across two datacenters using PostgreSQL (WAL) streaming 
replication.


have you considered the ramifications of network problems between these 
two datacenters?with a master/slave cluster, you need to avoid the 
'split  brain' scenario where both servers think they are master because 
they can't reach the other.





--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres HA

2017-02-22 Thread JD
Here is the option for HA i.e. EDB failover manager.

https://www.enterprisedb.com/docs/en/2.1/edbfm/EDB_Failover_Manager_Guide.1.02.html

On Thu, Feb 23, 2017 at 10:08 AM, Venkata B Nagothi 
wrote:

>
> On Thu, Feb 23, 2017 at 9:58 AM, Dylan Luong 
> wrote:
>
>> Hi
>>
>>
>>
>> I am a DBA at the University of South Australia. For PostgreSQL High
>> Availability, we currently have setup a Master/Slave across two datacenters
>> using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer)
>> server that sits between the application servers and the PostgreSQL server
>> that directs connections to the Master (and the Slave if failover occurs).
>> We also have watchdog processes on the PostgreSQL servers that polls the
>> LTM to determine who is Master and perform automatic failover if required.
>> I am looking at options to improve our high availability.
>>
>> I would like to know how other organizations in different industries
>> (other than education) setup High Availability on their PostgreSQL
>> environments.
>>
>> What  tools do you use. Are they commercial licensed products? How is the
>> architecture setup and how do you do recovery of new slave.
>>
>> Your information is greatly appreciated.
>>
>
> An efficient High Availability setup for PostgreSQL would depend on
> various factors like Application, Infrastructure and other Business
> Continuity requirements. In your case, you have already mentioned that the
> Load Balancer continuously polls to check the master status and fails over
> to slave when the former is down. If you are looking at improving this
> setup, then, it is important for us to know how the slave promotion is
> happening ? is that done by some tools like pgPool-II ? Well, those are the
> open-source tools available if you wish to automate the slave promotion
> when the master is down. If you are looking at an highly efficient High
> Availability setup would depend on how a) Application failover and b) slave
> promotion are going hand-in-hand. Following are some of the factors to
> consider which can help improve the efficiency in PostgreSQL High
> Availability -
>
> - Application requirements for continued / uninterrupted data operations
> on slave post the fail-over
> - How fast the slave gets promoted when master fails
> - You need to ensure Master and Slave are in absolute sync all the time
> (importantly just before fail-over)
>
> - Various other factors related to infrastructure like Network, database
> load etc.
>
> Hope that helps !
>
> Regards,
> Venkata B N
>
> Database Consultant
>


Re: [GENERAL] Postgres HA

2017-02-22 Thread Venkata B Nagothi
On Thu, Feb 23, 2017 at 9:58 AM, Dylan Luong 
wrote:

> Hi
>
>
>
> I am a DBA at the University of South Australia. For PostgreSQL High
> Availability, we currently have setup a Master/Slave across two datacenters
> using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer)
> server that sits between the application servers and the PostgreSQL server
> that directs connections to the Master (and the Slave if failover occurs).
> We also have watchdog processes on the PostgreSQL servers that polls the
> LTM to determine who is Master and perform automatic failover if required.
> I am looking at options to improve our high availability.
>
> I would like to know how other organizations in different industries
> (other than education) setup High Availability on their PostgreSQL
> environments.
>
> What  tools do you use. Are they commercial licensed products? How is the
> architecture setup and how do you do recovery of new slave.
>
> Your information is greatly appreciated.
>

An efficient High Availability setup for PostgreSQL would depend on various
factors like Application, Infrastructure and other Business Continuity
requirements. In your case, you have already mentioned that the Load
Balancer continuously polls to check the master status and fails over to
slave when the former is down. If you are looking at improving this setup,
then, it is important for us to know how the slave promotion is happening ?
is that done by some tools like pgPool-II ? Well, those are the open-source
tools available if you wish to automate the slave promotion when the master
is down. If you are looking at an highly efficient High Availability setup
would depend on how a) Application failover and b) slave promotion are
going hand-in-hand. Following are some of the factors to consider which can
help improve the efficiency in PostgreSQL High Availability -

- Application requirements for continued / uninterrupted data operations on
slave post the fail-over
- How fast the slave gets promoted when master fails
- You need to ensure Master and Slave are in absolute sync all the time
(importantly just before fail-over)

- Various other factors related to infrastructure like Network, database
load etc.

Hope that helps !

Regards,
Venkata B N

Database Consultant


Re: [GENERAL] Postgres HA

2017-02-22 Thread Andrew Sullivan
Hi,

On Wed, Feb 22, 2017 at 10:58:10PM +, Dylan Luong wrote:
> 
> I am looking at options to improve our high availability.

I suspect the central question you have to answer is, "What do you
mean by 'improve'?"

Do you want to increase the ability to retrieve data?  Decrease the
potential for data loss?  Shorten the recovery time to read/write
availability?  And so on.  The answers for these different questions
will determine which trade-off you need to make.

And rest assured, there is abolutely no solution in the world -- not
even a really expensive commercial one -- that requires no trades.
Distributing data reliably with ACID semantics and no data loss or
corruption or loss in write throughput is not possible, at least
today.  You have to pick which poison you want :)

A

-- 
Andrew Sullivan
a...@crankycanuck.ca


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgres HA

2017-02-22 Thread Dylan Luong
Hi

I am a DBA at the University of South Australia. For PostgreSQL High 
Availability, we currently have setup a Master/Slave across two datacenters 
using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer) 
server that sits between the application servers and the PostgreSQL server that 
directs connections to the Master (and the Slave if failover occurs). We also 
have watchdog processes on the PostgreSQL servers that polls the LTM to 
determine who is Master and perform automatic failover if required. I am 
looking at options to improve our high availability.
I would like to know how other organizations in different industries (other 
than education) setup High Availability on their PostgreSQL environments.
What  tools do you use. Are they commercial licensed products? How is the 
architecture setup and how do you do recovery of new slave.
Your information is greatly appreciated.

Regards
Dylan

Dylan Luong
Information Strategy & Technology Services
University of South Australia
A Building, Room E2-07, Mawson Lakes Campus
MAWSON LAKES
South Australia  5095

Email:dylan.lu...@unisa.edu.au
Phone:+61 8 83023629
Fax: +61 8 83023577
WWW:http://www.unisa.edu.au