[HACKERS] logical replication read-only slave

2017-06-15 Thread Maeldron T.
Hello,

I played around a bit with the logical replication in 10.0 beta 1.

My first question was: is it possible to set the "slave" server to run in
(almost) read-only mode?

The current setup is the following:

There is a Rails application running on multiple servers
Two PostgreSQL servers, stream replication
Writes are executed on the master
Some of the reads are executed on the slave
(Nothing new here)

However, this setup gives me a safety net: I cannot execute writes on the
slave by accident. Not only can I not do it, I would also receive a
notification from the software about the attempt, as it would throw an
exception.


Let’s say I switched to logical replication of all tables.
The safety net is gone.
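
For reference, the setup I have in mind is roughly this (a sketch; the
publication, subscription, and connection string names are placeholders):

CREATE PUBLICATION all_tables FOR ALL TABLES;          -- on the "master"

CREATE SUBSCRIPTION all_tables_sub
  CONNECTION 'host=master.example dbname=appdb user=replicator'
  PUBLICATION all_tables;                              -- on the "slave"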

I could send an explicit command for each session to make it read-only
I could use a read-only role (let’s ignore for now that I don’t use roles)
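
Either workaround might look roughly like this (a sketch; app_reader is a
hypothetical role):

SET default_transaction_read_only = on;                        -- per session
ALTER ROLE app_reader SET default_transaction_read_only = on;  -- per role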

But the main attribute of a safety net is safety. The moment there is a
bug, and a session does not send the "set session ..." command, or the
wrong role is used, the application can write to the "slave", and that’s
not great.

As far as I can see, the only solution that provides the same safety level
as streaming replication does would be starting up the "slave" in read-only
mode. In this case, writes would still be needed for:

* The replication
* DDL

The DDL could be applied in a specific session, as whitelisting is safer
than blacklisting. I think the only missing part is whether the subscription
could turn on writes for itself.
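
With the existing knobs, the whitelisting idea might be sketched like this,
assuming the subscription’s apply worker could exempt itself (which is
exactly the missing part; appdb is a placeholder):

ALTER DATABASE appdb SET default_transaction_read_only = on;

-- in the dedicated DDL session only:
SET default_transaction_read_only = off;
-- apply the DDL here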

If you think this would make sense, please consider it.

M


Re: [HACKERS] Google Cloud Compute + FreeBSD + PostgreSQL = timecounter issue

2016-11-23 Thread Maeldron T.

On 23.11.16 20:43, Maeldron T. wrote:



pg_test_timing doesn’t show the problem, or I read the output wrong.


Or it does. I was looking at a different output from the one I attached at
the end.






[HACKERS] Google Cloud Compute + FreeBSD + PostgreSQL = timecounter issue

2016-11-23 Thread Maeldron T.

In short:

The available timecounters on Google Compute Instances seem to vary from
instance to instance.

The timecounter set in the official FreeBSD image is wrong (it is not
available on any of my test instances), so FreeBSD picks a timecounter at
random.


When either the TSC or the TSC-low counter is used, explain analyze
behaves normally, but the system clock drifts by a few seconds every
minute. ntpd won’t (and shouldn’t) fix that. Daemons panic. Time travel
gets real.


When ACPI-fast is used, the system clock stays normal. However, an 
"explain analyze select count(1) from table" will run for 3ms 
instead of 300ms.
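
For reference, this is roughly how the timecounter can be inspected and
pinned on FreeBSD (a sketch; which value to pin depends on what the
instance actually offers):

sysctl kern.timecounter.choice      # list the available timecounters
sysctl kern.timecounter.hardware    # show the one currently in use
echo 'kern.timecounter.hardware=TSC-low' >> /etc/sysctl.conf  # pin across reboots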


pg_test_timing doesn’t show the problem, or I read the output wrong.


In long:

https://forums.freebsd.org/threads/58666/


Notes:

$ pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 6346.80 nsec
Histogram of timing durations:
< usec   % of total      count
     1      0.00000          0
     2      0.00000          0
     4      0.00000          0
     8     96.37600         42
    16      2.26939      10727
    32      0.62727       2965
    64      0.08801        416
   128      0.56634       2677
   256      0.04824        228
   512      0.01523         72
  1024      0.00508         24
  2048      0.00275         13
  4096      0.00085          4
  8192      0.00042          2
 16384      0.00000          0
 32768      0.00021          1
 65536      0.00000          0
131072      0.00021          1
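
For scale: on a sane timer, pg_test_timing typically reports tens of
nanoseconds per loop, so 6346.80 nsec is roughly two orders of magnitude of
overhead. A rough worked example (assuming EXPLAIN ANALYZE takes two
timestamps per row per plan node, which is what its instrumentation does):
counting 100,000 rows would add about 100,000 × 2 × 6.3 µs ≈ 1.3 seconds of
pure timing overhead.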





[HACKERS] PostgreSQL super HA (High Availability) conception for 9.5+

2015-11-18 Thread Maeldron T.
Hello,

Foreword:

Unfortunately, I have no time to read the mailing lists or to attend
PostgreSQL and NoSQL events. Some of the ideas came from MongoDB and
Cassandra. The inspiration was pg_rewind.

There is little new here; it’s a wish-list put together with an eye on what
could be possible in the foreseeable future. It’s likely that people have
worked on a similar or better concept. But let me try.


Reasons:

Downtime is bad. PostgreSQL failover requires manual intervention (client
configuration, host, or DNS editing). Third-party tools (in my experience)
don’t offer the same stability and quality as PostgreSQL itself. Also, this
concept wouldn’t work without pg_rewind.

Less software means fewer bugs.


Goals:

Providing close to 100% HA with minimal manual intervention. Minimizing
possible human errors during failover. Making startup founders sleep well
at night. Automatic client configuration. Avoiding split brain.


Extras:

Automatic streaming chain configuration.


No-goals:

Multi-master replication. Sharding. Proxying. Load balancing.


Why these:

It’s better to have a working technology now than a futuristic solution in
the future. For many applications, stability and HA are more important than
sharding or multi-master.


The concept:

You can set up a single-master PostgreSQL cluster with two or more nodes
that can fail over several times without manual reconfiguration. Restarting
the client isn’t needed if it’s smart enough to reconnect. Third-party
software isn’t needed. Proxying isn’t needed.


Cases:


Running the cluster:

The cluster is running. There is one master. Every other node is a
hot-standby slave.

The client-driver accepts several hostname(:port) values in the connection
parameters. They must belong to the same cluster. (The cluster’s name might
be provided too).

The rest of the options (username, database name) are the same and needed
only once. It’s not necessary to list every host. (Even listing one host is
enough, but not recommended.)

The client connects to one of the given hosts. If the node is running and
it’s a slave, it tells the client which host the master is. The client
connects to the master, even if the master was not listed in the connection
parameters.

It should be possible for the client to stay connected to the slave for
read-only queries if the application wants to do that.

If the node the client tried to connect to isn’t working, the client tries
another node, and so on.
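
A sketch of how the connection parameters could look (hypothetical syntax;
this is part of the wish-list, not a description of an existing feature):

postgresql://node1:5432,node2:5432,node3:5432/appdb?cluster_name=production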


Manually promoting a new master:

The administrator promotes any of the slaves. The slave tells the master to
gracefully stop. The master stops executing queries. It waits until the
slave (the new master) receives all the replication log. The new master is
promoted. The old master becomes a slave. (It might use pg_rewind).

The old master asks the connected clients to reconnect to the new master.
Then it drops the existing connections. It accepts new connections though
and tells them who the master is.


Manual step-down of the master:

The administrator kindly asks the master to stop being the master. The
cluster elects a new master. Then it’s the same as promoting a new master.


Manual shutdown of the master:

It’s the same as step-down, but the master won’t run as a slave until it’s
started up again.


Automatic failover:

The master stops responding for a given period. The majority of the cluster
elects a new master. Then the process is the same as manual promotion.

When the old master starts up, the cluster tells it that it is not a master
anymore. It does pg_rewind and acts as a slave.

Automatic failover can happen again without human intervention. The clients
are reconnected to the new master each time.
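
For reference, the pg_rewind step the old master would run might look
roughly like this (a sketch; the data directory and connection string are
placeholders):

pg_rewind --target-pgdata=/var/db/postgres/data \
  --source-server='host=new-master port=5432 user=postgres dbname=postgres'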


Automatic failover without majority:

It’s possible to tell in the config which server may act as a master when
there is no majority to vote.


Replication chain:

There are two cases. 1: All the slaves connect to the master. 2: One slave
connects to the master and the rest of the nodes replicate from this slave.


Configuration:

Every node should have a “recovery.conf” that is not renamed on promotion.

cluster_name: an identifier for the cluster. Why not.

hosts: list of the hosts. It is recommended but not required to include
every host in every file. It could work like the driver does, discovering
the rest of the cluster.

master_priority: integer. How likely this node is to become the new master
on failover (except manual promotion). A working cluster should not elect a
new master just because another node has a higher priority than the current
one. Election happens only for the reasons described above.

slave_priority: integer. If any running node has this value set larger than
0, a replication node is also elected, and the rest of the slaves replicate
from the elected slave. Otherwise, they replicate from the master.

primary_master: boolean. The node may run as master without being elected
by the majority. (This is not needed on manual promotion or shutdown. See
bookkeeping.)

safe: boolean. If this is 
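
Put together, a node’s configuration could read roughly like this
(hypothetical syntax from the wish-list above; the values are placeholders):

cluster_name = 'production'
hosts = 'node1:5432, node2:5432, node3:5432'
master_priority = 10
slave_priority = 0
primary_master = off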

Re: [HACKERS] Failback to old master

2015-01-24 Thread Maeldron T.
2014-11-13 9:05 GMT+01:00 Heikki Linnakangas hlinnakan...@vmware.com:


 Right. You have to be careful to make sure the standby really did fully
 catch up with the master, though. If it happens that the replication
 connection is momentarily down when you shut down the master, for example,
 then the master won't wait for the standby. You can use pg_controldata to
 verify that, before promoting the standby.


 - Heikki


Dear Heikki,

would you please tell me which line I should check to be 100% sure that
everything was sent to the slave when the master was shut down?

Latest checkpoint location:           1F/B842C3D8
Prior checkpoint location:            1F/B837B9B8
Latest checkpoint's REDO location:    1F/B841A050
Latest checkpoint's REDO WAL file:    000000010000001F000000B8
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/15845855
Latest checkpoint's NextOID:          450146
Latest checkpoint's NextMultiXactId:  2250
Latest checkpoint's NextMultiOffset:  4803
Latest checkpoint's oldestXID:        984
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  15845855
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1

Is it the first line (Latest checkpoint location), or do I have to check
more lines, or different ones?
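
For reference, the cross-check I have in mind is roughly this (a sketch; I
am not certain these are the right fields and functions to compare):

On the old master, after the clean shutdown:
pg_controldata $PGDATA | grep -E 'cluster state|Latest checkpoint location'

On the slave, before promoting it:
SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();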

I plan to do this on the weekend.

Thank you.

M.


Re: [HACKERS] Failback to old master

2014-11-16 Thread Maeldron T.

On 16/11/14 13:13, didier wrote:

I think you have to add
recovery_target_timeline = '2'
in recovery.conf
with '2' being the new primary timeline.
cf http://www.postgresql.org/docs/9.4/static/recovery-target-settings.html



Thank you.

Based on the link I have added:
recovery_target_timeline = 'latest'

And now it works.
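
For completeness, the old master’s recovery.conf in my test now reads
roughly like this (a sketch; host and port match my local two-instance
setup):

standby_mode = 'on'
primary_conninfo = 'host=localhost port=5434'
recovery_target_timeline = 'latest'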






Re: [HACKERS] Failback to old master

2014-11-15 Thread Maeldron T.

On 12/11/14 14:28, Ants Aasma wrote:

On Tue, Nov 11, 2014 at 11:52 PM, Maeldron T. maeld...@gmail.com wrote:

As far as I remember (I can’t test it right now but I am 99% sure) promoting 
the slave makes it impossible to connect the old master to the new one without 
making a base_backup. The reason is the timeline change. It complains.

A safely shut down master (-m fast is safe) can be safely restarted as
a slave to the newly promoted master. Fast shutdown shuts down all
normal connections, does a shutdown checkpoint and then waits for this
checkpoint to be replicated to all active streaming clients. Promoting
slave to master creates a timeline switch, that prior to version 9.3
was only possible to replicate using the archive mechanism. As of
version 9.3 you don't need to configure archiving to follow timeline
switches, just add a recovery.conf to the old master to start it up as
a slave and it will fetch everything it needs from the new master.

I took your advice and I understood that removing the recovery.conf 
followed by a restart is wrong. I will not do that on my production servers.


However, I can't make it work with promotion. What did I do wrong? It was
9.4beta3.


mkdir 1
mkdir 2
initdb -D 1/
edit config: change port, wal_level to hot_standby, hot_standby to on, 
max_wal_senders=7, wal_keep_segments=100, uncomment replication in hba.conf

pg_ctl -D 1/ start
createdb -p 5433
psql -p 5433
pg_basebackup -p 5433 -R -D 2/
mcedit 2/postgresql.conf change port
chmod -R 700 1
chmod -R 700 2
pg_ctl -D 2/ start
psql -p 5433
psql -p 5434
everything works
pg_ctl -D 1/ stop
pg_ctl -D 2/ promote
psql -p 5434
cp 2/recovery.done 1/recovery.conf
mcedit 1/recovery.conf change port
pg_ctl -D 1/ start

LOG:  replication terminated by primary server
DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.
LOG:  restarted WAL streaming at 0/300 on timeline 1
LOG:  replication terminated by primary server
DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.

This is what I experienced in the past when I tried with promote. The old
master disconnects from the new one. What am I missing?







Re: [HACKERS] Failback to old master

2014-11-11 Thread Maeldron T.

Hi,

2014-10-29 17:46 GMT+01:00 Robert Haas robertmh...@gmail.com:



   Yes, but after the restart, the slave will also rewind to the most
   recent restart-point to begin replay, and some of the sanity checks
   that recovery.conf enforces will be lost during that replay.  A safe
   way to do this might be to shut down the master, make a note of the
   ending WAL position on the master, and then promote the slave (without
   shutting it down) once it's reached that point in replay.


As far as I remember (I can’t test it right now but I am 99% sure) 
promoting the slave makes it impossible to connect the old master to the 
new one without making a base_backup. The reason is the timeline change. 
It complains.


The only way to do this is:
1. Stop the master
2. Restart the slave without a recovery.conf
3. Restart the old master with a recovery.conf.

I have done this a couple of times back and forth and it worked. I mean it
didn't complain.



 I also thought that if there was a crash on the original master and it
 applied WAL entries on itself that are not present on the slave then it
 will throw an error when I try to connect it to the new master (to the old
 slave).

   I don't think you're going to be that lucky.

 It would be nice to know as creating a base_backup takes much time.

   rsync can speed things up by copying only changed data, but yes,
   it's a problem.


Actually I am more afraid of rsyncing database data files between the
nodes than of trusting the postgresql error log. There is no technical
reason for that; it's more psychological.


Is it possible that the new master has unreplicated changes and won't 
notice that when connecting to the old slave? I thought that WAL records
might have unique identifiers but I don't know the details.






[HACKERS] Failback to old master

2014-10-29 Thread Maeldron T.
Hello,

I swear I have read a couple of old threads. Yet I am not sure if it is
safe to fail back to the old master in case of async replication without a
base backup.

Considering:
I have the latest 9.3 server
A: master
B: slave
B is actively connected to A

I shut down A manually with -m fast (it's the default FreeBSD init script
setting)
I remove the recovery.conf from B
I restart B
I create a recovery.conf on A
I start A
I see nothing wrong in the logs
I go for a lunch
I shut down B
I remove the recovery.conf on A
I restart A
I restore the recovery.conf on B
I start B
I see nothing wrong in the logs and I see that replication is working

Can I say that my data is safe in this case?

If the answer is yes, is it safe to do this if there was a power outage on
A instead of a manual shutdown, considering that the log says nothing is
wrong? (Of course, if it complains, I'd do a base_backup from B.)

Thank you,

M.


Re: [HACKERS] Failback to old master

2014-10-29 Thread Maeldron T.
Thank you, Robert.

I thought that removing the recovery.conf file makes the slave a master
only after the slave is restarted. (Unlike creating a trigger_file.) Isn't
that true?

I also thought that if there was a crash on the original master and it
applied WAL entries on itself that are not present on the slave, then it
will throw an error when I try to connect it to the new master (to the old
slave).

It would be nice to know, as creating a base_backup takes a lot of time.

As for the other case, when there was no crash, safely swapping the master
and the slave twice without creating base_backups makes upgrading the OS
much easier (with only a couple of seconds of downtime).

I am afraid to try it in production until someone confirms that it's safe.
It seems to work though (but I don't like to bet).

M.

2014-10-29 15:41 GMT+01:00 Robert Haas robertmh...@gmail.com:

 On Wed, Oct 29, 2014 at 6:21 AM, Maeldron T. maeld...@gmail.com wrote:
  I swear I have read a couple of old threads. Yet I am not sure if it
 safe to
  failback to the old master in case of async replication without base
 backup.
 
  Considering:
  I have the latest 9.3 server
  A: master
  B: slave
  B is actively connected to A
 
  I shut down A manually with -m fast (it's the default FreeBSD init script
  setting)
  I remove the recovery.conf from B
  I restart B
  I create a recovery.conf on A
  I start A
  I see nothing wrong in the logs
  I go for a lunch
  I shut down B
  I remove the recovery.conf on A
  I restart A
  I restore the recovery.conf on B
  I start B
  I see nothing wrong in the logs and I see that replication is working
 
  Can I say that my data is safe in this case?
 
  If the answer is yes, is it safe to do this if there was a power outage
 on A
  instead of manual shutdown? Considering that the log says nothing wrong.
 (Of
  course if it complains I'd do base backup from B).

 The threshold question here is whether the original master might have
 written (and thus, perhaps, applied) write-ahead log records that were
 not replayed on the slave.  If A crashed, that is definitely possible,
 so this is definitely not safe.  If A was shut down cleanly, then
 streaming replication *should* take everything up through the shutdown
 checkpoint and replicate those to the standby, which *should* replay
 them.  If all goes according to plan, I think this will work.

 I'm not sure we really have enough safeties to make this robust,
 though: for example, at the point when the shutdown checkpoint is
 written, I believe that the master is no longer accepting new
 connections - so if the connection to the slave is broken before the
 shutdown checkpoint record is replicated, then it's not safe any more,
 but how will we detect that?  And, if you remove recovery.conf on the
 slave, it will abort replay and enter normal running as soon as it
 reaches what it thinks is end-of-WAL, with no cross-check to make sure
 that's really the same point that the master was actually at.  So
 it strikes me that it might be quite difficult to really have
 confidence that nothing will go wrong.

 I'm definitely not the expert in this area on this mailing list, so
 I'm curious what others think.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company