[GENERAL] Replication stops under certain circumstances

2017-10-20 Thread Kim Rose Carlsen
Hi

I have some strange issues with a PostgreSQL read replica that seems to stop
replicating under certain circumstances.

Whenever we change our views, we have a script that drops all views and
reloads them from scratch with the new definitions. The reloading of the
views happens in a transaction to avoid confusion for everyone using the
database. When this update reaches the slave, there seems to be a chance for
a deadlock to occur that doesn't get detected.

As I was trying to reproduce this behavior, I ran into another weird
situation that I don't entirely understand. The symptom is the same, that
replication stops, but it looks quite different. This example won't
reproduce the issue reliably, but after a few hours I get a slave that won't
continue to replicate until I restart it. The queries in the example won't
make much sense, and I don't know whether they can be simplified further and
still cause the "desired" effect.

Setup:
Launch a new RDS PostgreSQL instance (9.6.2) on AWS (referred to below as
db-master) and create a read replica (referred to as db-slave). The
following options are changed from the AWS defaults:
  
max_standby_streaming_delay=-1
hot_standby_feedback=1


On the master create 2 dummy tables:
create table a  (id serial primary key);
create table b  (id serial primary key);

Setup thread 1 to do work on master:

while true; do psql -h db-master -U postgres db -c 'begin; drop view if exists 
view_a cascade; drop view if exists view_b; drop view if exists view_c; create 
view view_a as select * from a; create view view_b as select * from b; create 
view view_c as select * from view_a join view_b using (id); insert into a 
values (default); insert into b values (default); commit;'; done

Setup thread 2 to do work on Slave:
while true; do psql -h  db-slave -U postgres db -c 'begin; select * from view_c 
order by random() limit 10; select * from view_a order by random() limit 10;'; 
done

Setup thread 3 to do more work on slave:
while true; do psql -h  db-slave -U postgres db -c 'begin; select * from view_b 
order by random() limit 10; select * from view_a order by random() limit 10;'; 
done

Every now and then a deadlock is detected and one connection is aborted;
this works as expected. But after a while (several hours) it becomes
impossible to connect to the db database on db-slave, and threads 2 and 3
stop producing output. When trying to connect, the psql client just hangs.
However, it is possible to connect to the template1 database to get a look
at what is going on.



template1=> select * from pg_stat_activity;
-[ RECORD 1 ]----+------------------------------
datid            | 16384
datname          | rdsadmin
pid              | 7891
usesysid         | 10
usename          | rdsadmin
application_name | 
client_addr      | 
client_hostname  | 
client_port      | 
backend_start    | 
xact_start       | 
query_start      | 
state_change     | 
wait_event_type  | 
wait_event       | 
state            | 
backend_xid      | 
backend_xmin     | 
query            | 
-[ RECORD 2 ]----+------------------------------
datid            | 1
datname          | template1
pid              | 11949
usesysid         | 16388
usename          | hiper
application_name | psql
client_addr      | 192.168.10.166
client_hostname  | 
client_port      | 41002
backend_start    | 2017-10-20 16:30:26.032745+02
xact_start       | 2017-10-20 16:30:34.306418+02
query_start      | 2017-10-20 16:30:34.306418+02
state_change     | 2017-10-20 16:30:34.306421+02
wait_event_type  | 
wait_event       | 
state            | active
backend_xid      | 
backend_xmin     | 26891
query            | select * from pg_stat_activity;





There are no active connections except rdsadmin from AWS.



template1=> select * from pg_locks;
  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------
 virtualxid |          |          |      |       | 3/929      |               |         |       |          | 3/929              |  9640 | ExclusiveLock   | t       | t
 relation   |    16390 |     2659 |      |       |            |               |         |       |          | 4/829              |  9639 | AccessShareLock | t       | t
 relation   |    16390 |     1249 |      |       |            |               |         |       |          | 4/829              |  9639 | AccessShareLock | t       | t
 virtualxid |          |          |      |       | 4/829      |               |         |       |          | 4/829              |  9639 | ExclusiveLock   | t       | t
 relation   |        1 |    11695 |      |       |            |               |         |       |          | 5/148              | 11949 | AccessShareLock | t       |
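One way to see who actually holds those locks when pg_stat_activity shows
nothing is a join like the following (a diagnostic sketch; on a standby the
unlisted pids typically belong to the startup process applying WAL):

select l.pid, l.locktype, l.relation, l.mode, l.granted
from pg_locks l
left join pg_stat_activity a on a.pid = l.pid
where a.pid is null;  -- lock holders with no visible backend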

Re: [GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Michael Paquier
On Tue, Jun 6, 2017 at 1:52 PM, Bhattacharyya, Subhro wrote:
> Our expectation is that the slave will be able to sync with the new master
> with the help of whatever WALs are present on the new master due to
> replication slots.
> Can pg_rewind still work without WAL archiving in this scenario?

I see. Yes, the slot on the old primary would keep retaining WAL, and
the promoted standby would stop sending feedback once it has switched
to a new timeline, so that should work. Don't forget to drop the slot
on the old primary after pg_rewind has been run; you don't want to
bloat its pg_xlog with useless data.
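
For completeness, a sketch of that cleanup (the slot name here is
hypothetical):

select pg_drop_replication_slot('standby_slot');
-- confirm nothing is still pinning pg_xlog:
select slot_name, active, restart_lsn from pg_replication_slots;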
-- 
Michael




Re: [GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Bhattacharyya, Subhro
Our cluster works as follows:

We do not promote the slave while the primary is up.

During an update scenario, when the master goes down, the slave is promoted
to master only if there is no replication lag.

As a result, we do not have any data difference so far.

Transactions now continue on the newly promoted master, thus creating a
difference in data between the two nodes.

When the original master comes back as a slave after the update, instead of
taking a pg_basebackup, we use pg_rewind.

Our expectation is that the slave will be able to sync with the new master
with the help of whatever WALs are present on the new master due to
replication slots.

Can pg_rewind still work without WAL archiving in this scenario?

Thanks, Subhro

-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Tuesday, June 6, 2017 8:50 AM
To: Bhattacharyya, Subhro <s.bhattachar...@sap.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Replication slot and pg_rewind

On Tue, Jun 6, 2017 at 12:03 PM, Bhattacharyya, Subhro
<s.bhattachar...@sap.com> wrote:
> We are using the replication slot and pg_rewind feature of postgresql 9.6
> Our cluster consists of 1 master and 1 slave node.
>
> The replication slot feature allows the master to keep as much WAL as is
> required by the slave.
>
> The pg_rewind command uses WALs to bring the slave in sync with the master.
> By using replication slots there is always enough WAL in pg_xlog.
>
> In this case is it safe to use pg_rewind without WAL archiving?
> Can there be a situation where pg_rewind fails?

When pg_rewind runs it looks at the WAL from the last checkpoint
before WAL diverged on the *target* node, not the source. So retaining
the WAL data on the primary after the standby has been promoted makes
little sense from this point of view. Even worse, once the promoted
standby decides to recycle the past WAL segments you won't be able to
do a rewind of the previous primary, because there is no way to know
which blocks were modified on the standby since the point of
divergence.
-- 
Michael



Re: [GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Michael Paquier
On Tue, Jun 6, 2017 at 12:03 PM, Bhattacharyya, Subhro wrote:
> We are using the replication slot and pg_rewind feature of postgresql 9.6
> Our cluster consists of 1 master and 1 slave node.
>
> The replication slot feature allows the master to keep as much WAL as is
> required by the slave.
>
> The pg_rewind command uses WALs to bring the slave in sync with the master.
> By using replication slots there is always enough WAL in pg_xlog.
>
> In this case is it safe to use pg_rewind without WAL archiving?
> Can there be a situation where pg_rewind fails?

When pg_rewind runs it looks at the WAL from the last checkpoint
before WAL diverged on the *target* node, not the source. So retaining
the WAL data on the primary after the standby has been promoted makes
little sense from this point of view. Even worse, once the promoted
standby decides to recycle the past WAL segments you won't be able to
do a rewind of the previous primary, because there is no way to know
which blocks were modified on the standby since the point of
divergence.
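
For reference, the rewind invocation itself looks roughly like this (a
hedged sketch with 9.6-era flags; the data directory path and connection
string are hypothetical). It runs on the stopped former primary, pointing
at the promoted standby:

pg_rewind --target-pgdata=/var/lib/postgresql/9.6/main \
          --source-server='host=new-master port=5432 user=postgres dbname=postgres' \
          --progress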
-- 
Michael




[GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Bhattacharyya, Subhro
We are using the replication slot and pg_rewind feature of postgresql 9.6

Our cluster consists of 1 master and 1 slave node.

The replication slot feature allows the master to keep as much WAL as is 
required by the slave.

The pg_rewind command uses WALs to bring the slave in sync with the master.
By using replication slots there is always enough WAL in pg_xlog.

In this case is it safe to use pg_rewind without WAL archiving?
Can there be a situation where pg_rewind fails?

Regards, Subhro


Re: [GENERAL] replication slot to be used in the future

2017-01-04 Thread Tom DalPozzo
2017-01-04 18:24 GMT+01:00 Adrian Klaver :

> On 01/04/2017 08:44 AM, Tom DalPozzo wrote:
>
>> Hi,
>>
>
> Postgres version?
>
> Because in 9.6:
>
> https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-REPLICATION
>
> Table 9-82. Replication SQL Functions
>
> pg_create_physical_replication_slot(slot_name name [, immediately_reserve
> boolean ])
>
> Creates a new physical replication slot named slot_name. The optional
> second parameter, when true, specifies that the LSN for this replication
> slot be reserved immediately; otherwise the LSN is reserved on first
> connection from a streaming replication client. Streaming changes from a
> physical slot is only possible with the streaming-replication protocol —
> see Section 51.3. This function corresponds to the replication protocol
> command CREATE_REPLICATION_SLOT ... PHYSICAL.
>
>
>>
Hi, it is 9.5.5!
Thank you very much!

Pupillo


Re: [GENERAL] replication slot to be used in the future

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:44 AM, Tom DalPozzo wrote:

Hi,


Postgres version?

Because in 9.6:

https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-REPLICATION

Table 9-82. Replication SQL Functions

pg_create_physical_replication_slot(slot_name name [, 
immediately_reserve boolean ])


Creates a new physical replication slot named slot_name. The optional 
second parameter, when true, specifies that the LSN for this replication 
slot be reserved immediately; otherwise the LSN is reserved on first 
connection from a streaming replication client. Streaming changes from a 
physical slot is only possible with the streaming-replication protocol — 
see Section 51.3. This function corresponds to the replication protocol 
command CREATE_REPLICATION_SLOT ... PHYSICAL.
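
A minimal sketch of what that second parameter changes (the slot name is
hypothetical; the two-argument form requires 9.6):

-- 9.6: reserve WAL immediately, before any standby has ever connected
select pg_create_physical_replication_slot('standby1_slot', true);
-- 9.5: no second argument; WAL is only retained once a streaming
-- client has connected to the slot at least once
select pg_create_physical_replication_slot('standby1_slot');
select slot_name, restart_lsn from pg_replication_slots;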



I've got my primary and I take a pg_basebackup -x in order to create a
standby.
I can only connect my standby some hours later, so I'd like the master
to keep the new WALs, but I don't want to use archiving or the
wal_keep_segments option. I thought to do it through a physical replication
slot (my standby will have its own replication slot name).
So I created a physical replication slot, but I see that the master, which
has never seen my standby connect to it, doesn't keep WALs.

Any idea?

Regards
Pupillo







--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] replication slot to be used in the future

2017-01-04 Thread Tom DalPozzo
Hi,
I've got my primary and I take a pg_basebackup -x in order to create a
standby.
I can only connect my standby some hours later, so I'd like the master
to keep the new WALs, but I don't want to use archiving or the
wal_keep_segments option. I thought to do it through a physical replication
slot (my standby will have its own replication slot name).
So I created a physical replication slot, but I see that the master, which
has never seen my standby connect to it, doesn't keep WALs.

Any idea?

Regards
Pupillo


[GENERAL] Re: [GENERAL] [GENERAL] Replication between différent versions of the same OS.

2016-12-01 Thread Benoit Lobréau
Thanks a lot. That's what I was looking for ;)

Yes, I was trying to avoid logical replication. I guess it's time for me to
delve into it...


Re: [GENERAL] [GENERAL] Replication between différent versions of the same OS.

2016-12-01 Thread Jehan-Guillaume de Rorthais
On Thu, 1 Dec 2016 20:11:06 +0100
Benoit Lobréau  wrote:

> Hi,
> 
> Is it possible to use the built-in replication to replicate between two
> PostgreSQL servers of the same version but on different versions of the
> same OS (say Pg 9.1 on Ubuntu 12 to Pg 9.1 on Ubuntu 14)?
> 
> I think I read on Hackers that, since PostgreSQL uses the OS libraries for
> encoding, it could cause silent corruption because the encoding might be
> different between versions of the OS. But I can't find the email again so I
> can't find the exact context ... maybe I dreamed it ..

No, you did not. See:
https://www.postgresql.org/message-id/flat/BA6132ED-1F6B-4A0B-AC22-81278F5AB81E%40tripadvisor.com#ba6132ed-1f6b-4a0b-ac22-81278f5ab...@tripadvisor.com

> We would like to replicate to the other server and then upgrade to a newer
> version (the import takes too long).

Use logical replication. You could use Slony for example.

Regards,




[GENERAL] [GENERAL] Replication between différent versions of the same OS.

2016-12-01 Thread Benoit Lobréau
Hi,

Is it possible to use the built-in replication to replicate between two
PostgreSQL servers of the same version but on different versions of the
same OS (say Pg 9.1 on Ubuntu 12 to Pg 9.1 on Ubuntu 14)?

I think I read on Hackers that, since PostgreSQL uses the OS libraries for
encoding, it could cause silent corruption because the encoding might be
different between versions of the OS. But I can't find the email again so I
can't find the exact context ... maybe I dreamed it ..

We would like to replicate to the other server and then upgrade to a newer
version (the import takes too long).


Thanks
Benoit


[GENERAL] Replication of a database or schemas from a database

2016-11-22 Thread Johann Spies
We would like to have a master (read/write) version of a database (or a
schema or two) on one server and a read-only version of the same database.
The only changes on the second one would be to duplicate changes to views,
materialized views and indexes that also happened on the first one.

We work with different versions of data, of which the content version in
production will not change except for the changes described in the previous
paragraph.

Almost all the replication/load-sharing solutions I have read about work at
the cluster/server level.

I have seen one person referring to Slony or Londiste for a situation like
this, but also suggesting that it might be an in-house option from PG 9.5
onwards.

Do I have to use something like Slony or Londiste, or can it be done with a
standard 9.6 installation?

We would like read queries to be run on both servers in a distributed way
if possible.

Recommendations for a solution would be welcomed.

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [GENERAL] replication setup: advice needed

2016-11-04 Thread Dmitry Karasik
> You need to look at a replication solution like Slony, which is a trigger
> based replication solution. If you are using PostgreSQL version 9.4 or
> higher, then, you can explore "pglogical" which is WAL based and uses
> logical decoding capability.

I'm using 9.4, and I'm looking at pglogical as well -- thank you! I'll
try to experiment to see how well it fits my needs.


-- 
Sincerely,
Dmitry Karasik





Re: [GENERAL] replication setup: advice needed

2016-11-04 Thread Venkata B Nagothi
On Thu, Nov 3, 2016 at 8:17 PM, Dmitry Karasik wrote:

> Dear all,
>
> I'd like to ask for help or advice with choosing the best replication
> setup for
> my task.
>
> I need to listen to continuous inserts/deletes/updates over a set of
> tables,
> and serve them over http, so I would like to off-load this procedure to a
> separate slave machine.  I thought that logical master-slave replication
> could
> be the best match here, but I couldn't find enough details in the
> documentation
> which implementation would match my needs best.
>

Which version of PostgreSQL are you using ?


>
> Basically, I need to:
>
> a) replicate selected tables to a hot standby slave
> b) on the slave, listen for the insert/update/delete events (either
> through triggers or logical decoder plugin)
>
> While I see that a) should be feasible, I can't see if it's possible to do
> b) at all.
> Also, with so many replication solutions, I don't want to test them all
> one by one, but would rather ask for help choosing the one that fits best
> here -- and if there's none, an alternative setup.
>

You need to look at a replication solution like Slony, which is a
trigger-based replication solution. If you are using PostgreSQL version 9.4
or higher, then you can explore "pglogical", which is WAL-based and uses
the logical decoding capability.

If you are just looking at replicating specific tables, then either of the
above solutions would work fine.
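
For reference, a minimal pglogical sketch, assuming a single table
public.my_table is to be replicated (all node names, table names and DSNs
here are hypothetical):

-- on the provider (master):
select pglogical.create_node(node_name := 'provider', dsn := 'host=master dbname=mydb');
select pglogical.replication_set_add_table(set_name := 'default', relation := 'public.my_table');
-- on the subscriber (slave):
select pglogical.create_node(node_name := 'subscriber', dsn := 'host=slave dbname=mydb');
select pglogical.create_subscription(subscription_name := 'my_sub', provider_dsn := 'host=master dbname=mydb');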

Regards,

Venkata B N
Database Consultant

Fujitsu Australia


[GENERAL] replication setup: advice needed

2016-11-03 Thread Dmitry Karasik
Dear all,

I'd like to ask for help or advice with choosing the best replication setup for
my task.

I need to listen to continuous inserts/deletes/updates over a set of tables,
and serve them over http, so I would like to off-load this procedure to a
separate slave machine.  I thought that logical master-slave replication could
be the best match here, but I couldn't find enough details in the documentation
which implementation would match my needs best.

Basically, I need to:

a) replicate selected tables to a hot standby slave
b) on the slave, listen for the insert/update/delete events (either through 
triggers or logical decoder plugin)

While I see that a) should be feasible, I can't see if it's possible to do
b) at all.
Also, with so many replication solutions, I don't want to test them all one
by one, but would rather ask for help choosing the one that fits best here
-- and if there's none, an alternative setup.

-- 
Sincerely,
Dmitry Karasik





Re: [GENERAL] Replication (BDR) problem: won't catch up after connection timeout

2016-11-02 Thread Craig Ringer
See also https://github.com/2ndQuadrant/bdr/issues/233




Re: [GENERAL] Replication (BDR) problem: won't catch up after connection timeout

2016-11-02 Thread Craig Ringer
Increase wal_sender_timeout to resolve the issue.

I've been investigating just this issue recently. See
https://www.postgresql.org/message-id/camsr+ye2dsfhvr7iev1gspzihitwx-pmkd9qalegctya+sd...@mail.gmail.com
.

It would be very useful to me to know more about the transaction that
caused this problem.
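
A sketch of the suggested change (the value is illustrative; on 9.4 this
takes effect on a configuration reload, no restart needed):

ALTER SYSTEM SET wal_sender_timeout = '5min';
SELECT pg_reload_conf();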




[GENERAL] Replication (BDR) problem: won't catch up after connection timeout

2016-11-02 Thread Suomela Tero
Hi there,

We have some problems with BDR and would appreciate any hints and advice.
Here's the short story:

We are testing BDR with PostgreSQL 9.4 and it seems to work quite OK after
getting it up and running, but we also ran into a rather disturbing
weakness. A basic two-node cluster breaks simply by running a transaction
that takes a bit long to process. Here's an example:

On node1 the application is processing some data, which causes around 100k
select & insert statements within one transaction. This takes some time and
the replication says "timeout" (presumably there is a keep-alive mechanism
which simply doesn't work while the transaction is processed). After the
transaction is committed on node1, the log on node1 says:

< 2016-11-02 13:06:29.117 EET >LOG:  terminating walsender process due to 
replication timeout
< 2016-11-02 13:06:34.168 EET >LOG:  starting logical decoding for slot 
"bdr_64344_6300833630798326204_1_77037__"
< 2016-11-02 13:06:34.168 EET >DETAIL:  streaming transactions committing after 
0/117FCE38, reading WAL from 0/117FCE38
< 2016-11-02 13:06:34.172 EET >LOG:  logical decoding found consistent point at 
0/117FCE38
< 2016-11-02 13:06:34.172 EET >DETAIL:  There are no running transactions.
< 2016-11-02 13:09:09.196 EET >ERROR:  data stream ended
< 2016-11-02 13:09:09.206 EET >LOG:  worker process: bdr 
(6300843528307178977,1,64344,)->bdr (6300833630798326204,1, (PID 28195) exited 
with exit code 1
< 2016-11-02 13:09:14.209 EET >LOG:  starting background worker process "bdr 
(6300843528307178977,1,64344,)->bdr (6300833630798326204,1,"
< 2016-11-02 13:09:14.217 EET >NOTICE:  version "1.0" of extension "btree_gist" 
is already installed
< 2016-11-02 13:09:14.219 EET >NOTICE:  version "1.0.1.0" of extension "bdr" is 
already installed
< 2016-11-02 13:09:14.241 EET >INFO:  starting up replication from 5 at 
0/D038EC8

Checking the BDR status:

< 2016-11-02 13:09:29.632 EET >LOG:  statement: SELECT node_name, node_status 
FROM bdr.bdr_nodes;
< 2016-11-02 13:09:29.633 EET >LOG:  statement: SELECT conn_sysid, conn_dboid, 
conn_dsn FROM bdr.bdr_connections;
< 2016-11-02 13:09:29.633 EET >LOG:  statement: SELECT slot_name, database, 
active, pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) 
AS retained_bytes FROM pg_replication_slots WHERE plugin = 'bdr';
< 2016-11-02 13:09:29.635 EET >LOG:  statement: SELECT pid, application_name, 
client_addr, state, pg_xlog_location_diff(pg_current_xlog_insert_location(), 
flush_location) AS lag_bytes FROM pg_stat_replication;

Result:

 node_name | node_status
-----------+-------------
 node1     | r
 node2     | r
(2 rows)

     conn_sysid      | conn_dboid |                              conn_dsn
---------------------+------------+--------------------------------------------------------------------
 6300843528307178977 |      64344 | host=192.168.150.11 port=5432 dbname=test user=test password=test
 6300833630798326204 |      77037 | host=192.168.150.12 port=5432 dbname=test user=test password=test
(2 rows)

                slot_name                | database | active | retained_bytes
-----------------------------------------+----------+--------+----------------
 bdr_64344_6300833630798326204_1_77037__ | test     | t      |       64939984
(1 row)

  pid  |              application_name               |  client_addr   |  state  | lag_bytes
-------+---------------------------------------------+----------------+---------+-----------
 28825 | bdr (6300833630798326204,1,77037,):receive  | 192.168.150.12 | catchup |  64939984
(1 row)

The node2 state is 'catchup' with lots to catch up, but nothing happens; it
stays like this even though the connection looks OK. So the data is not
replicated anymore.

Then if we restart node2, the node1 log starts saying:

< 2016-11-02 13:11:06.656 EET >ERROR:  replication slot 
"bdr_64344_6300833630798326204_1_77037__" is already active for pid 28517

Then if we restart both nodes (this requires kill -9 for the WAL sender
process, otherwise it won't stop), the node1 log shows:

< 2016-11-02 13:17:01.318 EET >LOG:  shutting down
< 2016-11-02 13:17:01.343 EET >LOG:  database system is shut down
< 2016-11-02 13:18:03.288 EET >LOG:  server process (PID 28682) was terminated 
by signal 9: Killed
< 2016-11-02 13:18:03.288 EET >LOG:  terminating any other active server 
processes
< 2016-11-02 13:18:03.288 EET >LOG:  abnormal database system shutdown
< 2016-11-02 13:18:25.067 EET >LOG:  database system was shut down at 
2016-11-02 13:17:01 EET
< 2016-11-02 13:18:25.087 EET >LOG:  starting up replication identifier with 
ckpt at 0/155EB520
< 2016-11-02 13:18:25.087 EET >LOG:  recovered replication state of node 1 to 
0/1BC8620
< 2016-11-02 13:18:25.087 EET >LOG:  recovered replication state of node 2 to 
0/1E932C8
< 2016-11-02 13:18:25.087 EET >LOG:  recovered replication state of node 3 to 
0/252FBB8
< 2016-11-02 13:18:25.087 EET >LOG:  recovered replication state of node 4 to 
0/294BA20
< 2016-11-02 

Re: [GENERAL] Replication rolling back to normal.

2016-10-23 Thread Michael Paquier
On Mon, Oct 24, 2016 at 2:20 PM, Dasitha Karunajeewa wrote:
> Hi Michael,
>
> Thanks a lot for the information. I am totally new to Postgres clustering.
> I have followed the article linked below.
> https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-on-postgresql-on-an-ubuntu-12-04-vps
>
> Can you guide me on how to do it as you mentioned :)

Documentation is your best friend:
https://www.postgresql.org/docs/9.6/static/high-availability.html
-- 
Michael




Re: [GENERAL] Replication rolling back to normal.

2016-10-22 Thread Michael Paquier
On Fri, Oct 21, 2016 at 10:10 PM, Dasitha Karunajeewa wrote:
> What I want to know is how to switch them back to the normal status. That
> means pgmaster needs to be the master server which accepts the writes, and
> pgslave accepts only reads along with the replication.

If your promoted standby got ahead of the former master, you could use
the promoted standby as a new master, and replug the former master as
a standby. The latter step can be done by either running pg_rewind on
the former master or taking a new base backup from the promoted
standby and using that to set up a new standby. pg_rewind will only
succeed if the master has kept WAL segments from the last checkpoint
record where WAL forked when the standby was promoted.
-- 
Michael




[GENERAL] Replication rolling back to normal.

2016-10-21 Thread Dasitha Karunajeewa
Dear Team,

I have installed PostgreSQL 9.6 on two servers. One is the master and the
other is the slave server. The current setup is as follows.


   - Master Server - pgmaster
   - Slave Server - pgslave

To implement this I have followed this article:
https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-on-postgresql-on-an-ubuntu-12-04-vps

The master-slave combination is working fine.

When I create a file at the trigger-file location (e.g.
/tmp/postgresql.trigger.5432) on the *pgslave* server, it starts accepting
read as well as write requests, which is perfectly correct.

What I want to know is how to switch them back to the normal status. That
means pgmaster needs to be the master server which accepts the writes, and
pgslave accepts only reads along with the replication.


Regards,

Dasitha.


Re: [GENERAL] Replication slot on master failure

2016-09-26 Thread Michael Paquier
On Mon, Sep 26, 2016 at 7:49 PM, hariprasath nallasamy wrote:
> We are using a replication slot for capturing some change sets to
> update dependent tables.
> Will there be inconsistency if the master fails and the standby takes
> the role of master?

Replication slot creation is not replicated to standbys, if that's what
you are looking for. So if you use a slot on the master and consume its
data up to a given point, and then promote a standby, you may see a
gap of data after creating a slot, or if you created a slot previously
you may consume the same WAL records twice. Note that replication
slots created on standbys are initialized from the last checkpoint
redo record, so you could take advantage of this property before
promoting a standby.
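
As a minimal illustration of the replay caveat (the slot name is
hypothetical; test_decoding is the contrib example plugin), after promoting
the standby you would recreate the slot and resume consuming, accepting
that some changes may be seen twice or missed around the failover point:

select * from pg_create_logical_replication_slot('capture_slot', 'test_decoding');
select * from pg_logical_slot_get_changes('capture_slot', NULL, NULL);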
-- 
Michael




[GENERAL] Replication slot on master failure

2016-09-26 Thread hariprasath nallasamy
Hi all
   We are using a replication slot for capturing some change sets to
update dependent tables.

   Will there be inconsistency if the master fails and the standby
takes the role of master?


cheers
-harry


Re: [GENERAL] Replication Recommendation

2016-09-12 Thread Vick Khera
On Mon, Sep 12, 2016 at 3:46 PM, Lee Hachadoorian wrote:
> * Because database is updated infrequently, workforce can come
> together for LAN-based replication as needed
> * Entire database is on the order of a few GB

Just update one copy, then send pg_dumps to the others, stomping over the
old one.
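
A minimal sketch of that approach, assuming a database named gisdb and a
shared folder mounted at /share (both hypothetical):

pg_dump -h main-db -Fc -f /share/gisdb.dump gisdb
# on each workstation, replace the old copy:
dropdb gisdb
createdb gisdb
pg_restore -d gisdb /share/gisdb.dump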




Re: [GENERAL] Replication Recommendation

2016-09-12 Thread Adrian Klaver

On 09/12/2016 02:35 PM, Lee Hachadoorian wrote:

On Mon, Sep 12, 2016 at 5:12 PM, Adrian Klaver wrote:

On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:


There are a wide variety of Postgres replication solutions, and I
would like advice on which one would be appropriate to my use case.

* Small (~half dozen) distributed workforce using a file sharing
service, but without access to direct network connection over the
internet
* Database is updated infrequently, when new government agency data
releases replace old data
* Because database is updated infrequently, workforce can come
together for LAN-based replication as needed
* Entire database is on the order of a few GB

Given this, I am considering the super lowtech "replication" solution
of updating "master" and doing a full database drop and restore on the
"slaves". But I would like to know which of the other (real)
replication solutions might work for this use case.



If I follow correctly the layout is?:

Main database <--- Govt. data
|
|
   \ /

   File share
|
|
   \ /

DB   DBDB   DB   DBDB

User 1   User 2User 3   User 4   User 5User 6



For your simple scenario you might want to look at:

https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html



That diagram is what I am proposing.

pg_basebackup looks interesting. My initial impression is that the
main gain would be for a multiple database cluster. Are there other
advantages to using this in preference to a full DB dump and restore
if all of our data will be in a single database?


Not sure.

pg_basebackup can:

"There is no guarantee that all WAL files required for the backup are 
archived at the end of backup. If you are planning to use the backup for 
an archive recovery and want to ensure that all required files are 
available at that moment, you need to include them into the backup by 
using -x option."


At that point you have a complete $DATADIR. So on your user machines it 
then becomes a matter of stopping the server, clearing out the old 
$DATADIR, dropping the new one in place, and starting the server. 
Whether that is faster than having pg_restore connect to a database and 
then process the dump file is something you will have to test.
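
A hedged sketch of that swap (9.5-era pg_basebackup; host names, the
replication role, and paths are all hypothetical):

pg_basebackup -h main-db -U replication -D /share/base -x -F tar -z
# on each user machine:
pg_ctl -D "$PGDATA" stop -m fast
rm -rf "$PGDATA"/* && tar -xzf /share/base/base.tar.gz -C "$PGDATA"
pg_ctl -D "$PGDATA" start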








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Replication Recommendation

2016-09-12 Thread Lee Hachadoorian
On Mon, Sep 12, 2016 at 5:12 PM, Adrian Klaver wrote:
> On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:
>>
>> There are a wide variety of Postgres replication solutions, and I
>> would like advice on which one would be appropriate to my use case.
>>
>> * Small (~half dozen) distributed workforce using a file sharing
>> service, but without access to direct network connection over the
>> internet
>> * Database is updated infrequently, when new government agency data
>> releases replace old data
>> * Because database is updated infrequently, workforce can come
>> together for LAN-based replication as needed
>> * Entire database is on the order of a few GB
>>
>> Given this, I am considering the super lowtech "replication" solution
>> of updating "master" and doing a full database drop and restore on the
>> "slaves". But I would like to know which of the other (real)
>> replication solutions might work for this use case.
>
>
> If I follow correctly the layout is?:
>
> Main database <--- Govt. data
> |
> |
>\ /
>
>File share
> |
> |
>\ /
>
> DB   DBDB   DB   DBDB
>
> User 1   User 2User 3   User 4   User 5User 6
>
>
>
> For your simple scenario you might want to look at:
>
> https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html
>

That diagram is what I am proposing.

pg_basebackup looks interesting. My initial impression is that the
main gain would be for a multiple database cluster. Are there other
advantages to using this in preference to a full DB dump and restore
if all of our data will be in a single database?

Best,
--Lee




Re: [GENERAL] Replication Recommendation

2016-09-12 Thread Adrian Klaver

On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:

There are a wide variety of Postgres replication solutions, and I
would like advice on which one would be appropriate to my use case.

* Small (~half dozen) distributed workforce using a file sharing
service, but without access to direct network connection over the
internet
* Database is updated infrequently, when new government agency data
releases replace old data
* Because database is updated infrequently, workforce can come
together for LAN-based replication as needed
* Entire database is on the order of a few GB

Given this, I am considering the super lowtech "replication" solution
of updating "master" and doing a full database drop and restore on the
"slaves". But I would like to know which of the other (real)
replication solutions might work for this use case.


If I follow correctly the layout is?:

Main database <--- Govt. data
|
|
   \ /

   File share
|
|
   \ /

DB   DBDB   DB   DBDB

User 1   User 2User 3   User 4   User 5User 6



For your simple scenario you might want to look at:

https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html








--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] Replication Recommendation

2016-09-12 Thread Lee Hachadoorian
There are a wide variety of Postgres replication solutions, and I
would like advice on which one would be appropriate to my use case.

* Small (~half dozen) distributed workforce using a file sharing
service, but without access to direct network connection over the
internet
* Database is updated infrequently, when new government agency data
releases replace old data
* Because database is updated infrequently, workforce can come
together for LAN-based replication as needed
* Entire database is on the order of a few GB

Given this, I am considering the super lowtech "replication" solution
of updating "master" and doing a full database drop and restore on the
"slaves". But I would like to know which of the other (real)
replication solutions might work for this use case.

Regards,
--Lee

-- 
Lee Hachadoorian
Assistant Professor of Instruction, Geography and Urban Studies
Assistant Director, Professional Science Master's in GIS
Temple University




Re: [GENERAL] Replication with non-read-only standby.

2016-07-12 Thread Nick Babadzhanian
Thanks.

I ended up using pglogical, since I don't really need bidirectional
replication and the docs for UDR suggest using pglogical instead.
I ran into a problem there, but pglogical seems to be the answer.

Regards,
Nick.

- Original Message -
From: "Sylvain Marechal" <marechal.sylva...@gmail.com>
To: "Nick Babadzhanian" <n...@cobra.ru>
Cc: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Wednesday, July 6, 2016 11:00:05 PM
Subject: Re: [GENERAL] Replication with non-read-only standby.

2016-06-30 15:15 GMT+02:00 Nick Babadzhanian <n...@cobra.ru>:

> Setup:
> 2 PostgreSQL servers are geographically spread. The first one is used for
> an application that gathers data. It is connected to the second database
> that is used to process the said data. Connection is not very stable nor is
> it fast, so using Bidirectional replication is not an option. It is OK if
> data is shipped in batches rather than streamed.
>
> Question:
> Is there a way to make the standby server non-read-only, so that it can
> keep getting updates (mostly inserts) from the 'master', but users are able
> to edit the data stored on 'slave'? Is there some alternative solution to
> this?
>
> Regards,
> Nick.
>
> Hi Nick,

Sorry for the silly question, but I am not sure I understand why BDR is
not an option.
As far as I know, it was designed to handle such cases.

My 2 cents,
Sylvain




Re: [GENERAL] Replication with non-read-only standby.

2016-07-01 Thread Moreno Andreo

  
  
On 01/07/2016 05:21, Venkata Balaji N wrote:

> On Thu, Jun 30, 2016 at 11:15 PM, Nick Babadzhanian wrote:
>> Setup:
>> 2 PostgreSQL servers are geographically spread. The first one is used
>> for an application that gathers data. It is connected to the second
>> database that is used to process the said data. The connection is not
>> very stable nor is it fast, so using bidirectional replication is not
>> an option. It is OK if data is shipped in batches rather than streamed.
>>
>> Question:
>> Is there a way to make the standby server non-read-only, so that it can
>> keep getting updates (mostly inserts) from the 'master', but users are
>> able to edit the data stored on the 'slave'? Is there some alternative
>> solution to this?
>
> You can consider rubyrep replication for such a requirement. I think
> there has not been much development happening around rubyrep for a long
> time. It can be used for a production environment.
>
> http://www.rubyrep.org/
>
> Regards,
> Venkata B N
>
> Fujitsu Australia

I have been using rubyrep actively for the last 5 years, and here is my
experience.

The replicator is very good and stable, easy as 1-2-3 to configure (if you
don't need special features), but the project is almost dead (I've seen no
updates since 2009 and no responses in forums since 2011).
I've tried many times to email the author because of PG 9.1 changes in
bytea management that caused BLOB corruption while replicating, but never
got a response, so I ended up hiring a Ruby developer to fix things.

One more thing: rubyrep is OK if you want to replicate ONE database; I've
never seen it working on more than one database or a whole cluster. Of
course you can run more than one instance, but that will be harder to
manage.
If replicating on *nix I'd prefer Bucardo or Slony.

Remember, when using async replication over an unreliable network, that
your replication can fall far behind the "actual" data, and this can lead
to conflicts that must be resolved.
Not to mention the huge memory consumption when working with large data
types and when replication queues get quite big (>300k rows). In these
cases, if the JVM memory cap is not large enough (I reached 2 GB), rubyrep
is likely to stop with an OutOfMemoryException.

My 50 cents
Cheers,
Moreno.





Re: [GENERAL] Replication with non-read-only standby.

2016-06-30 Thread Scott Marlowe
On Thu, Jun 30, 2016 at 7:15 AM, Nick Babadzhanian  wrote:
> Setup:
> 2 PostgreSQL servers are geographically spread. The first one is used for an 
> application that gathers data. It is connected to the second database that is 
> used to process the said data. Connection is not very stable nor is it fast, 
> so using Bidirectional replication is not an option. It is OK if data is 
> shipped in batches rather than streamed.
>
> Question:
> Is there a way to make the standby server non-read-only, so that it can keep 
> getting updates (mostly inserts) from the 'master', but users are able to 
> edit the data stored on 'slave'? Is there some alternative solution to this?

I'd probably solve this with slony.




[GENERAL] Replication with non-read-only standby.

2016-06-30 Thread Nick Babadzhanian
Setup:
2 PostgreSQL servers are geographically spread. The first one is used for an
application that gathers data. It is connected to the second database that is
used to process the said data. The connection is not very stable nor is it
fast, so using bidirectional replication is not an option. It is OK if data
is shipped in batches rather than streamed.

Question:
Is there a way to make the standby server non-read-only, so that it can keep 
getting updates (mostly inserts) from the 'master', but users are able to edit 
the data stored on 'slave'? Is there some alternative solution to this?

Regards,
Nick.




Re: [GENERAL] Replication

2016-06-06 Thread Bertrand Paquet
Hi,

Thank you for answering.

Regards,

Bertrand

2016-06-06 10:22 GMT+02:00 Vik Fearing :

> On 06/06/16 09:54, Masahiko Sawada wrote:
> > On Sat, Jun 4, 2016 at 10:58 PM, Vik Fearing  wrote:
> >> On 02/06/16 15:32, Bertrand Paquet wrote:
> >>> Hi,
> >>>
> >>> On an hot standby streaming server, is there any way to know, in SQL,
> to
> >>> know the ip of current master ?
> >>
> >> No.
> >>
> >>> The solution I have is to read the recovery.conf file to find
> >>> primary_conninfo,
> >>
> >> That is currently the only solution.  There are plans to allow SQL
> >> access to the parameters in recovery.conf (or to merge them into
> >> postgresql.conf) but that's not currently possible.
> >
> > It might not be a right way but how about using pg_read_file()?
> > postgres(1)=# select regexp_replace(pg_read_file('recovery.conf'),
> > '.*primary_conninfo = (.*)', '\1');
> >   regexp_replace
> > ---
> >  'host=localhost port=5550 application_name=node1'+
> >
> > (1 row)
> >
> > You can get the master server information via SQL from standby server.
>
> This is a good idea, but suffers the same problem that Bertrand has with
> looking at the file a different way: if the file was changed but the
> standby server has not been restarted, it's (potentially) not going to
> be the correct information.
> --
> Vik Fearing  +33 6 46 75 15 36
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>


Re: [GENERAL] Replication

2016-06-06 Thread Vik Fearing
On 06/06/16 09:54, Masahiko Sawada wrote:
> On Sat, Jun 4, 2016 at 10:58 PM, Vik Fearing  wrote:
>> On 02/06/16 15:32, Bertrand Paquet wrote:
>>> Hi,
>>>
>>> On an hot standby streaming server, is there any way to know, in SQL, to
>>> know the ip of current master ?
>>
>> No.
>>
>>> The solution I have is to read the recovery.conf file to find
>>> primary_conninfo,
>>
>> That is currently the only solution.  There are plans to allow SQL
>> access to the parameters in recovery.conf (or to merge them into
>> postgresql.conf) but that's not currently possible.
> 
> It might not be a right way but how about using pg_read_file()?
> postgres(1)=# select regexp_replace(pg_read_file('recovery.conf'),
> '.*primary_conninfo = (.*)', '\1');
>   regexp_replace
> ---
>  'host=localhost port=5550 application_name=node1'+
> 
> (1 row)
> 
> You can get the master server information via SQL from standby server.

This is a good idea, but suffers the same problem that Bertrand has with
looking at the file a different way: if the file was changed but the
standby server has not been restarted, it's (potentially) not going to
be the correct information.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [GENERAL] Replication

2016-06-06 Thread Masahiko Sawada
On Sat, Jun 4, 2016 at 10:58 PM, Vik Fearing  wrote:
> On 02/06/16 15:32, Bertrand Paquet wrote:
>> Hi,
>>
>> On an hot standby streaming server, is there any way to know, in SQL, to
>> know the ip of current master ?
>
> No.
>
>> The solution I have is to read the recovery.conf file to find
>> primary_conninfo,
>
> That is currently the only solution.  There are plans to allow SQL
> access to the parameters in recovery.conf (or to merge them into
> postgresql.conf) but that's not currently possible.
>

It might not be a right way but how about using pg_read_file()?
postgres(1)=# select regexp_replace(pg_read_file('recovery.conf'),
'.*primary_conninfo = (.*)', '\1');
  regexp_replace
---
 'host=localhost port=5550 application_name=node1'+

(1 row)

You can get the master server information via SQL from standby server.

Regards,

--
Masahiko Sawada




Re: [GENERAL] Replication

2016-06-04 Thread Vik Fearing
On 02/06/16 18:39, John R Pierce wrote:
> On 6/2/2016 6:32 AM, Bertrand Paquet wrote:
>> On an hot standby streaming server, is there any way to know, in SQL,
>> to know the ip of current master ?
>> The solution I have is to read the recovery.conf file to find
>> primary_conninfo, but, it can be false.
> 
> "The IP" assumes there is only one...   hosts can be multihomed,
> postgres can be listening on numerous interfaces, there is no 'the IP'

That's nice, but a standby is only connecting to one.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [GENERAL] Replication

2016-06-04 Thread Vik Fearing
On 02/06/16 15:32, Bertrand Paquet wrote:
> Hi,
> 
> On an hot standby streaming server, is there any way to know, in SQL, to
> know the ip of current master ?

No.

> The solution I have is to read the recovery.conf file to find
> primary_conninfo,

That is currently the only solution.  There are plans to allow SQL
access to the parameters in recovery.conf (or to merge them into
postgresql.conf) but that's not currently possible.

> but, it can be false.

It would only be wrong if recovery.conf has been edited since the standby
was last restarted.

There are hooks for connections and disconnections of the walreceiver,
so it should be possible and fairly simple to write an extension that
remembers and exposes the primary_conninfo in effect.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [GENERAL] Replication

2016-06-02 Thread John R Pierce

On 6/2/2016 6:32 AM, Bertrand Paquet wrote:
On an hot standby streaming server, is there any way to know, in SQL, 
to know the ip of current master ?
The solution I have is to read the recovery.conf file to find 
primary_conninfo, but, it can be false.


"The IP" assumes there is only one...   hosts can be multihomed, 
postgres can be listening on numerous interfaces, there is no 'the IP'


--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] Replication

2016-06-02 Thread Sameer Kumar
On Thu, 2 Jun 2016, 10:34 p.m. Scott Mead,  wrote:

> On Thu, Jun 2, 2016 at 10:16 AM, Melvin Davidson wrote:
>
>> It's been a few years since I worked with slony, and you did not state
>> which version of slony or PostgreSQL you are working with, nor did you
>> indicate the O/S.
>>
>
> I think OP had pointed to using streaming
>
>
>> That being said, you should be able to formulate a query with a join
>> between sl_path & sl_node that gives you the information you need.
>>
>> On Thu, Jun 2, 2016 at 9:32 AM, Bertrand Paquet <
>> bertrand.paq...@doctolib.fr> wrote:
>>
>>> Hi,
>>>
>>> On an hot standby streaming server, is there any way to know, in SQL, to
>>> know the ip of current master ?
>>> The solution I have is to read the recovery.conf file to find
>>> primary_conninfo, but, it can be false.
>>>
>>>
> I've run into this as well.  Only way is recovery.conf.
>

From 9.6 onward you will have a new view which will let you query the
replication details on the standby.

I have not tried it, but you can probably check the pid of the WAL receiver
and find out what host it is connected to (should be possible from network
stats).
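
Presumably the view in question is pg_stat_wal_receiver (new in 9.6); a
sketch, run on the standby:

select status, conninfo from pg_stat_wal_receiver;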


> --Scott
>
>
>
>> Regards,
>>>
>>> Bertrand
>>>
>>
>>
>>
>> --
>> *Melvin Davidson*
>> I reserve the right to fantasize.  Whether or not you
>> wish to share my fantasy is entirely up to you.
>>
>
>
>
> --
> --
> Scott Mead
> Sr. Architect
> *OpenSCG *
> http://openscg.com
>
-- 
--
Best Regards
Sameer Kumar | DB Solution Architect
*ASHNIK PTE. LTD.*

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com


Re: [GENERAL] Replication

2016-06-02 Thread Scott Mead
On Thu, Jun 2, 2016 at 10:16 AM, Melvin Davidson wrote:

> It's been a few years since I worked with slony, and you did not state
> which version of slony or PostgreSQL you are working with, nor did you
> indicate the O/S.
>

I think OP had pointed to using streaming


> That being said, you should be able to formulate a query with a join
> between sl_path & sl_node that gives you the information you need.
>
> On Thu, Jun 2, 2016 at 9:32 AM, Bertrand Paquet <
> bertrand.paq...@doctolib.fr> wrote:
>
>> Hi,
>>
>> On an hot standby streaming server, is there any way to know, in SQL, to
>> know the ip of current master ?
>> The solution I have is to read the recovery.conf file to find
>> primary_conninfo, but, it can be false.
>>
>>
I've run into this as well.  Only way is recovery.conf.

--Scott



> Regards,
>>
>> Bertrand
>>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>



-- 
--
Scott Mead
Sr. Architect
*OpenSCG *
http://openscg.com


Re: [GENERAL] Replication

2016-06-02 Thread Melvin Davidson
It's been a few years since I worked with slony, and you did not state
which version of slony or PostgreSQL you are working with, nor did you
indicate the O/S.
That being said, you should be able to formulate a query with a join
between sl_path & sl_node that gives you the information you need.
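
A sketch of such a query, assuming a Slony-I cluster named "mycluster" (the
schema prefix and column names follow the Slony-I catalogs, but treat them
as illustrative):

select n.no_id, n.no_comment, p.pa_conninfo
from _mycluster.sl_node n
join _mycluster.sl_path p on p.pa_server = n.no_id;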

On Thu, Jun 2, 2016 at 9:32 AM, Bertrand Paquet  wrote:

> Hi,
>
> On an hot standby streaming server, is there any way to know, in SQL, to
> know the ip of current master ?
> The solution I have is to read the recovery.conf file to find
> primary_conninfo, but, it can be false.
>
> Regards,
>
> Bertrand
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


[GENERAL] Replication

2016-06-02 Thread Bertrand Paquet
Hi,

On an hot standby streaming server, is there any way to know, in SQL, to
know the ip of current master ?
The solution I have is to read the recovery.conf file to find
primary_conninfo, but, it can be false.

Regards,

Bertrand


Re: [GENERAL] Replication Question

2016-02-02 Thread Bala Venkat
Thanks for the help. We need an upgrade on the DB for the solution. I
checked your suggestion and it works on versions from 9.1 and above

Regards

On Thu, Jan 28, 2016 at 12:04 PM, Andreas Kretschmer
<akretsch...@spamfence.net> wrote:

> Bala Venkat  wrote:
>
> > Hi there -
> >
> > We have a setup where there is one master streaming to 3 slaves.
> > 2 slaves are in our DR environment; one is the prod environment.
> >
> > We want to make the DR the primary. I know we can promote one of the
> > slaves in DR to primary. If I want to keep the other slave as a slave
> > connecting to the new DR master, is that possible? Or do I have to set
> > up the whole streaming process again in DR? Can you please share your
> > experience?
>
> should be possible, but you have to change the recovery.conf to point
> to the new master, and, of course, the new master should stream the
> WALs (wal_sender, proper pg_hba.conf and so on).
>
>
> Andreas
> --
> Really, I'm not out to destroy Microsoft. That will just be a completely
> unintentional side effect.  (Linus Torvalds)
> "If I was god, I would recompile penguin with --enable-fly."   (unknown)
> Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°
>
>
>


Re: [GENERAL] Replication Question

2016-01-28 Thread Andreas Kretschmer
Bala Venkat  wrote:

> Hi there -
> 
> We have a setup where there is one master streaming to 3 slaves.
> 2 slaves are in our DR environment; one is the prod environment.
>
> We want to make the DR the primary. I know we can promote one of the
> slaves in DR to primary. If I want to keep the other slave as a slave
> connecting to the new DR master, is that possible? Or do I have to set up
> the whole streaming process again in DR? Can you please share your
> experience?

Should be possible, but you have to change the recovery.conf to point
to the new master, and, of course, the new master should stream the
WALs (wal_sender, proper pg_hba.conf and so on).
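
A hedged sketch of that recovery.conf on the remaining slave (the host
name and user are hypothetical):

standby_mode = 'on'
primary_conninfo = 'host=new-dr-master port=5432 user=replicator'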


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°




[GENERAL] Replication Question

2016-01-28 Thread Bala Venkat
Hi there -

We have a setup where there is one master streaming to 3 slaves. 2 slaves
are in our DR environment; one is the prod environment.

We want to make the DR the primary. I know we can promote one of the slaves
in DR to primary. If I want to keep the other slave as a slave connecting to
the new DR master, is that possible? Or do I have to set up the whole
streaming process again in DR? Can you please share your experience?


Regards


Re: [GENERAL] Replication with 9.4

2015-12-01 Thread Thomas Munro
On Tue, Oct 6, 2015 at 12:27 PM, Thomas Munro  wrote:

> On Sun, Oct 4, 2015 at 11:47 PM, Michael Paquier wrote:
> > (Seems like you forgot to push the Reply-all button)
> >
> > On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
> >> On 10/3/2015 3:30 PM, Michael Paquier wrote:
> >>>  and no reason is given to justify *why* this would be needed in your
> case
> >> reason for a choice can be often an issue for other :D
> >>
> >> I thought that a PostgreSQL 9.4 user could change synchronous_commit on
> >> the fly, from local to on for example, choosing which hot standby would
> >> be in sync and which async, to avoid big latency with, let's say, 100
> >> hot standbys.
> >> It was an idea, a concept to let the master write and update the nodes,
> >> like a queen bee ;)
> >> but I'm afraid it's not possible, so maybe a future version of pg will
> >> do it; for now reading from the master is my only solution.
> >
> > Well, Thomas Munro (adding him in CC) has sent a patch for integration
> > with 9.6 that would cover your need, by adding to synchronous_commit a
> > mode called 'apply', in which case a master would wait for the
> > transaction to be applied on the standby before committing locally:
> > http://www.postgresql.org/message-id/CAEepm=1fqkivl4v-otphwsgw4af9hcogimrcw-ybtjipx9g...@mail.gmail.com
> > Perhaps you could help with the review of the patch; it has stalled
> > a bit lately.
>
> That patch (or something more sophisticated along those lines) is a
> small piece of a bigger puzzle, though it might be enough if you only
> have one standby, are prepared to block until manual intervention if
> that standby fails, and don't mind potentially lumpy apply
> performance.  See also the work being done to separate wal writing
> from wal applying for smoother performance[1], and handle multiple
> synchronous standbys[2].  But there is another piece of the puzzle
> IMHO: how to know reliably that the standby that you are talking to
> guarantees causal consistency, while also allowing standbys to
> fail/drop out gracefully, and I'm currently working on an idea for
> that.
>

FYI I posted the resulting proposal and patch over on the -hackers list.
Feedback, ideas, flames welcome as always.

http://www.postgresql.org/message-id/flat/CAEepm=0n_OxB2_pNntXND6aD85v5PvADeUY8eZjv9CBLk=z...@mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [GENERAL] Replication with 9.4

2015-10-05 Thread Thomas Munro
On Sun, Oct 4, 2015 at 11:47 PM, Michael Paquier
 wrote:
> (Seems like you forgot to push the Reply-all button)
>
> On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
>> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>>  and no reason is given to justify *why* this would be needed in your case
>> a reason for a choice can often be an issue for others :D
>>
>> I thought that a postgresql 9.4 user could change synchronous_commit on the
>> fly from local to on, for example, choosing which hot standby would be in
>> sync and which async, to avoid big latency in the case of, let's say, 100
>> hot standbys.
>> it was an idea, a concept to let the master write and update the nodes, like
>> a queen bee ;)
>> but I'm afraid it's not possible, so maybe a future version of pg will do it;
>> for now reading from the master is my only solution.
>
> Well, Thomas Munro (adding him in CC) has sent a patch for integration with
> 9.6 that would cover your need, by adding to synchronous_commit a mode
> called 'apply', in which case a master would wait for the transaction to be
> applied on the standby before committing locally:
> http://www.postgresql.org/message-id/CAEepm=1fqkivl4v-otphwsgw4af9hcogimrcw-ybtjipx9g...@mail.gmail.com
> Perhaps you could help with the review of the patch; it has stalled
> a bit lately.

That patch (or something more sophisticated along those lines) is a
small piece of a bigger puzzle, though it might be enough if you only
have one standby, are prepared to block until manual intervention if
that standby fails, and don't mind potentially lumpy apply
performance.  See also the work being done to separate wal writing
from wal applying for smoother performance[1], and handle multiple
synchronous standbys[2].  But there is another piece of the puzzle
IMHO: how to know reliably that the standby that you are talking to
guarantees causal consistency, while also allowing standbys to
fail/drop out gracefully, and I'm currently working on an idea for
that.

Of course you can make your own causal consistency today if you are
prepared to have your clients explicitly wait for WAL to be applied.
You can call pg_current_xlog_location() on the master after
committing, and then wait until pg_last_xlog_replay_location() reports
that that LSN has been applied on any standby you talk to at the start
of any transaction that wants causal reads.  You could wrap the
waiting up in a user defined function
wait_for_xlog_replay_location(lsn, timeout) which could do a naive
poll/sleep loop (or do something more efficient with latches in core
code).  For example, imagine a client that inserts some new accounts
and then causes a large number of workers to regenerate some reports
that must include the new accounts against a pool of standbys.  It
just needs to give them the LSN they should wait for first.  And if
you don't want to pass LSNs around but don't mind introducing some
extra conservative lag, those workers could call
pg_current_xlog_location() on the master themselves to get some
arbitrary recent LSN and then wait for that to be applied before they
start their work on the standbys.

The explicit wait-for-LSN approach pushes the waiting over to readers
who want causal reads, instead of writer (the master), which still
might be interesting for some cases even if we do finish up with a
good optional master-waits system.  But a master-waits system will
allow naive clients to see up to date data no matter where they run
their queries (or be kicked off by standbys that can't guarantee that)
without having to think about LSNs and replication machinery, and I
think that would be a very useful feature.

[1] 
http://www.postgresql.org/message-id/flat/CA+U5nMJifauXvVbx=v3UbYbHO3Jw2rdT4haL6CCooEDM5=4...@mail.gmail.com
[2] 
http://www.postgresql.org/message-id/flat/caog9aphycpmtypaawfd3_v7svokbnecfivmrc1axhb40zbs...@mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [GENERAL] Replication with 9.4

2015-10-04 Thread Michael Paquier
(Seems like you forgot to push the Reply-all button)

On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>  and no reason is given to justify *why* this would be needed in your case
> a reason for a choice can often be an issue for others :D
>
> I thought that a postgresql 9.4 user could change synchronous_commit on the
> fly from local to on, for example, choosing which hot standby would be in
> sync and which async, to avoid big latency in the case of, let's say, 100
> hot standbys.
> it was an idea, a concept to let the master write and update the nodes, like
> a queen bee ;)
> but I'm afraid it's not possible, so maybe a future version of pg will do it;
> for now reading from the master is my only solution.

Well, Thomas Munro (adding him in CC) has sent a patch for integration with
9.6 that would cover your need, by adding to synchronous_commit a mode
called 'apply', in which case a master would wait for the transaction to be
applied on the standby before committing locally:
http://www.postgresql.org/message-id/CAEepm=1fqkivl4v-otphwsgw4af9hcogimrcw-ybtjipx9g...@mail.gmail.com
Perhaps you could help with the review of the patch; it has stalled
a bit lately.
Regards,
-- 
Michael




Re: [GENERAL] Replication with 9.4

2015-10-04 Thread Madovsky



On 10/4/2015 3:47 AM, Michael Paquier wrote:

(Seems like you forgot to push the Reply-all button)

On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:

On 10/3/2015 3:30 PM, Michael Paquier wrote:

  and no reason is given to justify *why* this would be needed in your case

a reason for a choice can often be an issue for others :D

I thought that a postgresql 9.4 user could change synchronous_commit on the
fly from local to on, for example, choosing which hot standby would be in
sync and which async, to avoid big latency in the case of, let's say, 100
hot standbys.
it was an idea, a concept to let the master write and update the nodes, like
a queen bee ;)
but I'm afraid it's not possible, so maybe a future version of pg will do it;
for now reading from the master is my only solution.

Well, Thomas Munro (adding him in CC) has sent a patch for integration with
9.6 that would cover your need, by adding to synchronous_commit a mode
called 'apply', in which case a master would wait for the transaction to be
applied on the standby before committing locally:
http://www.postgresql.org/message-id/CAEepm=1fqkivl4v-otphwsgw4af9hcogimrcw-ybtjipx9g...@mail.gmail.com
Perhaps you could help with the review of the patch; it has stalled
a bit lately.
Regards,

Brilliant, though I haven't caught everything in this thread.
I would love to contribute to this patch, but I have absolutely no
C/C++ experience (webdev languages only).
To put it simply: when SET LOCAL synchronous_commit TO ON is used on the fly
for the current transaction, it would be great to have another option to
choose whether this setting applies to all standbys or to a specific one,
leaving the others async... gotcha? :)
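
For reference, the per-transaction form that exists today looks like this (a
minimal sketch with a hypothetical table t; the per-standby variant asked for
here does not exist):

begin;
set local synchronous_commit to on;  -- applies to this transaction only
update t set v = v + 1;              -- t and v are hypothetical
commit;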









Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Michael Paquier
On Sat, Oct 3, 2015 at 10:20 PM, Madovsky  wrote:
>
>
> On 10/3/2015 4:48 AM, Michael Paquier wrote:
>>
>> On Sat, Oct 3, 2015 at 8:09 PM, Madovsky  wrote:
>>>
>>> I would like to fix an issue I'm facing with version 9.4 streaming
>>> replication.
>>> Is it possible to set synchronous commit on the fly on the master (or
>>> standby?)
>>> so that it only sync-commits the hot standby node used by the client who
>>> has a read-only SQL session on it?
>>
>> By referring to the docs:
>>
>> http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>> Synchronous replication gives the assurance that a transaction has
>> been flushed to the disk of the standby which is in sync, i.e. the one
>> with the lowest priority number among the nodes currently connected.
>> This does not ensure that the transaction has been *replayed* on the
>> standby. You are only sure that the transaction data is available there.
>> Hence if you wish to know that a transaction on a standby is running
>> with enough data replayed, you should make the WAL position on the
>> master necessary for the transaction on the standby something that your
>> application is aware of.
>
>
> I understood that well, Michael, thanks.
> The docs don't cover whether the sync priorities can be changed,
> so that one node can be considered fully sync and the others only async,
> to minimize sync request overhead...

The amount of overhead of a node is something that needs to be
evaluated externally from the Postgres backend, but you can always
adjust synchronous_standby_names to change the priorities as you wish.
You can for example do so with libpq or psql using ALTER SYSTEM
combined with "SELECT pg_reload_conf();". The configuration will be
reloaded at the next query loop in a backend once it catches the
change of the parameter via SIGHUP.
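
A minimal sketch of that sequence (the standby name is a placeholder):

ALTER SYSTEM SET synchronous_standby_names = 'node3';
SELECT pg_reload_conf();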

> A client connected to a node usually wants to see the results
> on the node he has a session on.
> I just wanted to avoid a SELECT request to the master and
> stay on the HOT STANDBY for all read requests.
> My script opens 2 sessions, one on the master and one on the hot standby,
> in case of block transactions.

Requesting the master would be necessary; still, I don't really get why
you don't want to query the master for read queries... You could for
example plug pgbouncer on top of the master if you have many
connections, but at this stage I have no idea what your use case is.
-- 
Michael




Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Michael Paquier
On Sat, Oct 3, 2015 at 8:09 PM, Madovsky  wrote:
> I would like to fix an issue I'm facing with version 9.4 streaming
> replication.
> Is it possible to set synchronous commit on the fly on the master (or
> standby?)
> so that it only sync-commits the hot standby node used by the client who
> has a read-only SQL session on it?

By referring to the docs:
http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
Synchronous replication gives the assurance that a transaction has
been flushed to the disk of the standby which is in sync, i.e. the one
with the lowest priority number among the nodes currently connected.
This does not ensure that the transaction has been *replayed* on the
standby. You are only sure that the transaction data is available there.
Hence if you wish to know that a transaction on a standby is running with
enough data replayed, you should make the WAL position on the master
necessary for the transaction on the standby something that your
application is aware of.
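
In practice that boils down to something like this (a sketch using the
pre-10 function names):

-- on the master, right after the commit of interest:
select pg_current_xlog_location();
-- on the standby, before reading, compare with the value captured above:
select pg_last_xlog_replay_location();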
-- 
Michael




Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Edson Richter


 Madovsky wrote: 

> Hi,
> 
> I would like to fix an issue I'm facing with version 9.4 streaming
> replication.
> Is it possible to set synchronous commit on the fly on the master
> (or standby?)
> so that it only sync-commits the hot standby node used by the client who
> has a read-only SQL session on it?
> example:
> node1, node2 and node3 are hot standbys with replication slots (physical)
> on master 1
> node1 has priority 1 since it's first on the names list
> now a client opens a read-only session on node3 and starts a transaction
> update

Sorry for my ignorance, but can you start a "transaction update" over a read 
only connection?

Edson


> but within this transaction a SQL SELECT is done on the hot standby from
> the updated table
> 
> thanks
> 
> Franck
> 
> 

[GENERAL] Replication with 9.4

2015-10-03 Thread Madovsky

Hi,

I would like to fix an issue I'm facing with version 9.4 streaming
replication.
Is it possible to set synchronous commit on the fly on the master
(or standby?)
so that it only sync-commits the hot standby node used by the client who has
a read-only SQL session on it?

example:
node1, node2 and node3 are hot standbys with replication slots (physical) on
master 1

node1 has priority 1 since it's first on the names list
now a client opens a read-only session on node3 and starts a transaction
update, but
within this transaction a SQL SELECT is done on the hot standby from the
updated table


thanks

Franck




Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Madovsky


Hi

On 10/3/2015 5:46 AM, Edson Richter wrote:


 Madovsky wrote: 

> Hi,
>
> I would like to fix an issue I'm facing with version 9.4 streaming
> replication.
> Is it possible to set synchronous commit on the fly on the master
> (or standby?)
> so that it only sync-commits the hot standby node used by the client who
> has a read-only SQL session on it?
> example:
> node1, node2 and node3 are hot standbys with replication slots (physical)
> on master 1
> node1 has priority 1 since it's first on the names list
> now a client opens a read-only session on node3 and starts a transaction
> update

Sorry for my ignorance, but can you start a "transaction update" over 
a *read only connection?*


Edson

> but within this transaction a SQL SELECT is done on the hot standby from
> the updated table
>
> thanks
>
> Franck
>
>


Obviously not; my script opens 2 sessions


Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Madovsky



On 10/3/2015 4:48 AM, Michael Paquier wrote:

On Sat, Oct 3, 2015 at 8:09 PM, Madovsky  wrote:

I would like to fix an issue I'm facing with version 9.4 streaming
replication.
Is it possible to set synchronous commit on the fly on the master (or
standby?)
so that it only sync-commits the hot standby node used by the client who has
a read-only SQL session on it?

By referring to the docs:
http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
Synchronous replication gives the assurance that a transaction has
been flushed to the disk of the standby which is in sync, i.e. the one
with the lowest priority number among the nodes currently connected.
This does not ensure that the transaction has been *replayed* on the
standby. You are only sure that the transaction data is available there.
Hence if you wish to know that a transaction on a standby is running with
enough data replayed, you should make the WAL position on the master
necessary for the transaction on the standby something that your
application is aware of.


I understood that well, Michael, thanks.
The docs don't cover whether the sync priorities can be changed,
so that one node can be considered fully sync and the others only async,
to minimize sync request overhead...
A client connected to a node usually wants to see the results
on the node he has a session on.
I just wanted to avoid a SELECT request to the master and
stay on the HOT STANDBY for all read requests.
My script opens 2 sessions, one on the master and one on the hot standby,
in case of block transactions.




Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Michael Paquier
On Sun, Oct 4, 2015 at 6:38 AM, Madovsky wrote:
> On 10/3/2015 6:55 AM, Michael Paquier wrote:
>> On Sat, Oct 3, 2015 at 10:20 PM, Madovsky wrote:
>> Requesting the master would be necessary; still, I don't really get why
>> you don't want to query the master for read queries... You could for
>> example plug pgbouncer on top of the master if you have many
>> connections, but at this stage I have no idea what your use case is.
>
>
> Your idea is interesting, but unfortunately not dynamic and not on a
> per-user basis.
> We can change synchronous_commit on the fly and per transaction block,
> so why not the same for standby priority?
> I'm trying to use the master for write only.

There can be only one sync standby at a time, and please note again
the difference between WAL flush and WAL replay. Synchronous
replication ensures that the former has been done, not the latter. As
far as this thread goes, it seems to me that you are taking a more
bug-prone approach on your application side which could be solved by
just querying the master... Another idea would be to add some
meta-data in the schema to check the validity of the things replayed,
like a state-of-data-per-user or a data-version-number-per-user in a
dedicated table. This would avoid having to make the application
WAL-aware; still, you would need to make the application aware of this
meta-data in some way, which would surely require querying the master
or updating some cache or a file on the client side. Still, I don't see
why you would need to use this approach, and no reason is given to
justify *why* this would be needed in your case. But I guess I cannot
stop you from doing so if you wish :)
-- 
Michael




Re: [GENERAL] Replication with 9.4

2015-10-03 Thread Madovsky



On 10/3/2015 6:55 AM, Michael Paquier wrote:

On Sat, Oct 3, 2015 at 10:20 PM, Madovsky  wrote:


On 10/3/2015 4:48 AM, Michael Paquier wrote:

On Sat, Oct 3, 2015 at 8:09 PM, Madovsky  wrote:

I would like to fix an issue I'm facing with version 9.4 streaming
replication.
Is it possible to set synchronous commit on the fly on the master (or
standby?)
so that it only sync-commits the hot standby node used by the client who has
a read-only SQL session on it?

By referring to the docs:

http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
Synchronous replication gives the assurance that a transaction has
been flushed to the disk of the standby which is in sync, i.e. the one
with the lowest priority number among the nodes currently connected.
This does not ensure that the transaction has been *replayed* on the
standby. You are only sure that the transaction data is available there.
Hence if you wish to know that a transaction on a standby is running with
enough data replayed, you should make the WAL position on the master
necessary for the transaction on the standby something that your
application is aware of.


I understood that well, Michael, thanks.
The docs don't cover whether the sync priorities can be changed,
so that one node can be considered fully sync and the others only async,
to minimize sync request overhead...

The amount of overhead of a node is something that needs to be
evaluated externally from the Postgres backend, but you can always
adjust synchronous_standby_names to change the priorities as you wish.
You can for example do so with libpq or psql using ALTER SYSTEM
combined with "SELECT pg_reload_conf();". The configuration will be
reloaded at the next query loop in a backend once it catches the
change of the parameter via SIGHUP.


A client connected to a node usually wants to see the results
on the node he has a session on.
I just wanted to avoid a SELECT request to the master and
stay on the HOT STANDBY for all read requests.
My script opens 2 sessions, one on the master and one on the hot standby,
in case of block transactions.

Requesting the master would be necessary; still, I don't really get why
you don't want to query the master for read queries... You could for
example plug pgbouncer on top of the master if you have many
connections, but at this stage I have no idea what your use case is.


Your idea is interesting, but unfortunately not dynamic and not on a
per-user basis.
We can change synchronous_commit on the fly and per transaction block,
so why not the same for standby priority?
I'm trying to use the master for write only.




[GENERAL] Replication fell out of sync

2015-03-02 Thread David Kerr
Howdy,

I had an instance where a replica fell out of sync with the master.

Now it's in a state where it's unable to catch up because the master has 
already removed the WAL segment.

(logs)
Mar  2 23:10:13 db13 postgres[11099]: [3-1] user=,db=,host= LOG:  streaming 
replication successfully connected to primary
Mar  2 23:10:13 db13 postgres[11099]: [4-1] user=,db=,host= FATAL:  could not 
receive data from WAL stream: FATAL:  requested WAL segment 
0006047C001F has already been removed


I was under the impression that when you setup streaming replication if you 
specify a restore command like : restore_command = 'cp /arch/%f %p'

Then even if the slave falls out of sync, and the master removes the WAL 
segment, as long as you can still retrieve the WAL files, then it can bring 
itself back into sync.


But that doesn't seem to be happening.

The restore_command is working
# Slave's $PGDATA/pg_xlog/
-rw--- 1 postgres postgres 16777216 Mar  2 21:29 0006047C001F
-rwx-- 1 postgres postgres 16777216 Mar  2 23:13 RECOVERYXLOG

I'm on PG 9.2.7, which i know is old, but I'm upgrading shortly.

recovery.conf:
standby_mode  = 'on'
primary_conninfo  = 'host=pgmaster port=5432'
restore_command   = 'cp /arch/%f %p'

relevant info from postgresql.conf:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
archive_mode = on
hot_standby = on
hot_standby_feedback = true


I know that to avoid this entirely I need to set wal_keep_segments higher, 
although in this particular case it wouldn't have mattered because a rogue 
program slammed the database and basically 32/64/128 WAL segments went by in a 
short span of time.

However, I really thought that as long as PG could get the archived logs I'd be
able to recover.

Was I wrong with that assertion, or did I just run into a bug?

Thanks




Re: [GENERAL] Replication fell out of sync

2015-03-02 Thread Joshua D. Drake


On 03/02/2015 03:25 PM, David Kerr wrote:


Howdy,

I had an instance where a replica fell out of sync with the master.

Now it's in a state where it's unable to catch up because the master has 
already removed the WAL segment.

(logs)
Mar  2 23:10:13 db13 postgres[11099]: [3-1] user=,db=,host= LOG:  streaming 
replication successfully connected to primary
Mar  2 23:10:13 db13 postgres[11099]: [4-1] user=,db=,host= FATAL:  could not 
receive data from WAL stream: FATAL:  requested WAL segment 
0006047C001F has already been removed


I was under the impression that when you setup streaming replication if you 
specify a restore command like : restore_command = 'cp /arch/%f %p'

Then even if the slave falls out of sync, and the master removes the WAL 
segment, as long as you can still retrieve the WAL files, then it can bring 
itself back into sync.


If the archive command is also set so that the restore command has a 
file to retrieve, then yes it will work that way.
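
For example, the mirror image of the restore_command shown above would be, on
the master (a sketch; /arch is the path from the original post):

archive_mode = on
archive_command = 'cp %p /arch/%f'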



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc

Now I get it: your service is designed for a customer
base that grew up with Facebook, watches Japanese seizure
robot anime, and has the attention span of a gnat.
I'm not that user., Tyler Riddle





Re: [GENERAL] Replication fell out of sync

2015-03-02 Thread David Kerr
On Mon, Mar 02, 2015 at 04:06:02PM PDT, Adrian Klaver wrote:
 On 03/02/2015 03:25 PM, David Kerr wrote:
 Howdy,
 
 I had an instance where a replica fell out of sync with the master.
 
 Now it's in a state where it's unable to catch up because the master has 
 already removed the WAL segment.
 
 (logs)
 Mar  2 23:10:13 db13 postgres[11099]: [3-1] user=,db=,host= LOG:  streaming 
 replication successfully connected to primary
 Mar  2 23:10:13 db13 postgres[11099]: [4-1] user=,db=,host= FATAL:  could 
 not receive data from WAL stream: FATAL:  requested WAL segment 
 0006047C001F has already been removed
 
 
 I was under the impression that when you setup streaming replication if you 
 specify a restore command like : restore_command = 'cp /arch/%f %p'
 
 Then even if the slave falls out of sync, and the master removes the WAL 
 segment, as long as you can still retrieve the WAL files, then it can bring 
 itself back into sync.
 
 
 But that doesn't seem to be happening.
 
 The restore_command is working
 # Slave's $PGDATA/pg_xlog/
 -rw--- 1 postgres postgres 16777216 Mar  2 21:29 0006047C001F
 -rwx-- 1 postgres postgres 16777216 Mar  2 23:13 RECOVERYXLOG
 
 Trying to figure out why the error occurred at Mar  2 23:10:13 and
 the file shows a time stamp of Mar  2 21:29, especially since you
 say the WAL segments flew past?
 
 Are there any other WAL files in the slave ~/pg_xlog?

Turns out just that file had gotten corrupt on copy. When I re-pulled it from
the source, life was good.

(phew!)

Thanks all for looking.




Re: [GENERAL] Replication fell out of sync

2015-03-02 Thread Adrian Klaver

On 03/02/2015 03:25 PM, David Kerr wrote:

Howdy,

I had an instance where a replica fell out of sync with the master.

Now it's in a state where it's unable to catch up because the master has 
already removed the WAL segment.

(logs)
Mar  2 23:10:13 db13 postgres[11099]: [3-1] user=,db=,host= LOG:  streaming 
replication successfully connected to primary
Mar  2 23:10:13 db13 postgres[11099]: [4-1] user=,db=,host= FATAL:  could not 
receive data from WAL stream: FATAL:  requested WAL segment 
0006047C001F has already been removed


I was under the impression that when you setup streaming replication if you 
specify a restore command like : restore_command = 'cp /arch/%f %p'

Then even if the slave falls out of sync, and the master removes the WAL 
segment, as long as you can still retrieve the WAL files, then it can bring 
itself back into sync.


But that doesn't seem to be happening.

The restore_command is working
# Slave's $PGDATA/pg_xlog/
-rw--- 1 postgres postgres 16777216 Mar  2 21:29 0006047C001F
-rwx-- 1 postgres postgres 16777216 Mar  2 23:13 RECOVERYXLOG


Trying to figure out why the error occurred at Mar  2 23:10:13 and the 
file shows a time stamp of Mar  2 21:29, especially since you say the 
WAL segments flew past?


Are there any other WAL files in the slave ~/pg_xlog?



I'm on PG 9.2.7, which i know is old, but I'm upgrading shortly.

recovery.conf:
standby_mode  = 'on'
primary_conninfo  = 'host=pgmaster port=5432'
restore_command   = 'cp /arch/%f %p'

relevant info from postgresql.conf:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
archive_mode = on
hot_standby = on
hot_standby_feedback = true


I know that to avoid this entirely I need to set wal_keep_segments higher, 
although in this particular case it wouldn't have mattered because a rogue 
program slammed the database and basically 32/64/128 WAL segments went by in a 
short span of time.

 However, I really thought that as long as PG could get the archived logs I'd be
 able to recover.

 Was I wrong with that assertion, or did I just run into a bug?

Thanks





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Replication fell out of sync

2015-03-02 Thread David Kerr
On Mon, Mar 02, 2015 at 03:33:22PM PDT, Joshua D. Drake wrote:
 
 On 03/02/2015 03:25 PM, David Kerr wrote:
 
 Howdy,
 
 I had an instance where a replica fell out of sync with the master.
 
 Now it's in a state where it's unable to catch up because the master has 
 already removed the WAL segment.
 
 (logs)
 Mar  2 23:10:13 db13 postgres[11099]: [3-1] user=,db=,host= LOG:  streaming 
 replication successfully connected to primary
 Mar  2 23:10:13 db13 postgres[11099]: [4-1] user=,db=,host= FATAL:  could 
 not receive data from WAL stream: FATAL:  requested WAL segment 
 0006047C001F has already been removed
 
 
 I was under the impression that when you setup streaming replication if you 
 specify a restore command like : restore_command = 'cp /arch/%f %p'
 
 Then even if the slave falls out of sync, and the master removes the WAL 
 segment, as long as you can still retrieve the WAL files, then it can bring 
 itself back into sync.
 
 If the archive command is also set so that the restore command has a
 file to retrieve, then yes it will work that way.

Yeah it is, it's actually pulling the file down.

Glad that's how it's supposed to work. I'd rather be unlucky than crazy.  =)





Re: [GENERAL] Replication: How to query current segments allocation relative to Wal keep segments?

2015-01-05 Thread Edson Carlos Ericksson Richter

On 05-01-2015 02:08, Michael Paquier wrote:

On Sun, Jan 4, 2015 at 1:48 AM, Edson Carlos Ericksson Richter
edsonrich...@hotmail.com wrote:

How do I query current segment allocation relative to wal_keep_segments on
each master server?

What is your server version? You can have a look at
pg_stat_replication on the master which contains information about the
WAL segments written, flushed and replayed on each slave:
http://www.postgresql.org/docs/devel/static/monitoring-stats.html#PG-STAT-REPLICATION-VIEW


In production servers, 9.3.5.
In development servers, 9.4.0.

I had a quick look at this view before:

Example (on 9.3.5):
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 2808
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 10.68.73.1
client_hostname  |
client_port      | 36075
backend_start    | 2015-01-03 20:17:48.53706-02
state            | streaming
sent_location    | 22/F94D1A90
write_location   | 22/F94D1A90
flush_location   | 22/F94D1A90
replay_location  | 22/F94D1A90
sync_priority    | 0
sync_state       | async
(1 row)

but in either case (9.3.5, 9.4.0), I get lots of info, but not the count
of wal_segments consumed.

Would this kind of count be recorded somewhere else?
How does the server know that the wal_segments have been exhausted?

Thanks,

Edson






Re: [GENERAL] Replication: How to query current segments allocation relative to Wal keep segments?

2015-01-05 Thread Michael Paquier
On Mon, Jan 5, 2015 at 6:51 PM, Edson Carlos Ericksson Richter
edsonrich...@hotmail.com wrote:
 Would this kind of count be recorded somewhere else?
 How does the server know that the wal_segments have been exhausted?
You should evaluate the amount of wal_keep_segments necessary using
the replication lag in terms of WAL position differences (LSN) between
the master and its slaves. pg_stat_replication gives you the WAL
position (LSN) up to which each slave has received WAL information.
Combine it with pg_current_xlog_location() to determine the current
location where the master is writing WAL, and you can evaluate the number
of WAL files that need to be retained on the master. Knowing that each WAL
file is normally 16MB, simply use pg_xlog_location_diff to calculate
the WAL lag as a difference in bytes (for 9.4 a simple difference
operation is possible with the data type pg_lsn), and then derive from
it the number of WAL files that are actually necessary.

If you care that much about WAL retention, btw, consider using
replication slots with 9.4; just be careful to monitor the partition
where pg_xlog sits.
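
A sketch of the kind of query this suggests, run on the master (16777216 =
16MB per WAL file; assumes wal_keep_segments > 0):

select application_name, client_addr,
       pg_xlog_location_diff(pg_current_xlog_location(), replay_location)
           as lag_bytes,
       round(100 * pg_xlog_location_diff(pg_current_xlog_location(),
                                         replay_location)
             / (16777216.0 * current_setting('wal_keep_segments')::int), 1)
           as pct_of_wal_keep_segments
from pg_stat_replication;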
-- 
Michael




Re: [GENERAL] Replication: How to query current segments allocation relative to Wal keep segments?

2015-01-05 Thread Edson Richter

On 05-01-2015 10:02, Michael Paquier wrote:

On Mon, Jan 5, 2015 at 6:51 PM, Edson Carlos Ericksson Richter
edsonrich...@hotmail.com wrote:

Would this kind of count be recorded somewhere else?
How does the server know that the wal_segments have been exhausted?

You should evaluate the amount of wal_keep_segments necessary using
the replication lag in terms of WAL position differences (LSN) between
the master and its slaves. pg_stat_replication gives you the WAL
position (LSN) up to which each slave has received WAL information.
Combine it with pg_current_xlog_location() to determine the current
location where the master is writing WAL, and you can evaluate the number
of WAL files that need to be retained on the master. Knowing that each WAL
file is normally 16MB, simply use pg_xlog_location_diff to calculate
the WAL lag as a difference in bytes (for 9.4 a simple difference
operation is possible with the data type pg_lsn), and then derive from
it the number of WAL files that are actually necessary.

If you care that much about WAL retention, btw, consider using
replication slots with 9.4; just be careful to monitor the partition
where pg_xlog sits.
Despite that being a completely valid statement, I have two contrary
thoughts about it:


1) I cannot migrate production servers at will. It requires long
planning, and will probably happen only in one or two years from now
(year end 2015 or 2016).
2) I prefer to monitor how many WAL segments I'm really using (and
need), and then set the limit about 25% above that, rather than giving it
a chance to blow my disk space (I mean, it is preferable to stop replication
than to put the whole database at risk because of disk space).


Thanks,

Edson





Re: [GENERAL] Replication: How to query current segments allocation relative to Wal keep segments?

2015-01-04 Thread Michael Paquier
On Sun, Jan 4, 2015 at 1:48 AM, Edson Carlos Ericksson Richter
edsonrich...@hotmail.com wrote:
 How do I query current segment allocation relative to wal_keep_segments on
 each master server?
What is your server version? You can have a look at
pg_stat_replication on the master which contains information about the
WAL segments written, flushed and replayed on each slave:
http://www.postgresql.org/docs/devel/static/monitoring-stats.html#PG-STAT-REPLICATION-VIEW
-- 
Michael




[GENERAL] Replication: How to query current segments allocation relative to Wal keep segments?

2015-01-03 Thread Edson Carlos Ericksson Richter
I'm maintaining async replication (streaming) between four database
servers arranged 2 x 2.
How do I query current segment allocation relative to wal_keep_segments
on each master server?
I want to add this query to Postbix in order to monitor whether the
wal_keep_segments parameter is too short, e.g. to alert at 70%.


Thanks,

Edson





[GENERAL] Replication failover on windows

2014-05-30 Thread CS_DBA

Hi all;

we have a client running PostgreSQL on Windows, and they want to run
streaming replication with some sort of failover.


We have streaming replication in place; we thought we could use
pgbouncer, and in the case of the master being down our heartbeat script
would reload the pgbouncer configs. However, pgbouncer on Windows seems to
never time out connections even when we set it to time out, so this
solution is probably a no go.


We are currently looking at IP aliases, but it's Windows, so I have little
confidence that it will be simple or stable.


Has anyone else deployed a heartbeat/failover solution on Windows
platforms? Any suggestions for an approach that would be seamless to the
applications?



Thanks in advance







Re: [GENERAL] replication timeout in pg_basebackup

2014-03-11 Thread Aggarwal, Ajay
That's exactly what I was thinking after all other experiments. A couple of
questions:
1) Why did you say that 300 seconds is the upper limit? Is this enforced by
Postgres? What if I want to set it to 10 minutes?
2) What's the downside of a bigger replication timeout?
Thanks.

Ajay

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] 
on behalf of John R Pierce [pie...@hogranch.com]
Sent: Monday, March 10, 2014 9:58 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup

On 3/9/2014 6:52 PM, Aggarwal, Ajay wrote:
Our replication timeout is the default 60 seconds. If we increase the
replication timeout to, say, 180 seconds, we see better results, but backups
still fail occasionally.

so increase it to 300 seconds, or whatever.   that's an upper limit, it needs to
be big enough that you DON'T get into problems when doing stuff like basebackups.






--
john r pierce  37N 122W
somewhere on the middle of the left coast


Re: [GENERAL] replication timeout in pg_basebackup

2014-03-11 Thread John R Pierce

On 3/11/2014 5:50 AM, Aggarwal, Ajay wrote:
That's exactly what I was thinking after all other experiments. A couple
of questions:
1) Why did you say that 300 seconds is the upper limit? Is this
enforced by Postgres? What if I want to set it to 10 minutes?

2) What's the downside of a bigger replication timeout?



I said, set it to 300 or whatever.   An unfortunate extra comma confused
my meaning.  What I meant was: whatever you set it to, that's an upper limit.


As I understand it, that timeout is how long the replication can lag the
server before the server decides to stop replication.  With it at 300,
under heavy load, the replication could run as much as 5 minutes (300
seconds) behind before it errors out.
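
For reference, the knob being discussed (a sketch; on 9.2 the parameter is
named replication_timeout, renamed to wal_sender_timeout in 9.3):

# postgresql.conf on the master
replication_timeout = 300s  # drop replication connections inactive longer than this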




--
john r pierce  37N 122W
somewhere on the middle of the left coast





Re: [GENERAL] replication timeout in pg_basebackup

2014-03-10 Thread Aggarwal, Ajay
Thanks Hari Babu.

I think what is happening is that my dirty cache builds up quickly for the
volume where I am backing up. This triggers a flush of these dirty pages to
the disk. While this flush is going on, pg_basebackup tries to do fsync() on a
received WAL file and gets blocked.

While in this state, i.e. when the dirty page count is high, the following are
the results of pg_test_fsync:


# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync  16.854 ops/sec
fdatasync  15.242 ops/sec
fsync   0.187 ops/sec
fsync_writethrough                n/a
open_sync  14.747 ops/sec

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync   6.137 ops/sec
fdatasync  14.899 ops/sec
fsync   0.007 ops/sec
fsync_writethrough                n/a
open_sync   1.450 ops/sec

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
 1 * 16kB open_sync write  13.486 ops/sec
 2 *  8kB open_sync writes  6.006 ops/sec
 4 *  4kB open_sync writes  3.446 ops/sec
 8 *  2kB open_sync writes  1.400 ops/sec
16 *  1kB open_sync writes  0.859 ops/sec

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 0.009 ops/sec
write, close, fsync 0.008 ops/sec

Non-Sync'ed 8kB writes:
write   99415.368 ops/sec


However, when backups are not going on and the dirty page count is low, below
are the results of this test:

# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync1974.243 ops/sec
fdatasync1410.804 ops/sec
fsync 181.129 ops/sec
fsync_writethrough                n/a
open_sync 547.389 ops/sec

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 290.109 ops/sec
fdatasync 962.378 ops/sec
fsync 158.987 ops/sec
fsync_writethrough                n/a
open_sync 642.309 ops/sec

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
 1 * 16kB open_sync write1014.456 ops/sec
 2 *  8kB open_sync writes627.964 ops/sec
 4 *  4kB open_sync writes340.313 ops/sec
 8 *  2kB open_sync writes173.581 ops/sec
16 *  1kB open_sync writes103.236 ops/sec

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close   244.670 ops/sec
write, close, fsync   207.248 ops/sec

Non-Sync'ed 8kB writes:
write   202216.900 ops/sec



From: Haribabu Kommi [kommi.harib...@gmail.com]
Sent: Monday, March 10, 2014 1:42 AM
To: Aggarwal, Ajay
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup

On Mon, Mar 10, 2014 at 12:52 PM, Aggarwal, Ajay 
aaggar...@verizon.commailto:aaggar...@verizon.com wrote:
Our environment: Postgres version 9.2.2 running on CentOS 6.4

Our backups using pg_basebackup are frequently failing with the following error:

pg_basebackup: could not send feedback packet: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

We are invoking pg_basebackup with these arguments: pg_basebackup -D
backup_dir -X stream -l backup_dir

In the postgres logs we see this message: terminating walsender process due to
replication timeout.

Our replication timeout is the default 60 seconds. If we increase the
replication timeout to, say, 180 seconds, we see better results

Re: [GENERAL] replication timeout in pg_basebackup

2014-03-10 Thread Haribabu Kommi
On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay aaggar...@verizon.com wrote:
 Thanks Hari Babu.

 I think what is happening is that my dirty cache builds up quickly for the
 volume where I am backing up. This triggers a flush of these dirty pages
 to the disk. While this flush is going on, pg_basebackup tries to do fsync()
 on a received WAL file and gets blocked.

But the sync is executed every time a WAL file finishes. Is your database
big in size?
Does your setup have write-heavy operations?

In Linux, when it tries to write a bunch of buffers at once, the fsync
call might block for some time.
In the following link there are some Tuning Recommendations for
write-heavy operations which might be useful to you.

http://www.westnet.com/~gsmith/content/linux-pdflush.htm
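
A hypothetical starting point of the kind that page discusses (values are
illustrative only, not a recommendation):

# /etc/sysctl.conf
vm.dirty_background_ratio = 5   # start background writeback earlier
vm.dirty_ratio = 10             # cap dirty pages before writers block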

Any other ideas to handle this kind of problem?

Regards,
Hari Babu
Fujitsu Australia




Re: [GENERAL] replication timeout in pg_basebackup

2014-03-10 Thread Aggarwal, Ajay
I have already tried experimenting with Linux dirty_ratio etc. You can only
fine-tune up to a limit. The backup process still fills up the buffer cache
very quickly. Yes, my database is about 5-6 GB in size and will grow bigger
over time.

I wish there was a way to slow down pg_basebackup or force it to use direct
I/O.

From: Haribabu Kommi [kommi.harib...@gmail.com]
Sent: Monday, March 10, 2014 8:31 PM
To: Aggarwal, Ajay
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup

On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay aaggar...@verizon.com wrote:
 Thanks Hari Babu.

 I think what is happening is that my dirty cache builds up quickly for the
 volume where I am backing up. This triggers a flush of these dirty pages
 to the disk. While this flush is going on, pg_basebackup tries to do fsync()
 on a received WAL file and gets blocked.

But the sync is executed every time a WAL file finishes. Is your database
big in size?
Does your setup have write-heavy operations?

In Linux, when it tries to write a bunch of buffers at once, the fsync
call might block for some time.
In the following link there are some Tuning Recommendations for
write-heavy operations which might be useful to you.

http://www.westnet.com/~gsmith/content/linux-pdflush.htm

Any other ideas to handle this kind of problem?

Regards,
Hari Babu
Fujitsu Australia




Re: [GENERAL] replication timeout in pg_basebackup

2014-03-10 Thread John R Pierce

On 3/9/2014 6:52 PM, Aggarwal, Ajay wrote:
Our replication timeout is the default 60 seconds. If we increase the
replication timeout to, say, 180 seconds, we see better results, but backups
still fail occasionally.


so increase it to 300 seconds, or whatever.   that's an upper limit, it
needs to be big enough that you DON'T get into problems when doing stuff
like basebackups.






--
john r pierce  37N 122W
somewhere on the middle of the left coast



[GENERAL] replication timeout in pg_basebackup

2014-03-09 Thread Aggarwal, Ajay
Our environment: Postgres version 9.2.2 running on CentOS 6.4

Our backups using pg_basebackup are frequently failing with the following error:

pg_basebackup: could not send feedback packet: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

We are invoking pg_basebackup with these arguments: pg_basebackup -D
backup_dir -X stream -l backup_dir

In the postgres logs we see this message: terminating walsender process due to
replication timeout.

Our replication timeout is the default 60 seconds. If we increase the
replication timeout to, say, 180 seconds, we see better results, but backups
still fail occasionally.

Running strace on the pg_basebackup process, we see that the fsync() call takes
significant time and could be responsible for causing this timeout in postgres.

Has anybody else run into the same issue? Is there a way to run pg_basebackup
without fsync()?




Re: [GENERAL] replication timeout in pg_basebackup

2014-03-09 Thread Haribabu Kommi
On Mon, Mar 10, 2014 at 12:52 PM, Aggarwal, Ajay aaggar...@verizon.comwrote:

  Our environment: Postgres version 9.2.2 running on CentOS 6.4

 Our backups using pg_basebackup are frequently failing with the following error:

 pg_basebackup: could not send feedback packet: server closed the connection
 unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.

 We are invoking pg_basebackup with these arguments: pg_basebackup -D
 backup_dir -X stream -l backup_dir

 In the postgres logs we see this message: terminating walsender process
 due to replication timeout.

 Our replication timeout is the default 60 seconds. If we increase the
 replication timeout to, say, 180 seconds, we see better results, but backups
 still fail occasionally.

 Running strace on the pg_basebackup process, we see that the fsync() call
 takes significant time and could be responsible for causing this timeout in
 postgres.


Use the pg_test_fsync utility which is available in postgresql contrib
module to test your system sync methods performance.


 Has anybody else run into the same issue? Is there a way to run
 pg_basebackup without fsync() ?


As of now there are no such options available; I feel it is better to find
out why the sync is taking so long.

Regards,
Hari Babu
Fujitsu Australia


Re: [GENERAL] Replication failed after stalling

2013-12-31 Thread Albe Laurenz
Sergey Konoplev wrote:
 On Mon, Dec 30, 2013 at 12:27 AM, Albe Laurenz laurenz.a...@wien.gv.at 
 wrote:
 Joe Van Dyk wrote:
  If I run COPY (select * from complicate_view) to stdout on the standby,
  I've noticed that it sometimes
  halts replication updates to the slave.

 For example, that's happening right now and now() - 
 pg_last_xact_replay_timestamp() is 22 minutes.
 There's many transactions per second being committed on the master. Once 
 that query is canceled, the
 slave catches up immediately.

 You have hot_standby_feedback = on, right?

 In that case that is expected behaviour.
 Some change on the master conflicted with the query on the standby,
 perhaps with a tuple cleaned up after a HOT update.  Replication will
 stall until the query is done.
 
 IIRC, the applying process is paused but the receiving one is going on
 in this case, isn't it?

Frankly, I don't know.

Yours,
Laurenz Albe



Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Albe Laurenz
Joe Van Dyk wrote:
 If I run COPY (select * from complicate_view) to stdout on the standby,
 I've noticed that it sometimes
 halts replication updates to the slave.
 
 For example, that's happening right now and now() - 
 pg_last_xact_replay_timestamp() is 22 minutes.
 There's many transactions per second being committed on the master. Once that 
 query is canceled, the
 slave catches up immediately.

You have hot_standby_feedback = on, right?

In that case that is expected behaviour.
Some change on the master conflicted with the query on the standby,
perhaps with a tuple cleaned up after a HOT update.  Replication will
stall until the query is done.

If that is unacceptable to you and you would rather have queries
canceled on the standby if they take too long, set hot_standby_feedback
back to off and increase vacuum_defer_cleanup_age.
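
A sketch of that alternative in postgresql.conf (the number is illustrative
and workload-dependent):

hot_standby_feedback = off
vacuum_defer_cleanup_age = 100000  # set on the master; unit is transactions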

But it is not possible to have a predictable maximum replication lag
and arbitrarily long running queries on the standby at the same time.

Yours,
Laurenz Albe



Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Sergey Konoplev
On Mon, Dec 30, 2013 at 12:27 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 Joe Van Dyk wrote:
 If I run COPY (select * from complicate_view) to stdout on the standby,
 I've noticed that it sometimes
 halts replication updates to the slave.

 For example, that's happening right now and now() - 
 pg_last_xact_replay_timestamp() is 22 minutes.
 There's many transactions per second being committed on the master. Once 
 that query is canceled, the
 slave catches up immediately.

 You have hot_standby_feedback = on, right?

 In that case that is expected behaviour.
 Some change on the master conflicted with the query on the standby,
 perhaps with a tuple cleaned up after a HOT update.  Replication will
 stall until the query is done.

IIRC, the applying process is paused but the receiving one is going on
in this case, isn't it?

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-30 Thread Sergey Konoplev
On Mon, Dec 9, 2013 at 3:13 PM, Dmitry Koterov dmi...@koterov.ru wrote:
 Is there a way to compress the traffic between master and slave during the
 replication?.. The streaming gzip would be quite efficient for that.

Take a look at the ssh_tunnel.sh [1] tool. This is a wrapper around
SSH tunnel with compression, a watchdog and lock management. Very
useful for cross data center streaming.

[1] 
https://github.com/grayhemp/pgcookbook/blob/master/ssh_tunnel_with_compression_setup.md
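
For reference, such a tunnel boils down to something like this (host and user
are placeholders; -C enables compression, -N runs no remote command):

ssh -N -C -L 5433:localhost:5432 postgres@master-host

with the standby's primary_conninfo then pointed at 'host=localhost
port=5433'.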

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Joe Van Dyk
On Mon, Dec 30, 2013 at 10:27 AM, Sergey Konoplev gray...@gmail.com wrote:

 On Mon, Dec 30, 2013 at 12:02 AM, Joe Van Dyk j...@tanga.com wrote:
  On Sun, Dec 29, 2013 at 10:52 PM, Sergey Konoplev gray...@gmail.com
 wrote:
  On Sun, Dec 29, 2013 at 9:56 PM, Joe Van Dyk j...@tanga.com wrote:
   On Wed, Dec 18, 2013 at 3:39 PM, Sergey Konoplev gray...@gmail.com
   wrote:
    If I run COPY (select * from complicate_view) to stdout on the
    standby,
    I've noticed that it sometimes halts replication updates to the slave.
  
   For example, that's happening right now and now() -
   pg_last_xact_replay_timestamp() is 22 minutes. There's many
   transactions
   per second being committed on the master. Once that query is canceled,
   the
   slave catches up immediately.
 
  And what
 
  \x
  select * from pg_stat_replication;
 
  shows?
 
  on the master, right?

 Yes.

 And it would be very useful to take a look at your checkpoints and
 replication configuration parameters on both master and replica.


Master and replica have the same settings.

checkpoint_completion_target: 0.9
checkpoint_segments: 16
checkpoint_timeout: 5m
checkpoint_warning: 30s
hot_standby: on
hot_standby_feedback: on

pid  | 10736
usesysid | 10
usename  | postgres
application_name | walreceiver
client_addr  | the ip
client_hostname  |
client_port  | 47124
backend_start| 2013-12-30 12:08:42.967868-08
state| streaming
sent_location| 410/BC152000
write_location   | 410/BC152000
flush_location   | 410/BC152000
replay_location  | 410/A758B7D0
sync_priority| 0
sync_state   | async
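
The gap between sent_location and replay_location can be expressed in
bytes with pg_xlog_location_diff() (a sketch; 9.3's locations are text,
which the function accepts):

select sent_location, replay_location,
       pg_xlog_location_diff(sent_location, replay_location) as bytes_behind
from pg_stat_replication;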


Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Sergey Konoplev
On Mon, Dec 30, 2013 at 8:56 PM, Joe Van Dyk j...@tanga.com wrote:
 On Mon, Dec 30, 2013 at 10:27 AM, Sergey Konoplev gray...@gmail.com wrote:
 On Mon, Dec 30, 2013 at 12:02 AM, Joe Van Dyk j...@tanga.com wrote:
  On Sun, Dec 29, 2013 at 10:52 PM, Sergey Konoplev gray...@gmail.com
  wrote:
  On Sun, Dec 29, 2013 at 9:56 PM, Joe Van Dyk j...@tanga.com wrote:
   On Wed, Dec 18, 2013 at 3:39 PM, Sergey Konoplev gray...@gmail.com
   wrote:
   If I run COPY (select * from complicate_view) to stdout on the
   standby,
   I've noticed that sometimes halts replication updates to the slave.
 
  \x
  select * from pg_stat_replication;

 And it would be very useful to take a look at your checkpoints and
 replication configuration parameters on both master and replica.

 master and replica have same settings.

 checkpoint_completion_target: 0.9
 checkpoint_segments: 16
 checkpoint_timeout: 5m
 checkpoint_warning: 30s
 hot_standby: on
 hot_standby_feedback: on

I meant all the replication settings, see [1]. And pg_stat_statements
when there is a problem, preferably when the error occurs, because when
everything is okay it is not very useful.

[1] http://www.postgresql.org/docs/9.3/static/runtime-config-replication.html

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Scott Marlowe
On Wed, Dec 18, 2013 at 1:51 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 On 12/18/2013 12:15 PM, Joe Van Dyk wrote:

 A possibly related question:

 I've set wal_keep_segments to 10,000 and also have archive_command
 running wal-e. I'm seeing my wal files disappear from pg_xlog after 30
 minutes. Is that expected? Is there a way around that?


 Well, a WAL segment is 16MB in size, so that should give you a basis for
 determining whether the above is appropriate; my guess is it is not. I do
 not know enough about wal-e, but my guess is it is siphoning off WAL
 segments before you want it to.

Don't some operations, like forced checkpoints, skip to the next WAL
segment, resulting in segments not necessarily being full?




Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Joe Van Dyk
On Mon, Dec 30, 2013 at 9:11 PM, Sergey Konoplev gray...@gmail.com wrote:

 On Mon, Dec 30, 2013 at 8:56 PM, Joe Van Dyk j...@tanga.com wrote:
  On Mon, Dec 30, 2013 at 10:27 AM, Sergey Konoplev gray...@gmail.com
 wrote:
  On Mon, Dec 30, 2013 at 12:02 AM, Joe Van Dyk j...@tanga.com wrote:
   On Sun, Dec 29, 2013 at 10:52 PM, Sergey Konoplev gray...@gmail.com
   wrote:
   On Sun, Dec 29, 2013 at 9:56 PM, Joe Van Dyk j...@tanga.com wrote:
On Wed, Dec 18, 2013 at 3:39 PM, Sergey Konoplev 
 gray...@gmail.com
wrote:
If I run COPY (select * from complicate_view) to stdout on the
standby,
I've noticed that sometimes halts replication updates to the slave.
  
   \x
   select * from pg_stat_replication;
 
  And it would be very useful to take a look at your checkpoints and
  replication configuration parameters on both master and replica.
 
  master and replica have same settings.
 
  checkpoint_completion_target: 0.9
  checkpoint_segments: 16
  checkpoint_timeout: 5m
  checkpoint_warning: 30s
  hot_standby: on
  hot_standby_feedback: on

 I meant all the replication settings, see [1]. And pg_stat_statements
 when there is a problem, preferably when the error occurs, because when
 everything is okay it is not very useful.


I don't understand how pg_stat_statements is helpful here, and what error?


 [1]
 http://www.postgresql.org/docs/9.3/static/runtime-config-replication.html


max_wal_senders: 5
wal_keep_segments: 1
wal_sender_timeout: 1m
synchronous_standby_names: n/a
vacuum_defer_cleanup_age: 0
max_standby_archive_delay: 30s
max_standby_streaming_delay: -1
wal_receiver_status_interval: 10s
hot_standby_feedback: on
wal_receiver_timeout: 1m


Re: [GENERAL] Replication failed after stalling

2013-12-30 Thread Sergey Konoplev
On Mon, Dec 30, 2013 at 10:05 PM, Joe Van Dyk j...@tanga.com wrote:
 I meant all the replication settings, see [1]. And pg_stat_statements
 when there is a problem, preferably when the error occurs, because when
 everything is okay it is not very useful.

 I don't understand how pg_stat_statements is helpful here, and what error?

The error you showed in the initial email.

My guess is that the master might stop sending WAL records to the
replica, which is why I wanted to check the stat_replication query. Oh,
yes, and I forgot to put current_xlog_location in the query, so the
correct one is below.

\x
select pg_current_xlog_location(), * from pg_stat_replication;

 checkpoint_completion_target: 0.9
 checkpoint_segments: 16
 checkpoint_timeout: 5m
 checkpoint_warning: 30s
[...]
 max_wal_senders: 5
 wal_keep_segments: 1
 vacuum_defer_cleanup_age: 0
 max_standby_archive_delay: 30s
 max_standby_streaming_delay: -1
 wal_receiver_status_interval: 10s
 hot_standby_feedback: on
[...]

That wal_keep_segments = 1 looks weird, and I would increase
checkpoint_segments and checkpoint_timeout, but first let us check how
often checkpoints and checkpoint warnings happen on the master. You can
see it in the logs; turn log_checkpoints on if it is off.
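
A minimal sketch of that check (the log path is a placeholder):

# postgresql.conf on the master
log_checkpoints = on

# then see how often checkpoints run and whether warnings show up:
grep checkpoint /path/to/postgresql.log | tail -n 20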

And also check how much WAL your system generates and over what period.

ls -lt /path/to/pg_xlog/ | wc -l
ls -lt /path/to/pg_xlog/ | head
ls -lt /path/to/pg_xlog/ | tail

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication failed after stalling

2013-12-29 Thread Joe Van Dyk
On Wed, Dec 18, 2013 at 3:39 PM, Sergey Konoplev gray...@gmail.com wrote:

 On Wed, Dec 18, 2013 at 11:26 AM, Joe Van Dyk j...@tanga.com wrote:
  I'm running Postgresql 9.3. I have a streaming replication server.
 Someone
  was running a long COPY query (8 hours) on the standby which halted
  replication. The replication stopped at 3:30 am. I canceled the
 long-running
  query at 9:30 am and replication data started catching up.

What do you mean by "COPY on the standby halted replication"?


If I run COPY (select * from complicate_view) TO STDOUT on the standby,
I've noticed that it sometimes halts replication updates to the slave.

For example, that's happening right now and now() -
pg_last_xact_replay_timestamp() is 22 minutes. There's many transactions
per second being committed on the master. Once that query is canceled, the
slave catches up immediately.

Joe


Re: [GENERAL] Replication failed after stalling

2013-12-29 Thread Sergey Konoplev
On Sun, Dec 29, 2013 at 9:56 PM, Joe Van Dyk j...@tanga.com wrote:
 On Wed, Dec 18, 2013 at 3:39 PM, Sergey Konoplev gray...@gmail.com wrote:

 On Wed, Dec 18, 2013 at 11:26 AM, Joe Van Dyk j...@tanga.com wrote:
  I'm running Postgresql 9.3. I have a streaming replication server.
  Someone
  was running a long COPY query (8 hours) on the standby which halted
  replication. The replication stopped at 3:30 am. I canceled the
  long-running
  query at 9:30 am and replication data started catching up.

 What do you mean by "COPY on the standby halted replication"?

 If I run COPY (select * from complicate_view) TO STDOUT on the standby,
 I've noticed that it sometimes halts replication updates to the slave.

 For example, that's happening right now and now() -
 pg_last_xact_replay_timestamp() is 22 minutes. There's many transactions
 per second being committed on the master. Once that query is canceled, the
 slave catches up immediately.

And what

\x
select * from pg_stat_replication;

shows?

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-18 Thread Rahila Syed
Hello,

Yes, gzip compression can be used for compressing WAL traffic during
streaming replication. The following tools can be used in this regard.
SSL compression: SSL support is built into PostgreSQL. You need to ensure
you have OpenSSL library support in your PostgreSQL installation.
Also, you can compress WAL traffic by setting up SSH tunneling between
master and standby and turning on compression while setting up the SSH
tunnel. The following link covers the same:
http://www.postgresql.org/docs/9.3/static/ssh-tunnels.html
Of course, these are solutions outside PostgreSQL.
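
For the SSL route, the knob lives in the standby's connection string; a
minimal sketch, with host and user as placeholders (it only compresses
when OpenSSL on both ends was built with zlib support):

# recovery.conf on the standby
primary_conninfo = 'host=db-master port=5432 user=replication sslmode=require sslcompression=1'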


On Tue, Dec 10, 2013 at 4:43 AM, Dmitry Koterov dmi...@koterov.ru wrote:

 Hello.

 Is there a way to compress the traffic between master and slave during the
 replication?.. The streaming gzip would be quite efficient for that.

 (WAL archiving is not too good for this purpose because of high lag. I
 just need to minimize the cross-datacenter traffic keeping the replication
 lag low.)



[GENERAL] Replication failed after stalling

2013-12-18 Thread Joe Van Dyk
I'm running PostgreSQL 9.3 and have a streaming replication server. Someone
was running a long COPY query (8 hours) on the standby, which halted
replication. The replication stopped at 3:30 am. I canceled the
long-running query at 9:30 am and replication data started catching up.

The data up until 10 am got restored fine (took until 10:30 am to restore
that much). Then I started getting errors like FATAL:  could not receive
data from WAL stream: ERROR:  requested WAL segment
000103C30086 has already been removed.

I'm confused about how pg could restore data from 3:30 am to 10 am, then
start complaining about missing WAL files.

What's the best way to avoid this problem? Increase wal_keep_segments?

Joe


Re: [GENERAL] Replication failed after stalling

2013-12-18 Thread Joe Van Dyk
A possibly related question:

I've set wal_keep_segments to 10,000 and also have archive_command running
wal-e. I'm seeing my WAL files disappear from pg_xlog after 30 minutes. Is
that expected? Is there a way around that?

(I want to use streaming replication and wal-e for PITR restores)


On Wed, Dec 18, 2013 at 11:26 AM, Joe Van Dyk j...@tanga.com wrote:

 I'm running Postgresql 9.3. I have a streaming replication server. Someone
 was running a long COPY query (8 hours) on the standby which halted
 replication. The replication stopped at 3:30 am. I canceled the
 long-running query at 9:30 am and replication data started catching up.

 The data up until 10 am got restored fine (took until 10:30 am to restore
 that much). Then I started getting errors like FATAL:  could not receive
 data from WAL stream: ERROR:  requested WAL segment
 000103C30086 has already been removed.

 I'm confused about how pg could restore data from 3:30 am to 10 am, then
 start complaining about missing WAL files.

 What's the best way to avoid this problem? Increase wal_keep_segments?

 Joe



Re: [GENERAL] Replication failed after stalling

2013-12-18 Thread Jerry Sievers
Joe Van Dyk j...@tanga.com writes:

 I'm running Postgresql 9.3. I have a streaming replication server. Someone 
 was running a long COPY query (8 hours) on the standby which halted 
 replication. The
 replication stopped at 3:30 am. I canceled the long-running query at 9:30 am 
 and replication data started catching up.

 The data up until 10 am got restored fine (took until 10:30 am to restore 
 that much). Then I started getting errors like FATAL:  could not receive 
 data from WAL
 stream: ERROR:  requested WAL segment 000103C30086 has already 
 been removed.

 I'm confused about how pg could restore data from 3:30 am to 10 am, then 
 start complaining about missing WAL files.

 What's the best way to avoid this problem? Increase wal_keep_segments?

Yes and/or implement as a hybrid of streaming and WAL shipping.

Quite simply, your wal_keep_segments was almost enough to get you
through that backlog period, but as your standby was catching up, it hit
a point where there was a gap.

Depending on how much traffic your master sees at various times of the
day, it's unsurprising that during peak loads your grace period is a lot
lower than during off-peak times, due to variations in how quickly WAL
segments are filled and cycled over.

HTH
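
A minimal sketch of that hybrid in recovery.conf on the standby (host and
user are placeholders; the wal-e call is illustrative and assumes wal-e
is installed and configured):

standby_mode = 'on'
primary_conninfo = 'host=db-master port=5432 user=replication'
restore_command = 'wal-e wal-fetch "%f" "%p"'

Streaming is used while the master still has a segment; once a needed
segment has been recycled, the standby falls back to fetching it from the
archive instead of erroring out.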


 Joe


-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800




Re: [GENERAL] Replication failed after stalling

2013-12-18 Thread Adrian Klaver

On 12/18/2013 12:15 PM, Joe Van Dyk wrote:

A possibly related question:

I've set wal_keep_segments to 10,000 and also have archive_command
running wal-e. I'm seeing my wal files disappear from pg_xlog after 30
minutes. Is that expected? Is there a way around that?


Well, a WAL segment is 16MB in size, so that should give you a basis for
determining whether the above is appropriate; my guess is it is not. I do
not know enough about wal-e, but my guess is it is siphoning off WAL
segments before you want it to.
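
(For scale: 10,000 segments x 16 MB is roughly 160 GB of pg_xlog. Unless
the master writes that much WAL in half an hour, segments vanishing after
30 minutes suggests the setting has not actually taken effect.)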





--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Replication failed after stalling

2013-12-18 Thread Sergey Konoplev
On Wed, Dec 18, 2013 at 11:26 AM, Joe Van Dyk j...@tanga.com wrote:
 I'm running Postgresql 9.3. I have a streaming replication server. Someone
 was running a long COPY query (8 hours) on the standby which halted
 replication. The replication stopped at 3:30 am. I canceled the long-running
 query at 9:30 am and replication data started catching up.

What do you mean by "COPY on the standby halted replication"?

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-17 Thread Rahila Syed
Hello,

Yes, gzip compression can be used for compressing WAL traffic during
streaming replication. The following tools can be used in this regard.
SSL compression: SSL support is built into PostgreSQL. You need to ensure
you have OpenSSL library support in your PostgreSQL installation.
Also, you can compress WAL traffic by setting up SSH tunneling between
master and standby and turning on compression while setting up the SSH
tunnel. The following link covers the same:
http://www.postgresql.org/docs/9.3/static/ssh-tunnels.html
Of course, these are solutions outside PostgreSQL.


On Tue, Dec 10, 2013 at 4:43 AM, Dmitry Koterov dmi...@koterov.ru wrote:

 Hello.

 Is there a way to compress the traffic between master and slave during the 
 replication?.. The streaming gzip would be quite efficient for that.

 (WAL archiving is not too good for this purpose because of high lag. I just 
 need to minimize the cross-datacenter traffic keeping the replication lag 
 low.)




[GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-09 Thread Dmitry Koterov
Hello.

Is there a way to compress the traffic between master and slave during
replication? The streaming gzip would be quite efficient for that.

(WAL archiving is not too good for this purpose because of its high lag. I
just need to minimize the cross-datacenter traffic while keeping the
replication lag low.)


Re: [GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-09 Thread Michael Paquier
On Tue, Dec 10, 2013 at 8:13 AM, Dmitry Koterov dmi...@koterov.ru wrote:
 Hello.

 Is there a way to compress the traffic between master and slave during the
 replication?.. The streaming gzip would be quite efficient for that.

 (WAL archiving is not too good for this purpose because of high lag. I just
 need to minimize the cross-datacenter traffic keeping the replication lag
 low.)
Not directly AFAIK, but this would be a good argument for implementing
hooks in walsender and walreceiver that would allow such customization
of the stream sent and received.

Note that there is also a patch pending for 9.4 that would make it
possible to compress full page writes, globally reducing the amount of
WAL produced by the server in exchange for some CPU work to compress
and decompress the data.
-- 
Michael




Re: [GENERAL] Replication: GZIP compression in WAL sender/receiver processes communication?

2013-12-09 Thread John R Pierce

On 12/9/2013 3:13 PM, Dmitry Koterov wrote:
Is there a way to compress the traffic between master and slave during 
the replication?.. The streaming gzip would be quite efficient for that.


(WAL archiving is not too good for this purpose because of high lag. I 
just need to minimize the cross-datacenter traffic keeping the 
replication lag low.)


Run it through an SSH tunnel that has compression turned on?



--
john r pierce  37N 122W
somewhere on the middle of the left coast





Re: [GENERAL] Replication and fsync

2013-10-29 Thread Kevin Grittner
Tomas Vondra t...@fuzzy.cz wrote:

 I think it's safe as long as you don't try to reuse the cluster
 after a crash (be it due to OS error, power outage, ...). If the
 primary crashes for any reasons, you have to start from scratch,
 otherwise there might be silent corruption as you've described.

I agree.  It seems to me that as long as you never try to start the
old master after a crash of the OS (directly or because of hardware
or VM failure), the standby should be usable without risk of
corruption.  As soon as you start the old master after such a crash
though, you could be replicating corrupted blocks; you would need
to be very hard-line about never bringing the old master back up.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[GENERAL] Replication by file syncing and data directory permissions

2013-10-28 Thread Yuri Khan

Hello All,

I am trying to automate the setup of a hot standby replication slave on
Ubuntu Precise, with PostgreSQL 9.1, following the Binary Replication
Tutorial: http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial .


(Distribution-specific conventions: postmaster runs as a dedicated 
system user named postgres; data directory is in 
/var/lib/postgresql/9.1/main; most config files are in 
/etc/postgresql/9.1/main.)


The current workflow is as follows:

* The server’s pg_hba.conf is set up to allow replication connections 
from a specific subnet.
* An administrator logs on to the soon-to-be slave, installs a package that
depends on postgresql-9.1, and runs a shell script, passing it the address
of the server.

* The script’s main part is (error handling omitted for clarity):

===
service postgresql stop
ssh postgres@$MASTER \
    "psql -c \"select pg_start_backup('clone', true);\""
rsync -av --exclude pg_xlog --exclude postgresql.conf \
    --exclude postmaster.pid \
    -e ssh postgres@$MASTER:$DATA_DIR/\* $DATA_DIR
ssh postgres@$MASTER \
    "psql -c \"select pg_stop_backup();\""
rsync -av -e ssh postgres@$MASTER:$DATA_DIR/pg_xlog \
    $DATA_DIR/
cat > $DATA_DIR/recovery.conf <<-EOF
standby_mode = 'on'
primary_conninfo = 'host=$MASTER port=$PORT'
EOF
service postgresql start
===

This works, but leaves a bit to be desired:

* The script uses four SSH connections to the master server. In order to 
avoid multiple password requests, we use public key authentication.
* The two ssh connections that run the start/stop_backup queries could be
replaced with normal psql connections using a special role granted the
right to execute pg_start_backup and pg_stop_backup (see the sketch after
this list).
* The rsync invocations need read and execute (list files) access to the
data directory on the master, and read access to individual files. Write
access is generally not necessary. However, PostgreSQL refuses to start
when the data directory is readable by anyone other than postgres, so we
have to have rsync connect as postgres.
* Thus, the slave has an ssh key allowing superuser access to the master 
database.
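
A sketch of that wrapper approach, run as a superuser on the master (the
role and function names here are made up):

create role standby_backup login password 'secret';

create function run_pg_start_backup() returns text
language sql security definer as
$$ select pg_start_backup('clone', true); $$;

create function run_pg_stop_backup() returns text
language sql security definer as
$$ select pg_stop_backup(); $$;

revoke all on function run_pg_start_backup() from public;
revoke all on function run_pg_stop_backup() from public;
grant execute on function run_pg_start_backup() to standby_backup;
grant execute on function run_pg_stop_backup() to standby_backup;

The standby could then call the wrappers over plain psql connections as
standby_backup instead of connecting as postgres over SSH.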


We would like to harden the access rights of the standby as tightly as 
practical, preferably to read-only on the data directory.


I am aware that it is possible to use pg_basebackup over a replication
connection for the initial synchronization, but that way we won't be able
to resume if the connection drops. Additionally, I'm not sure whether
--gzip applies to the network traffic on the wire or whether pg_basebackup
just zips up the resulting .tar on the receiving side.


What would be the best way to limit the access rights of the slave while 
retaining on-the-wire traffic compression and resumable data transfers?


--
senior developer, 2GIS Unix Team
jabber (gtalk, xmpp): yurivk...@gmail.com




Re: [GENERAL] Replication and fsync

2013-10-26 Thread Tomas Vondra
Hi,

On 24.10.2013 23:18, Alban Hertroys wrote:
 On Oct 24, 2013, at 18:10, maillis...@gmail.com wrote:
 
 Thank you for the answers. I'm still confused. If fsync is not
 replicated to the slave, then how is replication affected by a
 corrupt master? If the master dies and there's a commit recorded in
 the wal log that didn't actually happen, wouldn't the slave still
 be expected to be in a sane state, with the wal logs accurately
 reflecting what's on disk?
 
 Maybe I just don't understand streaming replication enough. The
 docs seem to say that synchronous commits mean that the slave also
 has to verify a write before a transaction is considered complete.
 How does fsync affect the way/order in which statements are sent to
 the slave for replication?
 
 What you're missing is that the master will be replicating corrupt
 data. That is, _if_ it gets corrupted of course. But, data corruption
 in a database has a tendency to go unnoticed for a while.

I think it's safe as long as you don't try to reuse the cluster after a
crash (be it due to OS error, power outage, ...). If the primary crashes
for any reasons, you have to start from scratch, otherwise there might
be silent corruption as you've described.

regards
Tomas



