Re: pg_basebackup + incremental base backups

2020-05-25 Thread Christopher Pereira



On 24-May-20 15:48, Stephen Frost wrote:

That really shouldn't be possible.  I'm very curious as to exactly what
happened that resulted in your primary/replica being 'out of sync', as
you say.


Hi Stephen,

Actually, this was more a hypothetical question: to find a solution in 
case one of our standby clusters someday goes out of sync and we have 
to rebuild it against a very big database.
With proper WAL archiving this shouldn't happen, but we wanted to be 
prepared for that scenario just in case.


We did some tests measuring IO and traffic and are very happy with the 
results. We will definitely be adding pgBackRest to our toolchain.


Regarding my initial question, I still believe the world deserves a 
simple, direct pg_basebackup replacement, even if putting an additional 
"repo host" in the middle is the better idea in the long term.


As you said, all the pieces are there and it would be quite easy to 
write a new "pg_basebackup_delta" script that could be executed on the 
standby host to:


1) setup a pgBackRest repo on the primary host (via SSH)

2) create a backup on the primary host (via SSH)

3) do a delta restore on the standby

Even though the repository on the primary host is only created temporarily 
(and requires double the storage, extra resources, etc.), it may still be 
worth it considering the traffic that can be saved by doing a delta restore 
on a standby host in a different region, right?
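A minimal sketch of what such a script could look like, run on the standby host. The host name, stanza name, and print-only runner are assumptions for illustration; the pgBackRest subcommands themselves (stanza-create, backup, restore --delta) are real:

```shell
#!/bin/sh
# Hypothetical "pg_basebackup_delta" wrapper. Commands are printed
# rather than executed; drop the echo in run() to execute for real.
PRIMARY=primary.example.com
STANZA=main

run() {
    echo "+ $*"
}

# 1) Set up a temporary pgBackRest repository on the primary host (via SSH).
run ssh "$PRIMARY" pgbackrest --stanza="$STANZA" stanza-create

# 2) Create a backup in that repository on the primary host (via SSH).
run ssh "$PRIMARY" pgbackrest --stanza="$STANZA" backup

# 3) Delta-restore locally on the standby: only files that differ are copied.
run pgbackrest --stanza="$STANZA" --delta restore
```

The delta restore in step 3 is where the traffic saving comes from: unchanged files in the standby's data directory are kept as-is.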


Thanks, and congratulations on the good work.





Re: FDW and RLS

2020-05-25 Thread Charles Clavadetscher

Hello

On 2020-05-25 15:50, Laurenz Albe wrote:

On Fri, 2020-05-22 at 08:02 -0500, Ted Toth wrote:

Will RLS be applied to data being retrieved via a FDW?


ALTER FOREIGN TABLE rp_2019 ENABLE ROW LEVEL SECURITY;
ERROR:  "rp_2019" is not a table

Doesn't look good.

Yours,
Laurenz Albe


Actually it does work if you set the policy on the source table on the 
remote server, and access it as the user defined in the user mapping of 
the foreign table.
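For reference, a rough sketch of the DDL behind the transcripts below, reconstructed from the \d and \deu+ output (that fdwsync is subject to the kofadmin policy, e.g. via role membership, is an assumption; the DDL is printed, not executed):

```shell
# Assemble the reconstructed DDL into a variable and print it.
setup_sql=$(cat <<'SQL'
-- On the server that holds the real table (Server 1):
ALTER TABLE public.test_fdw_rls ENABLE ROW LEVEL SECURITY;
CREATE POLICY kofadmin_select ON public.test_fdw_rls
    FOR SELECT TO kofadmin
    USING (username = current_user);

-- On the server with the foreign table: map the local user to the
-- remote role the policy should filter for.
CREATE USER MAPPING FOR kofadmin SERVER kofdb_prod
    OPTIONS ("user" 'fdwsync', password 'mysecret');
SQL
)
printf '%s\n' "$setup_sql"
```

The point is that RLS is enforced on the remote server against the mapped role (fdwsync), not against the local user running the query.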


Server 1:

charles@kofdb.archivedb.5432=# \d public.test_fdw_rls
                Table "public.test_fdw_rls"
  Column  |  Type   | Collation | Nullable | Default
----------+---------+-----------+----------+---------
 id       | integer |           |          |
 content  | text    |           |          |
 username | text    |           |          |
Policies:
    POLICY "kofadmin_select" FOR SELECT
      TO kofadmin
      USING ((username = ("current_user"())::text))

kofadmin@kofdb.archivedb.5432=> \dp public.test_fdw_rls
                                      Access privileges
 Schema |     Name     | Type  |    Access privileges    | Column privileges |                  Policies
--------+--------------+-------+-------------------------+-------------------+-----------------------------------------------
 public | test_fdw_rls | table | charles=arwdDxt/charles+|                   | kofadmin_select (r):                         +
        |              |       | kofadmin=arwd/charles   |                   |   (u): (username = ("current_user"())::text)+
        |              |       |                         |                   |   to: kofadmin


charles@kofdb.archivedb.5432=# SELECT CURRENT_USER; SELECT * FROM 
public.test_fdw_rls;

 current_user
--------------
 charles
(1 row)

 id |             content              | username
----+----------------------------------+----------
  1 | Text for charles                 | charles
  1 | Access from fdw via user fdwsync | fdwsync
(2 rows)

charles@kofdb.archivedb.5432=# set role fdwsync ;
SET
charles@kofdb.archivedb.5432=> SELECT CURRENT_USER; SELECT * FROM 
public.test_fdw_rls;

 current_user
--------------
 fdwsync
(1 row)

 id |             content              | username
----+----------------------------------+----------
  1 | Access from fdw via user fdwsync | fdwsync
(1 row)

On the server accessing the table via FDW:

kofadmin@kofdb.t-archivedb.5432=> \deu+
             List of user mappings
   Server   | User name |               FDW options
------------+-----------+------------------------------------------
 kofdb_prod | kofadmin  | (password 'mysecret', "user" 'fdwsync')

kofadmin@kofdb.t-archivedb.5432=> SELECT CURRENT_USER; SELECT * FROM 
public.test_fdw_rls ;

 current_user
--------------
 kofadmin
(1 row)

 id |             content              | username
----+----------------------------------+----------
  1 | Access from fdw via user fdwsync | fdwsync
(1 row)

Regards
Charles




Re: pg_dump crashes

2020-05-25 Thread Adrian Klaver

On 5/24/20 10:30 PM, Nico De Ranter wrote:
Unfortunately not. I discovered the issue rather late. The last working 
backup is about 2 months old.


Well, first, it is entirely possible that this is not the only corruption 
in the database.


Second, you are probably going to have to reach out to the Bacula folks:

https://www.bacula.org/support/
https://sourceforge.net/projects/bacula/lists/bacula-users

I would say the questions to ask them are:

1) Is the md5 required for a file?

2) If so, how and over what is it calculated?


Then you could experiment with updating the md5 field with new 
data.
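One way to experiment, sketched below: probe the suspect fileid range one row at a time, since a SELECT that touches corrupted TOAST data errors out and thereby identifies the bad row, which can then be updated or deleted individually. The fileid values follow the thread; the stubbed psql function and the "corrupt" id are assumptions so the sketch is self-contained:

```shell
# Stub standing in for the real client; replace this function with an
# actual psql invocation (e.g. psql -Atc "..." bacula) in real use.
psql() {
    case "$2" in
        *2087994667*) return 1 ;;   # pretend this row's TOAST data is corrupt
        *) echo dummy-md5 ;;
    esac
}

# Select the md5 of each row individually; report rows whose SELECT fails.
probe_rows() {
    id=$1
    while [ "$id" -le "$2" ]; do
        if ! psql -Atc "SELECT md5 FROM public.file WHERE fileid = $id" >/dev/null 2>&1; then
            echo "corrupt: fileid=$id"
        fi
        id=$((id + 1))
    done
}

probe_rows 2087994665 2087994669
```

With a real table this is slow over millions of rows, but it narrows "compressed data is corrupted" down to specific fileids that can then be repaired or removed.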



 > Following up on the max(bigint), I tried
 >
 >     SELECT md5 FROM public.file where fileid >2087994666;
 >
 > and got
 >
 >     ERROR:  compressed data is corrupted
 >
 > So it does look like those entries are killing it.  Now for the
 > million-dollar question: how do I get them out?

Do you have a recent previous backup?

 >
 > Nico
 >
 > --
 >
 > Nico De Ranter
 >
 > Operations Engineer
 >



-- 
Adrian Klaver

adrian.kla...@aklaver.com 












Re: FDW and RLS

2020-05-25 Thread Laurenz Albe
On Fri, 2020-05-22 at 08:02 -0500, Ted Toth wrote:
> Will RLS be applied to data being retrieved via a FDW?

ALTER FOREIGN TABLE rp_2019 ENABLE ROW LEVEL SECURITY;
ERROR:  "rp_2019" is not a table

Doesn't look good.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Logical replication troubles

2020-05-25 Thread Peter Eisentraut

On 2020-05-25 10:19, Anders Bøgh Bruun wrote:
Thank you for that clarification. It helps me understand how things work 
a lot better.
I know this might be a bit off topic, but my challenge here is that we 
are using Patroni (via Zalando's postgres-operator for Kubernetes), and 
any replication slot not created by Patroni seems to be removed 
whenever the master pod restarts. We therefore specify in the Patroni 
config that a permanent replication slot should be created, for our use 
in logically replicating some select tables to our data warehouse. That 
means the replication slot is created as soon as the database is ready 
to use, which is also before any tables, data, or publications are 
created. Can you give me a hint as to what the correct way to set this 
up would be?

Or do I need to try contacting the Patroni devs instead?


That would probably be best.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Logical replication troubles

2020-05-25 Thread Anders Bøgh Bruun
Thank you for that clarification. It helps me understand how things work a
lot better.
I know this might be a bit off topic, but my challenge here is that we are
using Patroni (via Zalando's postgres-operator for Kubernetes), and any
replication slot not created by Patroni seems to be removed whenever the
master pod restarts. We therefore specify in the Patroni config that a
permanent replication slot should be created, for our use in logically
replicating some select tables to our data warehouse. That means the
replication slot is created as soon as the database is ready to use, which
is also before any tables, data, or publications are created. Can you give
me a hint as to what the correct way to set this up would be?
Or do I need to try contacting the Patroni devs instead?

On Fri, 22 May 2020 at 11:00, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2020-05-20 17:16, Anders Bøgh Bruun wrote:
> > [67] LOG:  logical replication apply worker for subscription
> > "widgets_sub" has started
> > [67] DEBUG:  connecting to publisher using connection string
> > "dbname=testdb host=master port=5432 user=repuser password=abc123"
> > [67] ERROR:  could not receive data from WAL stream: ERROR:  publication
> > "widgets_pub" does not exist
> >CONTEXT:  slot "my_slot", output plugin "pgoutput", in the change
> > callback, associated LSN 0/1674958
> > [1] DEBUG:  unregistering background worker "logical replication worker
> > for subscription 16396"
> > [1] LOG:  background worker "logical replication worker" (PID 67) exited
> > with exit code 1
> >
> > I can verify that the publication called widgets_pub does exist, and I
> > am not seeing any errors on the sending side.
> >
> > The SQL-file named "works" just has the creation of the replication slot
> > moved down to after I insert some data into the table I want to
> > replicate. And that works as expected.
>
> You need to create the publication before the replication slot.  The
> walsender's view of the world moves along with the WAL it is
> decoding/sending.  So when the subscription worker connects, it
> initially sees a state as of the creation of the replication slot, when
> the publication did not exist yet.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
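The ordering Peter describes can be sketched as follows (database, publication, and slot names are taken from the log excerpt above; the table name widgets and the print-only wrapper are assumptions):

```shell
# Print the psql invocations in the order that works: publication first,
# replication slot second, so the slot's starting point in the WAL
# already includes the publication.
sql() { echo "psql -h master -d testdb -c \"$1\""; }

sql "CREATE PUBLICATION widgets_pub FOR TABLE widgets;"
# Only after the publication exists:
sql "SELECT pg_create_logical_replication_slot('my_slot', 'pgoutput');"
```

Creating the slot first is what produced the "publication does not exist" error, since the walsender decodes from the slot's creation point, where the publication had not yet been created.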


-- 
Anders Bøgh Bruun

Infrastructure Architect

CellPoint digital
cellpointdigital.com
*WE MAKE TRAVEL EASIER™*

M: +45 31 14 87 41
E: and...@cellpointdigital.com

Chicago | *Copenhagen* | Dubai | London | Miami | Pune | Singapore


Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


How can I set up logical replication for all user databases in a cluster,
so that when a new database is added, or new tables are added to a
database, they start replicating automatically?

I think that it would be good if you spend some time reading the
documentation on this stuff, particularly the part about restrictions,
to understand the use cases where that can become useful:
https://www.postgresql.org/docs/devel/logical-replication.html


Thank you.
I read it and haven't found any reference to PITR recovery.
For PITR, it would probably have to save the SQL statements to files and allow 
specifying a recovery target time
for applying them to a base backup.

Is PITR recovery supported only using binary WAL files?

The other limitations can probably be worked around.

Andrus.




Re: Request to help on Query improvement suggestion.

2020-05-25 Thread Laurenz Albe
On Fri, 2020-05-22 at 16:15 +0530, devchef2020 d wrote:
> PostgreSQL : 9.5.15

> Created Indexes on column parental_path:
> =
> 
> CREATE INDEX cable_pair_parental_path_idx
>   ON SCHEMA.TABLE_NAME
>   USING btree
>   (md5(parental_path) COLLATE pg_catalog."default");
>   
> CREATE INDEX cable_pair_parental_path_idx_fulltext
>   ON SCHEMA.TABLE_NAME
>   USING gist
>   (parental_path COLLATE pg_catalog."default");

> SELECT seq_no + 1 FROM SCHEMA.TABLE_NAME WHERE (parental_path LIKE 
> '%,sheath--' || cable_seq_id || ',%' OR parental_path LIKE 'sheath--' || 
> cable_seq_id || ',%' OR parental_path LIKE '%,sheath--' ||
> cable_seq_id OR parental_path = 'sheath--' || cable_seq_id) ORDER BY seq_no 
> DESC LIMIT 1;
> 
> Explain Plan:
> =
> 
> Limit  (cost=108111.60..108111.61 rows=1 width=4) (actual 
> time=4597.605..4597.605 rows=0 loops=1)
>  Output: ((seq_no + 1)), seq_no
>  Buffers: shared hit=2967 read=69606 dirtied=1
>  ->  Sort  (cost=108111.60..108113.09 rows=595 width=4) (actual 
> time=4597.603..4597.603 rows=0 loops=1)
>Output: ((seq_no + 1)), seq_no
>Sort Key: TABLE_NAME.seq_no DESC
>Sort Method: quicksort  Memory: 25kB
>Buffers: shared hit=2967 read=69606 dirtied=1
>->  Seq Scan on SCHEMA.TABLE_NAME  (cost=0.00..108108.63 rows=595 
> width=4) (actual time=4597.595..4597.595 rows=0 loops=1)
>  Output: (seq_no + 1), seq_no
>  Filter: ((TABLE_NAME.parental_path ~~ '%,sheath--64690,%'::text) 
> OR (TABLE_NAME.parental_path ~~ 'sheath--64690,%'::text) OR 
> (TABLE_NAME.parental_path ~~ '%,sheath--64690'::text) OR
> (TABLE_NAME.parental_path = 'sheath--64690'::text))
>  Rows Removed by Filter: 1930188
>  Buffers: shared hit=2967 read=69606 dirtied=1

An index on an expression can only be used if the expression is exactly the 
same as on one
side of an operator in a WHERE condition.
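For example (a sketch using the constant from the plan above), the existing md5 index could only serve an equality predicate spelled with the identical expression; this covers only the exact-match branch of the original query, not the leading-wildcard LIKE branches, which is why the trigram index is suggested:

```shell
# A predicate shape the md5 expression index CAN serve: equality on
# exactly the indexed expression md5(parental_path). Printed, not executed.
q=$(cat <<'SQL'
SELECT seq_no + 1
FROM schema.table_name
WHERE md5(parental_path) = md5('sheath--64690')
ORDER BY seq_no DESC LIMIT 1;
SQL
)
printf '%s\n' "$q"
```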

So your only chance with that query is to hope for a bitmap OR with an index on 
"parental_path".

Two things to try:

1)  CREATE INDEX ON table_name (parental_path text_pattern_ops);

2)  CREATE EXTENSION pg_trgm;
CREATE INDEX ON table_name USING GIN (parental_path gin_trgm_ops);

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread David G. Johnston
On Sunday, May 24, 2020, Andrus  wrote:

> Hi!
>
> Backup in created in Windows from Linux server using pg_receivewal and
>>> pg_basebackup .
>>> Can this backup used for PITR in Linux ?
>>>
>> No.  Physical copies need to be based on the same platform.  If you
>> wish to replicate a cluster without any platform, architecture or even
>> not-too-many major version constraints, there is also logical
>> replication available since v10.
>>
>
> Will logical replication also allow two modes:
>  1. PITR recovery can be used if needed
>  2. Hot standby: user databases in both clusters contain the same data.
>
>
Why are you spending so much effort on this Windows/Linux hybrid setup?  Get
yourself another Linux server and set up physical replication.  It sounds
like it will exactly meet your requirements, and you will waste more time
and money working out alternatives than the server would cost.

David J.


Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Michael Paquier
On Mon, May 25, 2020 at 09:02:49AM +0300, Andrus wrote:
> Will logical replication also allow two modes:
>  1. PITR recovery can be used if needed
>  2. Hot standby: user databases in both clusters contain the same data.
> 
> How to set logical replication for all user databases in cluster so that
> when new database is added or new tables are added to database they will
> start replicate automatically ?

I think that it would be good if you spend some time reading the
documentation on this stuff, particularly the part about restrictions,
to understand the use cases where that can become useful:
https://www.postgresql.org/docs/devel/logical-replication.html
--
Michael




Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


No.  Physical copies need to be based on the same platform.

Does the O/S that the client software runs on really affect this?

To the extent that the O/S determines text sort order, yes; see thread.
The short answer here is that we aren't going to support such cases.
If you try to replicate across platforms, and it works, you're in luck.
If it doesn't work, you get to keep both pieces; we will not accept
that as a bug.


In 2017 Peter wrote that ICU-based collations would be offered alongside the 
libc-based collations (1).
Currently it still requires recompiling Postgres for all binary 
replication platforms.
Maybe the ICU locale will be selected automatically during installation in Postgres 13. Using the same ICU locale on all replication platforms 
will hopefully fix the issue.


Currently, an option is to use ucs_basic as the default collation when creating the cluster.

(1) https://www.2ndquadrant.com/en/blog/icu-support-postgresql-10/

Andrus. 






Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


The backup is created in Windows from a Linux server using pg_receivewal and 
pg_basebackup.
Can this backup be used for PITR in Linux?

No.  Physical copies need to be based on the same platform.  If you
wish to replicate a cluster without any platform, architecture or even
not-too-many major version constraints, there is also logical
replication available since v10.


Will logical replication also allow two modes:
 1. PITR recovery can be used if needed
 2. Hot standby: user databases in both clusters contain the same data.

How can I set up logical replication for all user databases in a cluster, so that when a new database is added, or new tables are added to a 
database, they start replicating automatically?

Will it require a more powerful backup server to replay the main server's SQL 
stream from the different databases?

Andrus.