Re: [GENERAL] postmaster deadlock while logging after syslogger exited

2017-11-16 Thread Michael Paquier
On Fri, Nov 17, 2017 at 11:14 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2017-11-17 11:09:56 +0900, Michael Paquier wrote:
>> when redirection_done is switched to true because the first process
>> generating a message to the syslogger pipe needs to open it first if
>> not done yet?
>
> I can't follow. The syslogger pipe is created when the first syslogger
> is started (before it's forked!). Which happens before other processes
> are created, because they all need to inherit that file descriptor.

Ah, OK. I didn't recall this dependency. Sorry for the confusion.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postmaster deadlock while logging after syslogger exited

2017-11-16 Thread Michael Paquier
On Fri, Nov 17, 2017 at 10:50 AM, Andres Freund  wrote:
> On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
>> David Pacheco  writes:
>> > I ran into what appears to be a deadlock in the logging subsystem.  It
>> > looks like what happened was that the syslogger process exited because it
>> > ran out of memory.  But before the postmaster got a chance to handle the
>> > SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
>> > That also failed, and the postmaster went to log a message about it, but
>> > it's blocked on the pipe that's normally connected to the syslogger,
>> > presumably because the pipe is full because the syslogger is gone and
>> > hasn't read from it.
>>
>> Ugh.
>
> I'm somewhat inclined to say that one has to live with this if the
> system is so resource constrainted that processes barely using memory
> get killed.
>
> We could work around a situation like that if we made postmaster use a
> *different* pipe as stderr than the one we're handing to normal
> backends. If postmaster created a new pipe and closed the read end
> whenever forking a syslogger, we should get EPIPEs when writing after
> syslogger died and could fall back to proper stderr or such.

I don't have the code on top of my mind, but isn't a custom fd causing
a small penalty when redirection_done is switched to true because the
first process generating a message to the syslogger pipe needs to open
it first if not done yet? So you'd need proper locking to save from
race conditions. Or is the first message redirected message always
generated by the postmaster or the syslogger? I don't recall that this
is actually true..
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL walsender process doesn't exist after "pg_ctl stop -m fast"

2017-11-13 Thread Michael Paquier
On Tue, Nov 14, 2017 at 12:05 AM, Tom Lane  wrote:
> y39chen  writes:
>> We encounter one problem that PostgreSQL walsender process doesn't exist
>> after "pg_ctl stop -m fast".
>> Uses PostgreSQL 9.6.2
>
> There was a fix in 9.6.4 that's at least related to this problem.
> It would be interesting to see if you can still reproduce it on
> current 9.6.

Commit that may matter here:
commit: e9d4aa594f2caa8c28d55c41c9926420b1efdb79
author: Tom Lane 
date: Fri, 30 Jun 2017 12:00:03 -0400
Fix walsender to exit promptly if client requests shutdown.

It's possible for WalSndWaitForWal to be asked to wait for WAL that doesn't
exist yet.  That's fine, in fact it's the normal situation if we're caught
up; but when the client requests shutdown we should not keep waiting.
The previous coding could wait indefinitely if the source server was idle.

In passing, improve the rather weak comments in this area, and slightly
rearrange some related code for better readability.

Back-patch to 9.4 where this code was introduced.

Discussion: https://postgr.es/m/14154.1498781...@sss.pgh.pa.us
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] archive_command not being executed

2017-11-13 Thread Michael Paquier
On Tue, Nov 14, 2017 at 8:56 AM, Jeremy Schneider
 wrote:
> From my reading of the docs and commit logs, standby databases
> couldn't archive their WALs until 9.5.

pg_receivexlog is available in 9.3. You could leverage your archives
with it easily, by for example connecting it to a standby you'd like
to get the archives from.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql and github

2017-11-09 Thread Michael Paquier
On Thu, Nov 9, 2017 at 8:27 PM, Poul Kristensen  wrote:
> How come that Postgresql is not present in a github with latest release?
>
> It would be very convenient to deploy PG using Ansible.
>
> Oracle(latest release) is available through github.

You are looking for that, which is a mirror of the official repository
in git.postgresql.org:
https://github.com/postgres/postgres
There is as well a section with release tarballs (so do the facilities
offered by community by the way).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Naming conventions for column names

2017-11-07 Thread Michael Paquier
On Wed, Nov 8, 2017 at 12:41 AM, Alvaro Herrera  wrote:
> Sachin Kotwal wrote:
>> 3. Notify or highlight these changes in release notes because this can
>> break some existing tools and user code.
>
> Notifying people when their tools no longer work with a new server is
> not the problem; they figure that out pretty quickly once they try the
> new version.  The problem is that if you change any names, the
> application developers need to provide version-specific queries that
> work across all the PG versions they want to support.  That leads to
> some pretty horrible code, annoyed developers, bad publicity for Pg
> ("you guys don't love your app developers!"), etc.

This reminds me of pg_stat_activity whose pid column has been renamed,
and this column is widely used for monitoring... My experience on that
is that renaming induces technical debts, but applications would see
queries failing immediately, which is not complicated to grep and fix
properly as frontend code usually has already logic to track the
backend version number, and even if any deeper modification is needed
things like libpq allow to fetch the backend version easily. Column
name of function name changes happen (see xlog -> wal switch in v10),
and community is usually driven by consensus. So if there is a
proposal adopted by a majority of hackers thinking that changing a
column is worth long-term, then it could be considered for
integration. I personally tend to take with a pinch of salt such
proposals though if there are no good reasons behind a switch other
than because it-is-beautiful, so I agree with Álvaro that it is good
to be careful here.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fwd: standby stop replicating, then picked back up

2017-11-07 Thread Michael Paquier
On Wed, Nov 8, 2017 at 5:17 AM, Laurenz Albe  wrote:
> chris kim wrote:
>> I had a standby hang for a while, not replicating, but then it fixed
>> itself but I'm not sure why it happened in the first place. What would I
>> look into to see why this happened, or any insight into why is greatly
>> appreciated.
>
> You give us precious little information.
>
> If there is nothing suspicious in the log, and hot standby is enabled,
> and the standby is configured appropriately, it could be that a conflicting
> query on the standby block WAL application for a while.

I am understanding here the following: if a standby is stopped for a
long time, would it be able to catch up automatically? This is mainly
a matter of WAL segments recycled on the primary (or a standby for
cascading streaming). In short, when the primary completes two
checkpoints, it recycles or renames past WAL segments in pg_xlog that
it does not need for recovery because it is able to recover to a
consistent state.

If the standby uses a replication slot for recovery, then you could
allow a standby to plug in back as long as the primary's pg_xlog does
not get bloated too much, at which point the partition where pg_xlog
is located would cause the primary to go down because of space
exhaustion. Using a WAL archive can be worthy if standbys are taken
down for a long time though, with a proper recovery command, or a WAL
segment range copy, you could allow a standby to recover from an
earlier point. Strategies to adopt mainly depend on if taking a full
backup is more costly than a range of WAL segments, so the data folder
size of the primary instance matters as a decision-making parameter.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [HACKERS] SSL and Encryption

2017-11-02 Thread Michael Paquier
On Fri, Nov 3, 2017 at 3:19 AM, Craig Ringer  wrote:
> This is probably off topic for pgsql-hackers.
>
> For password crypto please go read the SCRAM thread and the PostgreSQL
> 10 release notes.

The SCRAM discussion is spread across two threads mainly with hundreds
of emails, which may discourage even the bravest. Here are links to
the important documentation:
https://www.postgresql.org/docs/current/static/auth-methods.html#auth-password
https://www.postgresql.org/docs/10/static/sasl-authentication.html

And PostgreSQL implements SCRAM-SHA-256 following RFCs 7677 and 5802:
https://tools.ietf.org/html/rfc5802
https://tools.ietf.org/html/rfc7677
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG 10 - Trouble with overlap of range partition of two dimensions

2017-10-31 Thread Michael Paquier
On Tue, Oct 31, 2017 at 8:36 PM, Stephen Froehlich
 wrote:
> CREATE TABLE lotsa_data_20171027_src3 PARTITION OF lotsa_data
> FOR VALUES FROM ('2017-10-26 18:00:00-06', 3) TO ('2017-10-27
> 17:59:59.999-06', 3);
> ERROR: partition " lotsa_data_20171027_src1" would overlap partition
> "lotsa_data_20171027_src3"
>
> Why am I getting this error?

The answer is in the documentation:
https://www.postgresql.org/docs/devel/static/sql-createtable.html
The rules of such partitions is bound to row-wise comparisons. "For
example, given PARTITION BY RANGE (x,y), a partition bound FROM (1, 2)
TO (3, 4) allows x=1 with any y>=2, x=2 with any non-null y, and x=3
with any y<4."

So in your example and by comparing with the docs, the first partition
allows start_time = '2017-10-26 18:00:00-06' with source_no >= 1, and
the second partition allows source_no >= 3 which overlaps with the
first one.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backup strategy using 'wal_keep_segments'

2017-10-31 Thread Michael Paquier
On Tue, Oct 31, 2017 at 9:53 AM, Rhhh Lin  wrote:
> I would actually be an advocate for using a proper archive_command in order
> to facilitate a proper (Per the documentation) PITR and backup strategy.

You should avoid using your own fancy archive command. There are
things that WAL-E for this purpose, and even if you do make sure that
the segment data is flushed to disk before sending back 0 as exit code
to the client.

> However, a colleague had suggested such a creative approach (Possibly, less
> administrative overhead, ease of maintenance? I'm not sure) and I struggled
> to find any evidence online in blogs/white-papers/documentation that this
> was a feasible approach. That said, I couldn't find any info rejecting it as
> a method either, which led me to post here. Essentially, this was a
> difference of opinion on approach, and I was looking to gather information
> in order to make an informed opposing argument.

Backup methods should not be fancy, just reliable.

> My only thought now would be how could I decipher, within the sequence chain
> of WAL files, up to which file has the "archival" progressed to. i.e. which
> files are not susceptible to being called upon again for
> restartpoints/checkpoints. That is, where is my absolute point (or file) of
> archival using something along the lines of 'pg_current_xlog_location'.

There is the system view called pg_stat_archiver for that. If you are
using something like pg_receivexlog, pg_stat_replication is the way to
go to monitor the archiving progress.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] starting PG command line options vs postgresql.con

2017-10-30 Thread Michael Paquier
On Mon, Oct 30, 2017 at 2:08 PM, David G. Johnston
 wrote:
> On Mon, Oct 30, 2017 at 6:48 AM, rakeshkumar464 
> wrote:
>>
>> I would prefer using postgresql.conf.  what is the consensus in this forum
>> regarding command line vs postgresql.conf.
>
> I suspect that most people administering a PostgreSQL database would expect
> that the configuration file would be changed in lieu of passing options via
> the command line.

Disagreement here. For one, it makes pg_upgrade more complicated
because it would need to track and then rewrite postgresql.conf, or
just copy it temporarily. The current way of doing things gives the
best of both worlds.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can't build ODBC -- odbc_config not found

2017-10-27 Thread Michael Paquier
On Thu, Oct 26, 2017 at 11:40 PM, Chris Albertson
 wrote:
> I just tried to build Postgres ODBC from source (psqlodbc-10.00..tar.gz)
>
> I type "./configure"
>
> Then get this message:
> configure: error: odbc_config not found (required for unixODBC build)
>
> So it must be looking for a file called "odic_config"?  It's not 100%
> clear what is needed.

Are you sure that you installed unixodbc? Normally an install of this
package, be it on Linux or macos should install this command as well.
You can enforce a path to it using ODBC_CONFIG when launching
./configure.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Allow only certain query on replication slave

2017-10-24 Thread Michael Paquier
On Tue, Oct 24, 2017 at 6:12 AM, basti  wrote:
> I have a Postgres slave (wal replication) and want no query on it, expect
>
> - SELECT pg_last_xlog_receive_location()
> - SELECT pg_last_xlog_replay_location()
>
> When I set hot_standby = off in postgres.conf nobody can query the
> replication cluster.
>
> Is there a way to allow only the above query and deny all other?

There is a trick you could use here with two hooks: the planner hook
and the utility hook. The idea is to filter all queries and refuse
them when RecoveryInProgress() is satisfied. The parsed tree can be
used in the planner hook to check for those functions and accept them.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres 9.6 fails to start on VMWare

2017-10-22 Thread Michael Paquier
On Sun, Oct 22, 2017 at 11:13 PM, Martin Moore  wrote:
> I’ve migrated a running Debian Jessie system from a Google Compute instance 
> to a VMWare ESXi 6.5 system.

How did you actually do this migration? It is really easy to finish
with a corrupted instance if not doing things correctly in this world
(I am referring to quiesced snapshot & co).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backup strategy using 'wal_keep_segments'

2017-10-22 Thread Michael Paquier
On Mon, Oct 23, 2017 at 5:57 AM, Rhhh Lin  wrote:
> Is this approach feasible? Assuming obviously, we have sufficient disk space
> to facilitate 1000 WAL files etc.

You expose yourself to race conditions with such methods if a
checkpoint has the bad idea to recycle past segments that your logic
is copying. So I would advise to not do that. Instead of using the
archive command, you should also consider using pg_receivexlog
combined with a replication slot. This brings way more control with
the error handling.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird performance difference

2017-10-22 Thread Michael Paquier
On Sat, Oct 21, 2017 at 9:38 AM, Tom Lane  wrote:
> Also try explicitly ANALYZE'ing the foreign tables.  I do not
> believe auto-analyze will touch foreign tables ...

Autovacuum and autoanalyze only process relations and matviews,
discarding the rest when scanning pg_class. See do_autovacuum().
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] could not fdatasync log file: Input/output error

2017-10-17 Thread Michael Paquier
On Wed, Oct 18, 2017 at 8:02 AM, said assemlal  wrote:
> Thanks for your response.
>
> We are currently running postgresql-9.4.14
> I see there are some tools to check if the indexes/pages are not corrupted.
> But is there a faster way to check if a PGDATA instance is clean ?

Yes, there is something you could try. Peter Geoghegan (actually a
colleague) has been working on amcheck, which is aimed at checking
corrupted heap and btree pages so as you can detect corruptions at
https://github.com/petergeoghegan/amcheck. The master branch is able
to perform checks only on btree indexes (integrated in PG 10), and
Peter has been playing lately with heap checks in the branch
heap-check. The utility can be built using Postgres 9.4 and is
non-intrusive. I have not tested that much myself yet, but you could
run amcheck on this instance, *after* of course taking a cold copy of
the data folder and starting it on a safer host that you think has
non-busted disks.

Note that Peter has also worked on provising Debian packages for the
utility down to 9.4 if I recall correctly, which is nice, but if you
want the heap checks you will need to compile things by youself. We
are currently under way to get something improved in Postgres 11. I
should actually spare some time to look more at the patch concepts..
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] wal_retrieve_retry_interval

2017-10-16 Thread Michael Paquier
On Mon, May 29, 2017 at 3:58 PM, Ludovic Vaugeois-Pepin
 wrote:
> < 2017-05-29 22:42:17.026 CEST > DEBUG:  switched WAL source from
> archive to stream after failure
> ... (15 seconds later) ...
> < 2017-05-29 22:42:32.042 CEST > DEBUG:  switched WAL source from
> stream to archive after failure

Ludovic, is there some event happening between those two logs?
Something that could explain a longer delay is the time it takes for a
WAL receiver to be considered as started (see WALRCV_STARTUP_TIMEOUT).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] could not fdatasync log file: Input/output error

2017-10-16 Thread Michael Paquier
On Mon, Oct 16, 2017 at 11:47 PM, said assemlal  wrote:
> Just before we restart the server today, I found only one line as:
>
> PANIC:  could not fdatasync log file 000101760083: Input/output
> error
> the database system is in recovery mode

Ouch. I would not trust this host at this point, this looks like a
file system or a disk issue. Before doing anything you should stop the
database, and make a cold copy of the data folder on which you could
work on if you don't have a live backup. This wiki page is wise on the
matter:
http://wiki.postgresql.org/wiki/Corruption
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Using cp to back up a database?

2017-10-09 Thread Michael Paquier
On Mon, Oct 9, 2017 at 11:09 PM, Achilleas Mantzios
 wrote:
> On 09/10/2017 16:51, Larry Rosenman wrote:
>
> If you want a consistent database (you *REALLY* do), pg_dump is the correct
> tool.
>
> In all pg conferences I have been, ppl scream : do not use pg_dump for
> backups :)

It depends on what you are trying to achieve, pg_dump can be fine for
small-ish databases. By relying on both logical (pg_dump) and physical
backups (base backups) brings more insurance in face of a disaster.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regarding scram authentication libpq version 10 or more

2017-10-06 Thread Michael Paquier
On Fri, Oct 6, 2017 at 8:56 PM, Durgamahesh Manne
 wrote:
> i have already read complete info about New Postgres 10 in postgres.org  and
> i have tried to access server  by adding the
> scram-sha-256 authentication in hba.conf and also have already set
> password_encyption = scram-sha-256 in config then reloaded the server for
> multiple times
>
> still i am getting same error i given you earlier while i check to access pg
> server with required credentials
>
> psql.bin: SCRAM authentication requires libpq version 10 or above

No idea. You may be using multiple versions of PostgreSQL in parallel,
and the client you are using may not be the client you think it is. I
suggest that you check the infrastructure of your host as well as the
package repository you are using. Good luck.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regarding scram authentication libpq version 10 or more

2017-10-06 Thread Michael Paquier
On Fri, Oct 6, 2017 at 5:44 PM, Durgamahesh Manne
 wrote:
> This is regarding scram authentication libpq version 10. From which site  i
> can download to configure libpq in order to use scram authentication in
> postgres 10
>
> i got below error while i check to access pg server
>
> psql.bin: SCRAM authentication requires libpq version 10 or more
>
> please let me know the detail info about scram libpq

Why not beginning to look at the download page of postgresql.org? Here
is the page:
https://www.postgresql.org/download/
SCRAM/SASL support in libpq is provided in any build of PostgreSQL 10,
so this depends mainly on the distribution and environment you are
using.

In order to get support for SCRAM, the documentation provides enough
details, beginning here:
https://www.postgresql.org/docs/10/static/auth-methods.html#auth-password
Basically you need to consider using password_encryption =
'scram-sha-256', and configure pg_hba.conf with a correct entry. If
you are upgrading from an existing instance, you need to make sure
that users with passwords are updated with proper SCRAM-hashed
entries.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind copy so much data

2017-09-29 Thread Michael Paquier
On Fri, Sep 29, 2017 at 6:22 PM, Hung Phan  wrote:
> Thanks for your help. Do you have any more ideas about my case? I cannot
> find any reason for that. If pg_rewind just examines WAL for changed blocks
> from the common checkpoint, why did it copy all data ? As I understand, the
> information in WAL about changes cannot cover all data in the database.

The only thing I have in mind able to create this much amount of data
using this less WAL is a CREATE DATABASE using as template an existing
database. Based on the information you are giving here this is the
best guess I can do.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind copy so much data

2017-09-28 Thread Michael Paquier
On Fri, Sep 29, 2017 at 1:06 PM, Hung Phan <hungphan...@gmail.com> wrote:
> I used tablespace to store data and it seems to be that pg_rewind copied
> everthing in the tablespace. Today I found an article posted by you (Michael
> Paquier) and you said that there was no tablespace support. If so, is there
> anyway to work around ?

I guess you mean that article:
http://paquier.xyz/postgresql-2/postgres-module-highlight-pg_rewind-to-recycle-a-postgres-master-into-a-slave/
This is from 2013 and this refers to the first iterations of the tool.
Tablespaces are now fully supported in the upstream version, as well
as in the version on github for 9.3 and 9.4. See this commit which
added support for tablespaces:
https://github.com/vmware/pg_rewind/commit/19fb09cdcac397048f7d723c037fe6a10299a278

FWIW, I use it daily, and pg_rewind is able to detect correctly page
deltas even on non-default tablespaces. Really.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Logical decoding client has the power to crash the server

2017-09-21 Thread Michael Paquier
On Fri, Sep 22, 2017 at 5:44 AM, Igor Neyman  wrote:
> I think the difference between pg_current_wal_lsn() and confirmed_flush_lsn 
> form pg_catalog.pg_replication_slots for specific replication slot:
>
> SELECT (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance
>FROM pg_catalog.pg_replication_slots
>WHERE slot_name = '';
>
>  provides a measure in Logical Replication environment of how far did (or did 
> not) Subscriber fell behind Publisher, and hence some kind of measure of how 
> much "extra" WALs is stored on the Publisher.

More or less. I also make use of the value of max_wal_size in
pg_settings to do the decision-making, value set depending on a
fraction of the size of the partition dedicated to pg_xlog. By
applying pg_size_bytes() which is new to 9.6 on top of the value
fetched from pg_settings you can make a direct comparison and decide
if a slot can be dropped or not. Make sure that things are casted to
bigint though.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 10 beta 4 foreign table partition check constraint broken?

2017-09-21 Thread Michael Paquier
On Fri, Sep 15, 2017 at 10:43 PM, Paul Jones  wrote:
> Is this a bug in Postgres 10b4?  Looks like neither partition ranges
> nor check constraints are honored in 10b4 when inserting into
> partitions that are foreign tables.

Here is what you are looking for in the documentation:
https://www.postgresql.org/docs/10/static/sql-createforeigntable.html
Constraints on foreign tables (such as CHECK or NOT NULL clauses) are
not enforced by the core PostgreSQL system, and most foreign data
wrappers do not attempt to enforce them either; that is, the
constraint is simply assumed to hold true. There would be little point
in such enforcement since it would only apply to rows inserted or
updated via the foreign table, and not to rows modified by other
means, such as directly on the remote server. Instead, a constraint
attached to a foreign table should represent a constraint that is
being enforced by the remote server.

> Here is a nearly shovel-ready example.  Just replace with your
> servers/passwords.
>
> -- --
> -- Server 2
> -- --
>
> CREATE DATABASE cluster;
> \c cluster
>
> CREATE TABLE foo_1 (
> id  INT NOT NULL,
> nameTEXT
> );

So here I think that you should add a CHECK constraint to this table,
and that the behavior of your example works as expected.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Logical decoding client has the power to crash the server

2017-09-20 Thread Michael Paquier
On Thu, Sep 21, 2017 at 1:09 PM, Meel Velliste  wrote:
> In this situation, neither us, nor our customer has the power to install the
> required monitoring of pg_xlog. The database hosting provider would have to
> do it. In most cases (e.g. Amazon RDS) the hosting provider does provide a
> way of monitoring overall disk usage, which may be good enough. But I am
> thinking it would make sense for postgres to have default, built-in
> monitoring that drops all the slots when pg_xlog gets too full (based on
> some configurable limit). Otherwise everybody has to build their own
> monitoring and I imagine 99% of them would want the same behavior. Nobody
> wants their database to fail just because some client was not reading the
> slot.

(Please avoid top-posting, this breaks the logic of the thread and
this is contrary to the practices of the Postgres mailing lists)

Note that on-disk lookup is not strictly necessary. If you know
max_wal_size, pg_current_wal_lsn (or pg_last_wal_receive_lsn if
working on a standby) and the restart_lsn of the slots that's enough.
If you don't have privileges sufficient to see that, well I guess that
you will need to review the access permissions to your instance.
Postgres 9.6 offers better access control to system functions, so you
could be granted access to just those resources to be fine using a SQL
session.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] VM-Ware Backup of VM safe?

2017-09-20 Thread Michael Paquier
On Wed, Sep 20, 2017 at 5:45 PM, Albe Laurenz  wrote:
> Thomas Güttler wrote:
>> We run a PostgreSQL 9.6 server in a virtual machine.
>>
>> The virtual machine is managed by the customer.
>>
>> He does backup the VM.
>>
>> Is this enough, is this safe?
>
> I don't know about VMware,

I heard about one or two things :)

> but the general rule is that
> if the backup is truly atomic (it is guaranteed to capture
> a consistent state of the file system), you can use it
> to backup the database.

There are two types of snapshots: quiesced and non-quiesced. You
definitely want a quiesced snapshot when taking a backup so as the
system gets into a consistent state when working on it. There should
be an option related to that on the vSphere client managing the VM, so
make sure that quiesced is enabled.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Logical decoding client has the power to crash the server

2017-09-20 Thread Michael Paquier
On Wed, Sep 20, 2017 at 3:14 PM, Meel Velliste  wrote:
> From what I understand about logical decoding, there is no limit to how many
> log entries will be retained by the server if nobody reads them from the
> logical slot. This means that a client that fails to read from the slot has
> the power to bring down the master database because the server's disk will
> get full at which point all subsequent write operations will fail and even
> read operations will fail because they too need temporary space. Even the
> underlying operating system may be affected as it too may need temporary
> disk space to carry out its basic functions.

Monitoring is a mandatory part of the handling of replication slots.
One possible solution is to use a background worker that scans slots
causing bloat in pg_xlog and to automatically get rid of them so as
the primary is preserved from any crash. Note that advancing a slot is
doable for a physical slot, but advancing a logical slot is trickier
(not sure if that's doable actually but Andres can comment on that)
because it involves being sure that the catalog_xmin is still
preserved so as past logical changes can be looked at consistently.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind issue

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 1:36 PM, James Sewell  wrote:
> When Iook at timeline 15 I see this:
>
> 20  C74/4500no recovery target specified+

This refers to timeline 20. Are there other entries in this history
file? Isn't timeline 15 a direct parent of timeline 20?

> As 00150C7300A9 > 00150C740044  I'm confused why
> this WAL would be required? It's never been created on either server
> (although the same suffix does exist in timeline 14).

Your math is incorrect here. For me 00150C7300A9 <
00150C740044, C73 being generated *before* C74.

> The contents of this email are confidential and may be subject to legal or
> professional privilege and copyright. No representation is made that this
> email is free of viruses or other defects. If you have received this
> communication in error, you may not copy or distribute any part of it or
> otherwise disclose its contents to anyone. Please advise the sender of your
> incorrect receipt of this correspondence.

The contents of this mailing list are public.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] "Canceling authentication due to timeout" with idle transaction and reindex

2017-09-15 Thread Michael Paquier
On Fri, Sep 15, 2017 at 7:25 PM, s19n  wrote:
> Is this expected? I am failing to see the relation between an idle
> transaction in the 'postgres' database, a reindex operation and subsequent
> logins.

REINDEX DATABASE processes as well system indexes, and takes an
exclusive lock on them in order to process. The lock being hold by the
transaction of session 1 conflicts by what REINDEX tries to take, and
REINDEX is able to process only when the index is free from any
lookups. The reason why logins are not possible is this was likely
waiting for a lock of an index of pg_authid which is looked up at
authentication.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind copy so much data

2017-09-15 Thread Michael Paquier
On Fri, Sep 15, 2017 at 2:57 PM, Hung Phan  wrote:
> [...]

Please do not top-post. This breaks the logic of the thread.

> I use ver 9.5.3.

You should update to the latest minor version available, there have
been quite a couple of bug fixes in Postgres since this 9.5.3.

> I have just run again and get the debug log. It is very long so I attach in 
> mail
In this case the LSN where the promoted standby and the rewound node
diverged is clear:
servers diverged at WAL position 2/D69820C8 on timeline 12
rewinding from last common checkpoint at 2/D6982058 on timeline 12
The last segment on timeline 13 is 000D000200E0, which may
be a recycled segment, still that's up to 160MB worth of data...

And from what I can see a lot of the data comes from WAL segments from
past timelines, close to 1.3GB. The rest is more or less completely
coming from relation files from a different tablespace than the
default, tables with OID 16665 and 16683 covering the largest part of
it. What is strange to begin with is that there are many segments from
past timelines. Those should not stick around.

Could you check if the relfilenodes of 16665 and 16683 exist on source
server but do *not* exist on the target server? When issuing a rewind,
a relation file that exists on both has no action taken on (see
process_source_file in filemap.c), and only a set of block are
registered. Based on what comes from your log file, the file is being
copied from the source to the target, not its blocks:
pg_tblspc/16386/PG_9.5_201510051/16387/16665 (COPY)
pg_tblspc/16386/PG_9.5_201510051/16387/16665.1 (COPY)
pg_tblspc/16386/PG_9.5_201510051/16387/16665_fsm (COPY)
And this leads to an increase of the data included in what is rewound.
So aren't you for example re-creating a new database after the standy
is promoted or something like that?
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] looking for a globally unique row ID

2017-09-14 Thread Michael Paquier
On Thu, Sep 14, 2017 at 4:45 PM, Rafal Pietrak  wrote:
> Can anybody help me find a way to implement an ID which:
>
> 1. guarantees being unique across multiple tables.
>
> 2. guarantees its uniqueness not only during INSERT, but also during the
> lifetime of the database/application (e.i. during future UPDATES).
>
> 3. guarantees persistence of value across database backup/restore/upgrade.
> [...]
> I'd appreciate any hints, as I'm bitting my head against this problem
> for a long time, now.

UUID would give you some room for requirement 1. 2 and 3 can be found
with the usage of a serial column.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres DB is failed due to pg_Xlog is continues full.

2017-09-13 Thread Michael Paquier
On Thu, Sep 14, 2017 at 12:44 PM, John R Pierce  wrote:
> prior to that error, something else catastrophic must have happened to the
> system, that error is more of a side effect. recovering a database server
> that far gone which is running such an obsolete version will likely be an
> expensive proposition. before doing anything, you should make a complete
> backup of the $PGDATA directory (and other tablespace directories, if you
> use any).

Definitely take a backup of PGDATA before doing anything! What you
could do is copying its contents to a large disk, and then allow it to
recover from the crash. You are going to need more space at the end.
And yes, upgrade as well. Lagging 7 major releases behind cannot be an
excuse.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind copy so much data

2017-09-13 Thread Michael Paquier
On Wed, Sep 13, 2017 at 2:21 PM, Hung Phan  wrote:
> If I run pg_rewind with debug option, it just show additional bunch of files
> copied in directories like base or pg_tblspc. I claim that there is no data
> inserted of modified from the first step. The only difference between two
> server is caused by restarting old master.

That's exactly this kind of data that would allow to conclude where
the data comes from, and you are not showing it, so it is not possible
to make a clear conclusion. Are known the file paths, the offset
position in the file where a chunk of data is taken, and the chunk
size. If you see for example a bunch of 8kB blocks being copied, this
means that some extra load ran on the master before it was stopped,
and before running the rewind. Are you sure that you don't have large
log files for example?

I would be curious to know what is the version of Postgres 9.5 you are
using as well.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind copy so much data

2017-09-12 Thread Michael Paquier
On Wed, Sep 13, 2017 at 12:41 PM, Hung Phan  wrote:
> I have tested pg_rewind (ver 9.5) with the following scenario:
>
> - one master and one slave (total size of each server is more than 4GB)
> - set wal_log_hint=on and restart both
> - stop master, promote slave
> - start old master again (now two servers have diverged)
> - stop old master, run pg_rewind with progress option

That's a good flow. Don't forget to run a manual checkpoint after
promotion to update the control file of the promoted standby so as
pg_rewind is able to identify the timeline difference between the
source and the target servers.

> The pg_rewind ran successfully but I saw it copied more than 4GB (4265891 kB
> copied). So I wonder there was very minor difference between two servers but
> why did pg_rewind copy almost all data of new master?

Without knowing exactly the list of things that have been registered
as things to copy from the active source to the target, it is hard to
give a conclusion. But my bet here is that you let the target server
online long enough that it had a bunch of block updated, causing more
relation blocks to be copied from the source because more efforts
would be needed to re-sync it. That's only an assumption without data
with clear numbers, numbers that could be found using the --debug
messages of pg_rewind.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] WAL & ready files retained after turning off log shipping

2017-09-12 Thread Michael Paquier
On Tue, Sep 12, 2017 at 11:43 PM, Ron Johnson  wrote:
> On 09/07/2017 09:32 AM, Tom Lane wrote:
>>
>> Ron Johnson  writes:
>>>
>>> On 09/07/2017 09:08 AM, Tom Lane wrote:

 Manual cleanup shouldn't be very hard, fortunately.  Run pg_controldata
 to see where the last checkpoint is, and delete WAL files whose names
 indicate they are before that (but not the one including the
 checkpoint!).

Just noticed. This is actually forgetting that you can only remove
safely WAL segment past the last *prior* checkpoint, not the last
checkpoint. So you need to keep WAL segments worth of two completed
checkpoints.

>>> All WAL files after log shipping was stopped will keep accumulating
>>> "forever"?
>>
>> Hmm ... on second thought, I think if you just remove the .ready/.done
>> files, the next checkpoint should clean up the old WAL files.  That'd
>> certainly be safer than doing it manually.
>
> This weekend, (early Sunday morning) WAL files on the master started
> accumulating again.  Now, .ready files are regenerated every time I delete
> them, even though according to pg_controldate the last checkpoint was 28
> minutes ago.

My general advice here would be: do not mess up with the contents of
the data folder while Postgres is running, you will never do that
right. What is your archiving command telling then? If those .ready
files are here, it means that you are not able to archive correctly
segments. It seems to me that at the end you should try to just set
archive_command = '/bin/true', this would solve all your problems, and
trick the server correctly...
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] contrecord is requested

2017-09-11 Thread Michael Paquier
On Tue, Sep 12, 2017 at 5:27 AM, Scott Marlowe  wrote:
> So we have a db we're trying to rewind and get synced to the master.
> pg_rewind says it doesn't need rewinding, and when we try to bring it
> up, it gets this error:
>
> "contrecord is requested by 2E7/4028"
>
> And fails to get back up.

The consistency recovery point defined by pg_rewind was not reached then.

> Is this a known issue?

Never heard of this one.

> Possible bug in the continuation record code?

That looks like a possibility. This complains is about the fact that
the XLOG page header cannot be found where it should. You may want to
check your WAL segments.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind issue

2017-09-07 Thread Michael Paquier
On Thu, Sep 7, 2017 at 9:06 PM, James Sewell  wrote:
> A client has have been having problems with pg_rewind.
>
> They have two PostgreSQL (Oracle Enterprise Linux 7, 9.6.4) nodes in
> streaming replication and  follow these steps:
>
> 1. Stop the master
> 2. Promote the standby
> 3. After successful failover wait some time (a lot of data is written)
> 4. Issue a checkpoint on the new master
> 5. Issue a pg_rewind on the old master
> 6. Start up the old master with a recovery.conf pointing at the new master.

How are you stopping the primary at step 1)? If you stopped it
cleanly, then it had the occasion to send to the standby the WAL
record corresponding to the shutdown checkpoint, so at step 2) the
standby would use a WAL history that has not forked from the primary
when it is promoted, so the ex-primary could be reused as-is as a
standby. Hence step 5 would not be necessary.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] WAL & ready files retained after turning off log shipping

2017-09-07 Thread Michael Paquier
On Thu, Sep 7, 2017 at 11:08 PM, Tom Lane  wrote:
> Manual cleanup shouldn't be very hard, fortunately.  Run pg_controldata
> to see where the last checkpoint is, and delete WAL files whose names
> indicate they are before that (but not the one including the checkpoint!).
> If you don't intend to do archiving any more, you can just flush all the
> .ready files (and .done if any) without much thought.

It would be less risky to do that as a two-time move:
- First change archive_command to /sbin/true and let all archives be
switched to .done.
- And then disable archive_mode.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] CREATE TABLE LIKE including all not including storage parameters?

2017-09-05 Thread Michael Paquier
On Tue, Sep 5, 2017 at 6:45 PM, Achilleas Mantzios
 wrote:
> Am I doing something wrong here?

>From the documentation:
https://www.postgresql.org/docs/devel/static/sql-createtable.html
STORAGE settings for the copied column definitions will be copied only
if INCLUDING STORAGE is specified. The default behavior is to exclude
STORAGE settings, resulting in the copied columns in the new table
having type-specific default settings. For more on STORAGE settings,
see Section 66.2.

And in this case storage parameters refer to column-specific settings,
not table-level storage parameters, which are defined here by toast:
https://www.postgresql.org/docs/devel/static/storage-toast.html
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] RemoveIPC problem

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 12:10 PM, scott ribe  wrote:
> Yeah, I was kind of thinking that PG detects the semaphore not existing, 
> bails immediately, restarts clean, thus no problem. I just wanted to hear 
> from people, like you, that know way more than I do about the internals.

As long as you don't run Postgres on scissors with things like fsync =
off or full_page_writes = off, there should be no risk with the data
consistency.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table create time

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 12:20 AM, Melvin Davidson  wrote:
> >you could just create an event trigger looking for CREATE TABLE as 
> >filter_value:
>
> I have tried that. Unfortunately, I have been unable to extract the table 
> name from the event because TG_TABLE_NAME is not
> available during an event trigger, albeit perhaps I am missing something?

The function pg_event_trigger_ddl_commands() returns classid and
objid, which should map to respectively pg_class and the relation
created for a CREATE TABLE query, no?
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table create time

2017-08-31 Thread Michael Paquier
On Thu, Aug 31, 2017 at 10:21 PM, Melvin Davidson  wrote:
> Wolfgang, as David said, a column in pg_class for the creation time of a 
> table does not exist. I long ago requested that feature as it is
> in other DB's (Oracle & MS SQL Server), but the main reason that it was not 
> done was that no one was interested in doing it.

Is there any need for a column in pg_class for that? You could just
create an event trigger looking for CREATE TABLE as filter_value:
https://www.postgresql.org/docs/9.6/static/sql-createeventtrigger.html
And then have this event trigger just save the timestamp value of
now() in a custom table with the name and/or OID of the relation
involved.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unlogged Crash Detection

2017-08-29 Thread Michael Paquier
On Tue, Aug 29, 2017 at 11:09 PM, Andres Freund  wrote:
> Huh, but that's not particularly meaningful, is it? That'll just as well
> be the case for a freshly created relation, no?

I have assumed that the OP has some control on the timing of the
relations, using an event trigger for example. There is no perfect
method I am afraid.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unlogged Crash Detection

2017-08-29 Thread Michael Paquier
On Tue, Aug 29, 2017 at 6:06 PM, Gersner  wrote:
> I see, interesting.

Please do not top-post. This is not the recommended way of dealing
with threads on this mailing list.

> We have lots of unlogged tables, upon a crash we want to create a
> feedback/alert that data disappeared.
>
> Not very familiar with the internal structure, but is it possible to
> identify if the current table is the INIT_FORKNUM?

Using pg_relation_filepath, you can know the path to a relation file
on disk. So a simple idea would be to use pg_read_binary_file with the
path of the file and the path of the init fork, which is suffixed with
"_init", and then a comparison between both. If the data read is the
same, the relation has been untouched. Note that you can only do such
a thing as a superuser, and that the data is read from disk, not from
shared buffers. So that's not perfect, but it give an indication.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unlogged Crash Detection

2017-08-29 Thread Michael Paquier
On Tue, Aug 29, 2017 at 5:17 PM, Gersner  wrote:
> Is there a reliable way to distinguish between an empty unlogged table to an
> unlogged table which has been truncated due to a crash?

Why do you want to make such a difference? At the beginning of a crash
recovery all the, the existing relation files of unlogged tables are
all removed, and replaced by the init forknum which represents their
initial state. You can see by yourself ResetUnloggedRelations &
friends in reinit.c.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Extension coverage

2017-08-28 Thread Michael Paquier
On Sat, Aug 26, 2017 at 6:28 PM, Gabriel Furstenheim Milerud
 wrote:
> Not sure I follow. Do you have an example that I could check?
> I have the impression that my problem is that no .gcda files are created. If
> I just run the lcov part:
> lcov -d . -c -o lcov.info
>
> I get
>Found gcov version: 5.4.0
>Scanning . for .gcda files ...
>geninfo: WARNING: no .gcda files found in . - skipping!
>Finished .info-file creation
>
> and an empty file lcov.info is created.
>
> Just to check, I've tried running make coverage and make coverage-html on
> the folder of postgres instead of the extension and that works.

There is no need to complicate your Makefile with a custom coverage
target, which is, at least it seems to me, the origin of the problems
you are seeing here. So you could just use the one that Postgres' PGXS
provides. Here is a simple Makefile I have used for coverage testing
with an extension:
https://github.com/michaelpq/pg_plugins/blob/master/decoder_raw/Makefile
If you enforce abs_top_srcdir=$(pwd) with make coverage, or
coverage-html if you want, then both are able to work properly. At
least for me they do.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Retrieving query results

2017-08-28 Thread Michael Paquier
On Sun, Aug 27, 2017 at 12:12 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paqu...@gmail.com> writes:
>> On Fri, Aug 25, 2017 at 8:10 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> I think the real problem occurs where we realloc the array bigger.
>
>> Looking at the surroundings, I think that it would be nice to have
>> pqAddTuple and PQsetvalue set an error message with this patch.
>
> Yeah, I was thinking about that myself - the existing design presumes
> that the only possible reason for failure of pqAddTuple is OOM, but
> it'd be better to distinguish "too many tuples" from true OOM.  So
> we should also refactor to make pqAddTuple responsible for setting
> the error message.  Might be best to do the refactoring first.

Attached are two patches:
1) 0001 refactors the code around pqAddTuple to be able to handle
error messages and assign them in PQsetvalue particularly.
2) 0002 adds sanity checks in pqAddTuple for overflows, maximizing the
size of what is allocated to INT_MAX but now more.

pqRowProcessor() still has errmsgp, but it is never used on HEAD. At
least with this set of patches it comes to be useful. We could rework
check_field_number() to use as well an error message string, but I
have left that out to keep things simple. Not sure if any complication
is worth compared to just copying the error message in case of an
unmatching column number.

Attached is as well a small program I have used to test PQsetvalue
through PQcopyResult to see if results get correctly allocated at each
call, looking at the error message stacks on the way.
-- 
Michael
/*
 * Script to test PQcopyResult and subsequently PQsetvalue.
 * Compile with for example:
 * gcc -lpq -g -o pg_copy_res pg_copy_res.c
 */

#include 
#include 
#include "libpq-fe.h"

#define DEFAULT_PORT	"5432"
#define DEFAULT_HOST	"/tmp"
#define DEFAULT_DB		"postgres"

int
main()
{
	char *port = getenv("PGPORT");
	char *host = getenv("PGHOST");
	char *dbname = getenv("PGDATABASE");
	char connstr[512];
	PGconn *conn;
	PGresult *res, *res_copy;

	if (port == NULL)
		port = DEFAULT_PORT;
	if (host == NULL)
		host = DEFAULT_HOST;
	if (dbname == NULL)
		dbname = DEFAULT_DB;

	snprintf(connstr, sizeof(connstr), "port=%s host=%s dbname=%s",
			 port, host, dbname);

	conn = PQconnectdb(connstr);

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "Connection to database failed: %s",
PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "SELECT 1");

	/* Copy the resuld wanted, who care what that is... */
	res_copy = PQcopyResult(res, PG_COPYRES_TUPLES | PG_COPYRES_ATTRS);

	PQclear(res);
	PQclear(res_copy);

	PQfinish(conn);
	return 0;
}


0001-Refactor-error-message-handling-in-pqAddTuple.patch
Description: Binary data


0002-Improve-overflow-checks-of-pqAddTuple-in-libpq.patch
Description: Binary data

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Retrieving query results

2017-08-26 Thread Michael Paquier
On Fri, Aug 25, 2017 at 8:10 AM, Tom Lane  wrote:
> I think the real problem occurs where we realloc the array bigger.
> tupArrSize needs to be kept to no more than INT_MAX --- and, ideally,
> it should reach that value rather than dying on the iteration after
> it reaches 2^30 (so that we support resultsets as large as we possibly
> can).  Without a range-check, it's not very clear what realloc will think
> it's being asked for.  Also, on 32-bit machines, we could overflow size_t
> before tupArrSize even gets that big, so a test against
> SIZE_MAX/sizeof(pointer) may be needed as well.
>
> As long as we constrain tupArrSize to be within bounds, we don't
> have to worry about overflow of ntups per se.

I just poked more seriously at this code, and we could use something like that:
@@ -868,6 +868,16 @@ pqAddTuple(PGresult *res, PGresAttValue *tup)
int newSize = (res->tupArrSize > 0) ? res->tupArrSize * 2 : 128;
PGresAttValue **newTuples;

+   if (res->tupArrSize == INT_MAX)
+   return FALSE;
+   if (new_size == INT_MIN)
+   new_size = INT_MAX;
+   if (newSize > SIZE_MAX / sizeof(PGresAttValue *))
+   return FALSE;

Looking at the surroundings, I think that it would be nice to have
pqAddTuple and PQsetvalue set an error message with this patch. The
user can see now that those would only properly report on OOM, but if
we add more types of errors proper error messages would be nice for
users.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Extension coverage

2017-08-25 Thread Michael Paquier
On Sat, Aug 26, 2017 at 2:59 AM, Gabriel Furstenheim Milerud
 wrote:
> The I built my extension against this postgres server
>
>$ cd $EXTENSION_SOURCE_DIR
>$ make install && make installcheck
>
> the extension is built fine and the test run. However, I see no difference
> with a normal installation without the flag.
>
> Then I run the following target
> (https://stackoverflow.com/questions/31417900/how-to-run-coverage-report-on-a-postgres-extension)
>coverage:
>lcov --no-external -d .  -c -o lcov.info
>genhtml --show-details --legend --output-directory=coverage
> --title=PostgreSQL --num-spaces=4 --prefix=./src/ `find . -name lcov.info
> -print`
>
> But it complains that there are no .gcda files
> geninfo: WARNING: no .gcda files found in . - skipping!
> and it errors.
>genhtml: ERROR: no valid records found in tracefile ./lcov.info
>
> Is this the way to measure coverage on an extension? Am I missing some step?

This error is a bit weird, what you are showing here has nothing
wrong. I do work with this coverage target and postgres extensions,
and the following steps allow me to generate a text report of coverage
when compiling any extension with a Postgres build having
--enable-coverage:
make install
make installcheck
make coverage
Trying to use coverage-html from an external extension leads to an error:
$ make coverage-html
rm -rf coverage
mkdir coverage
/home/ioltas/extra/bin/genhtml --show-details --legend
--output-directory=coverage --title=PostgreSQL --num-spaces=4
--prefix= `find . -name lcov.info -print`
genhtml: Option prefix requires an argument
Use genhtml --help to get usage information

But this bit is easy enough to fix by enforcing the value of
abs_top_srcdir when running the command "make coverage-html".
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Retrieving query results

2017-08-24 Thread Michael Paquier
On Thu, Aug 24, 2017 at 11:56 PM, Tom Lane  wrote:
> I haven't tried it, but it sure looks like it would, if you don't hit
> OOM first.  pqAddTuple() isn't doing anything to guard against integer
> overflow.  The lack of reports implies that no one has ever tried to
> retrieve even 1G rows, let alone more ...

Yeah, looking at the code we would just need to check if ntups gets
negative (well, equal to INT_MIN) after being incremented.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

2017-08-22 Thread Michael Paquier
On Tue, Aug 22, 2017 at 11:39 PM, Magnus Hagander <mag...@hagander.net> wrote:
> On Tue, Aug 22, 2017 at 3:06 AM, Michael Paquier <michael.paqu...@gmail.com>
> wrote:
>> That flow looks correct to me. No I think that you should trigger
>> manually a checkpoint after step 2 on the promoted standby so as its
>> control file gets forcibly updated correctly with its new timeline
>> number. This is a small but critical point people usually miss. The
>> documentation of pg_rewind does not mention this point when using a
>> live source server, and many people have fallen into this trap up to
>> now... We should really mention that in the docs. What do others
>> think?
>
> If the documentation is missing such a clearly critical step, then I would
> say that's a definite documentation bug and it needs to be fixed. We can't
> really fault people for missing a small detail if we didn't document the
> small detail...

What do you think about the attached? I would recommend a back-patch
down to 9.5 to get the documentation right everywhere but I think as
well that this may not be enough. We could document as well an example
of a full-fledged failover flow in the Notes, in short:
1) Promote a standby.
2) Stop the old master cleanly. If it has been killed atrociously,
make it finish recovery once and then stop it so as its WAL data is
ahead of the point WAL has fork after the promotion (shutdown
checkpoint record is at least here).
3) Prepare source server for the rewind.
3-1) Using file copy, stop the source server (promoted standby) cleanly first.
3-2) Using SQL, issue a checkpoint on the source server to update its
control file and making sure that the timeline number is up-do-date on
disk.
4) Perform the actual rewind. This will need WAL segments on the
target from the point WAL has forked to the shutdown checkpoint record
created at step 2).
5) Create recovery.conf on the target and point it to the source for
streaming, or archives. Then let it perform recovery.
-- 
Michael


rewind-checkpoint-doc.patch
Description: Binary data

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Retrieving query results

2017-08-22 Thread Michael Paquier
On Wed, Aug 23, 2017 at 3:19 AM, Igor Korot  wrote:
> [quote]
> PQntuples
>
> Returns the number of rows (tuples) in the query result. Because it
> returns an integer result, large result sets might overflow the return
> value on 32-bit operating systems.
>
> int PQntuples(const PGresult *res);
>  [/quote]
>
> Is there another way to not to overflow the result?

Not really with the existing API. Note that getting at 2 billion rows
is really a lot, and would cause performance issues on the application
side because a bunch of data would need to be processed, and getting
out this much data is not network-wise anyway.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [BUGS] Fwd: PostgreSQL 9.4.13 is facing issue in shutting down

2017-08-21 Thread Michael Paquier
On Tue, Aug 22, 2017 at 1:58 PM, Abhijit Gharami
 wrote:
> Recently we have updated our PostgreSQL version from 9.4.12 to 9.4.13. With
> version 9.4.13 while trying to stop the PostgreSQL server, database is
> facing issues in shutting down. The problem probably occurs because of the
> WAL sender process not getting terminated on shutdown request.
> ps output:
> postgre+  7915  0.0  1.9 589772 39528 ?S<   09:11   0:00
> //bin/postgres -D /.../postgresql -h 0.0.0.0 -p 5432
> postgre+  8617  0.0  0.4 590760  8344 ?S wal sender process replicator xx.xx.xx.xxx(45948) streaming 0/490

One change introduced between 9.4.12 and 9.4.13 is the way WAL senders
are stopped at shutdown. The checkpointer will first send a signal to
all the WAL senders after all the normal backends have exited before
beginning to issue the checkpoint shutdown. This will switch the WAL
senders to a stopping state where all new queries are blocked, and all
WAL senders exit after they have sent any remaining WAL to the
clients, including the checkpoint shutdown record. So, what is the
client behind this WAL sender? It seems that this WAL sender is
waiting for some confirmation activity.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

2017-08-21 Thread Michael Paquier
On Tue, Aug 22, 2017 at 9:52 AM, Dylan Luong  wrote:
> I have 1 master and 1 slave wal streaming replication setup and the
> Application connects via a load balancer (LTM) where the all connections are
> redirected to the master member (master db).
>
> We have archive_mode enabled.

First things first. What is the version of PostgreSQL involved here?

> I am trying to test to use pg_rewind to restore the new slave (old master)
> after a failover while the system is under load.

Don't worry. pg_rewind works :)

> Here are the steps I take to test:
>
> 1.   Disable the master ltm member (all connections redired to slave
> member)
> 2.   Promote slave (touch promote.me)
> 3.   Stop the master db (old master)
> 4.   Do pg_rewind on the new slave (old master)
> 5.   Start the new slave.

That flow looks correct to me. No I think that you should trigger
manually a checkpoint after step 2 on the promoted standby so as its
control file gets forcibly updated correctly with its new timeline
number. This is a small but critical point people usually miss. The
documentation of pg_rewind does not mention this point when using a
live source server, and many people have fallen into this trap up to
now... We should really mention that in the docs. What do others
think?

> Checking the on the new master, I see that the check point that its trying
> to restore is the file 0004009C006F, but the file does not exist
> anywhere on the new master. Not in the pg_xlog or the archive folder. (as
> specified in the postgresql.conf)

4 is the number of the last timeline the promoted standby has been using, right?

> Please see attached  psql.jpg.
>
> Here is my recovery.conf :
> standby_mode = 'on'
> primary_conninfo = 'host=10.69.19.18  user=replicant’
> trigger_file = '/var/run/promote_me'
> restore_command = 'cp /pg_backup/backup/archive_sync/%f "%p"'
>
> does anyone know why?

What are the contents of /pg_backup/backup/archive_sync/? Are you sure
that the promoted standby has archived correctly the first segment of
its new timeline for example?

> Under what conditions will pg_rewind wont’ work?

Only one WAL segment missing would prevent any base backup or rewound
node to reach a consistent point. You need to be careful about the
contents of your archives. Now a failover done correctly is a tricky
thing, which could likely fail if you don't issue a checkpoint
immediately on the promoted standby if pg_rewind is kicked in the
process before an automatic checkpoint happens (because of timeout or
volume, whichever).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting unwanted wal files

2017-08-17 Thread Michael Paquier
On Fri, Aug 18, 2017 at 12:43 AM, krishna chaitanya
 wrote:
> Thanks for your reply,

Please do not top-post, this is not the style of this mailing list.

> but will pg_basebackup generate a .backup file when
> scheduled in cron job so that i can give that as input to pg_archivecleanup.

Yes.

> Also if i give archive_cleanup_command in recovery.conf will it check the
> presence of recovery.conf file automatically and execute the command from
> that file ?

You can feed a backup history file name to pg_archivecleanup, it will
then reuse the prefix of this file name. archive_cleanup_command is
part of recovery.conf, which gets loaded by the server at the
beginning of recovery by the startup process, so the command will get
executed continuously on a standby.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting unwanted wal files

2017-08-17 Thread Michael Paquier
On Thu, Aug 17, 2017 at 8:06 PM, krish050591  wrote:
> Hi, if i'm using pg_basebackup utility for taking my database backup and also
> enabled wal level archiving, how will i detect the unwanted wal files and
> how will it delete them ?

Have you heard of pg_archivecleanup? Documentation is here:
https://www.postgresql.org/docs/devel/static/pgarchivecleanup.html
If a single archive is not cross-used among multiple standbys, you
could use it with archive_cleanup_command is recovery.conf to remove
unneeded WAL segments.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Begginers question

2017-08-16 Thread Michael Paquier
On Wed, Aug 16, 2017 at 2:32 PM, Alex Samad  wrote:
> 1) why did it fill up this time and not previously
> I add this
> archive_command = '/bin/true'
> wal_keep_segments = 1000 # <<< I'm guessing its this
>
> 2) how do I fix up, can I just remove the files from the pg_xlog directory

Don't do that. those files are managed by Postgres so you may finish
with a corrupted cluster. Instead you should lower the value of
wal_keep_segments, reload the server parameters, and then enforce two
checkpoints to force WAL segments to be recycled. Note that this
depends also on the values of checkpoint_segments
(max_wal_size/min_wal_size in Postgres 9.5 and onwards).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] WAL replication wrong collate

2017-08-15 Thread Michael Paquier
On Tue, Aug 15, 2017 at 4:45 AM, basti  wrote:
> i have fixed. pg_update has create a wrong cluster

Let's be sure that we are not talking about a bug here, because you
are giving no details so it is hard to know if what you are seeing is
caused by an incorrect operation, or if that's an actual bug. So,
could you answer the following questions?
1) Are you referring to pg_upgrade instead of pg_update?
2) How did you do the upgrade of the standby server?
3) What are the original version of the server and the new version?
4) What is the exact order of the operations you did that made the
state you saw happen?
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] WAL replication wrong collate

2017-08-14 Thread Michael Paquier
On Tue, Aug 15, 2017 at 3:52 AM, basti  wrote:
> master and slave had set the same locales.
> I dont unterstand that i can create a database in en_us.utf8 and then when i
> did the basebackup it's change to c locale.
> I cant find any option for pg_basebackup to set locale/collate.
> I use this howto https://wiki.postgresql.org/wiki/Streaming_Replication and
> had install an other cluster in the past with successful.
>
> How can help please?

Are you sure that you are connecting to the right standby server? The
creation of a database is WAL-logged in roughly two phases:
- Log the new pg_database record.
- Copy the directory of the template database to the new database.
So collations are conserved at replay.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_stat_statements -- Historical Query

2017-08-09 Thread Michael Paquier
On Thu, Aug 10, 2017 at 6:23 AM, anand086  wrote:
> I was looking for a way to maintain historical query details in Postgres to
> answer questions like
>
> What was the sql call rate between time X and Y?
> Did the execution count increase for the query increase between time X and
> Y?
> In past 10mins what all queries were run in the db?
>
> and few others like this.
>
> What would be best way to do it? Any thoughts?

pg_stat_statements has a function allowing to reset what the view
pg_stat_statements holds as information. You could copy periodically
the data of pg_stat_statements and then invoke
pg_stat_statements_reset to put everything back to zero. Then you
would just need to do your analysis work based on the amount of data
copied into your custom table.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Lifetime of PQexecPrepared() returned value

2017-08-04 Thread Michael Paquier
On Fri, Aug 4, 2017 at 9:12 PM, Igor Korot  wrote:
> Am I missing something? How do I fix the crash?

Based on what I can see here, I see nothing wrong. Now it is hard to
reach any conclusion with the limited information you are providing.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] standby database crash

2017-08-01 Thread Michael Paquier
On Mon, Jul 31, 2017 at 11:15 PM, Seong Son (US)  wrote:
> So my questions are, could an old WAL segment being resent through the
> network cause crash like this?  Shouldn’t Postgresql be able to handle out
> of order WAL segments instead of just crashing?

When the streaming connection between a standby and a primary is cut,
the WAL receiver would restart and try to stream from the beginning of
the last segment it was in the middle of. See RequestXLogStreaming in
walreceiverfuncs.c.

> And what would be the best way to recover the standby server?  Resynching
> the entire database seems to be too time consuming.

You may want to check the validity of the so-said WAL segment as well.
Corrupted data could come from it.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuum on streaming replication

2017-07-31 Thread Michael Paquier
On Mon, Jul 31, 2017 at 7:28 AM, Andreas Kretschmer
 wrote:
> The standby is read only, vacuum runs on the master and replicated to the 
> standby. Analyse as well.

Please note as well that if hot_standby_feedback is enabled, the
cleanup done by VACUUM on the primary is influenced as well so as
tuples that a standby may need to avoid conflicts for its transactions
are not removed. So VACUUM may result in less cleanup depending on the
read load on the standby.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fwd: getting error while parsing log file using pgbadger

2017-07-26 Thread Michael Paquier
On Wed, Jul 26, 2017 at 2:01 PM, PAWAN SHARMA  wrote:
> For root user its working fine
>
> [root@abc :/opt/PostgreSQL/9.5/data/pgaudit]#
> #-> pgbadger -f stderr postgres-2017-07-26_00.csv  -o abc.html
> [==>  ] Parsed 1873673 bytes of 1991998 (94.06%),
> queries: 0, events: 845
> LOG: Ok, generating html report...
>
> but for the Postgres user, its showing the below error.

There is no way to be sure what's wrong as this depends on the
environment you are using and the packaging system that you have, but
at short glance, I think that you should first try to update PERL5LIB
so as it points to the location where the module has been installed.
Good luck! It does not sound complicated to me to address anyway.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fwd: getting error while parsing log file using pgbadger

2017-07-26 Thread Michael Paquier
On Wed, Jul 26, 2017 at 10:59 AM, PAWAN SHARMA
 wrote:
>
> Hi All,
>
> I am facing below error while parsing log file.
>
> [postgres@abc pgaudit]$ pgbadger -f stderr postgres-2017-07-25_121445.csv
> Can't locate Text/CSV_XS.pm in @INC (@INC contains: /usr/local/lib64/perl5
> /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl
> /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at
> /bin/pgbadger line 2620.
>
> Please suggest..!!

This means that you need to install the perl module Text::CVS_XS:
http://search.cpan.org/~hmbrand/Text-CSV_XS-1.31/CSV_XS.pm
This is defined in the requirements of pgbadger's README when parsing csv files:
https://github.com/dalibo/pgbadger/blob/master/README
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Monitoring of a hot standby with a largely idle master

2017-07-26 Thread Michael Paquier
On Mon, Jul 24, 2017 at 9:08 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Sun, Jul 16, 2017 at 8:47 AM, Michael Paquier <michael.paqu...@gmail.com>
> wrote:
>> What do you think about the patch attached?
>
> Looks OK.  Should it mention specifically "On a hot standby" rather than "On
> a standby"?  Otherwise people might be left confused on how they are
> supposed to do this on a generic standby.  It is the kind of thing which is
> obvious once you know it, but confusing the first time you encounter it.

Yes, right. Let's update as you suggest.

>>  
>>   You can retrieve a list of WAL sender processes via the
>> - 
>> + 
>>   pg_stat_replication view. Large differences
>> between
>> In the previous paragraph I have noticed that the link reference is
>> incorrect. pg_stat_replication is listed under
>> monitoring-stats-dynamic-views-table.
>
> Yes, that is clearly wrong.  But why not link directly to the description of
> the view itself, pg-stat-replication-view, rather than the correct table
> which mentions the view?  Is that the accepted docs style to link to the
> more generic place?  (Same thing applies to your patch, it could link
> directly to pg-stat-wal-receiver-view.

Yes, that's even better.

> Sorry for the delay, it took me awhile to get the new doc build system to
> work (solution seems to be, "Don't use CentOS6 anymore")

:)
Thanks for the review.
-- 
Michael


walreceiver-doc-v2.patch
Description: Binary data

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_dump not dropping event trigger

2017-07-23 Thread Michael Paquier
On Sun, Jul 23, 2017 at 5:48 PM, Tom Lane  wrote:
> Greg Atkins  writes:
>> would you like a bug report to track this?
>
> No, it's already dealt with.  In any case, your original email was good
> enough --- we track bugs these days more by message-ID than anything else.

This has been fixed by the following commit, which will be present in
the next round of minor releases planned for the second week of August
(https://www.postgresql.org/developer/roadmap/):
https://git.postgresql.org/pg/commitdiff/93f039b4944fdf806f029ed46cf192bc9021d8e7
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Logging at schema level

2017-07-21 Thread Michael Paquier
On Fri, Jul 21, 2017 at 8:21 AM, John R Pierce  wrote:
> if you have per schema logging, where should that get logged ?
>
> you could implement per DATABASE logging, if you A) add the database name to
> the log_prefix, and B) feed your logs to a program that understands this and
> splits them out to a log file per database.you could also do this on a
> per user basis. but, schema is something very dynamic, its a namespace
> within a database, and queries can touch multiiple schemas.

Personally, I understand that as logging the query N times, once per
schema, if it touches N schemas, making the exercise part of parsing.
I think that it would be possible to use the parser hook to achieve
that actually, as you need extra lookups for WITH clauses and such.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming Replication archive_command is really needed?

2017-07-21 Thread Michael Paquier
On Fri, Jul 21, 2017 at 8:15 AM, Andreas Kretschmer
<andr...@a-kretschmer.de> wrote:
> Am 21.07.2017 um 08:01 schrieb Michael Paquier:
>> "No" is not completely exact and lacks in details. There are two cases
>> where having an archive is helpful:
>> 1) The standby has disconnected from its primary for a time long
>> enough that WAL segments have been rotated by two completed
>> checkpoints. If that happens, when the standby reconnects it would
>> fail, and you would need to take a new base backup.
>
> you can prevent that using replication slots, but i'm pretty sure you
> (Michael) knows that ;-)
> http://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-replication-phydical-slots/

There is a typo on my URL here. Well that's too late to fix it even if
that's bad style.

>> 2) Backup strategies. Keeping a larger history set of WAL segments is
>> helpful for incremental backups, which is partially the point actually
>> raised upthread about PITR.
>
> Ack, that's right. Using both (streaming and wal-shipping/archiving) will
> make it more robust, and you have (with archiving) the posibility for PITR.
> BUT, you can build a streaming replication without archiving, even you can
> build a continuous backup using only streaming (Barman, streaming only
> mode).

Backup solutions developed by experts on the topic are paths to reliability.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming Replication archive_command is really needed?

2017-07-21 Thread Michael Paquier
On Thu, Jul 20, 2017 at 10:07 PM, Leonardo M. Ramé  wrote:
> El 20/07/17 a las 16:57, Andreas Kretschmer escribió:
>> On 20 July 2017 21:46:09 GMT+02:00, "Leonardo M. Ramé"
>>  wrote:
>>>
>>> Hi, I wonder if archive_mode=on and archive_command parameters in
>>> postgresql.conf are really needed for streaming replication between two
>>>
>>> servers (master-slave).
>>
>> No.
>
> So, can I just comment those commands and remove the main/archive directory?

"No" is not completely exact and lacks in details. There are two cases
where having an archive is helpful:
1) The standby has disconnected from its primary for a time long
enough that WAL segments have been rotated by two completed
checkpoints. If that happens, when the standby reconnects it would
fail, and you would need to take a new base backup.
2) Backup strategies. Keeping a larger history set of WAL segments is
helpful for incremental backups, which is partially the point actually
raised upthread about PITR.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Logging at schema level

2017-07-20 Thread Michael Paquier
On Fri, Jul 21, 2017 at 7:10 AM, Nikhil  wrote:
> I am using postgresql schema feature for multi-tenancy. can we get
> postgresql logs at schema level. Currently it is for the whole database
> server (pg_log)

Not directly. All the log-related parameters can be controlled by a
superuser, so you could leverage things at application level with some
SET commands to switch them dynamically.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Two-phase commit case studies

2017-07-19 Thread Michael Paquier
On Wed, Jul 19, 2017 at 7:10 PM, Gavin Wahl  wrote:
> I've read the documentation for two-phase commit and it sounds interesting, 
> but
> I'm having trouble thinking of how to actually put it into production. How are
> people using it? Do you use a XA transaction manager or something custom 
> built?
> How dos the reconciliation work when a component crashes between PREPARE and
> COMMIT? Is anyone using it in the open-source world where transaction managers
> are unheard of?
>
> Looking forward to hearing about how two-phase commit has helped you.

One direct use of 2PC, which is for what it has been originally
designed, is to ensure the consistency of a transaction commit across
multiple servers. One thing using extensively 2PC is for example
Postgres-XL (formerly Postgres-XC that introduced the concept), to
commit a transaction across nodes when a transaction involves writes
to multiple nodes.

Postgres JDBC has XA support by the way:
https://jdbc.postgresql.org/documentation/faq.html#xa-support
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Monitoring of a hot standby with a largely idle master

2017-07-16 Thread Michael Paquier
On Fri, Jul 14, 2017 at 9:11 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Thu, Jul 13, 2017 at 10:38 AM, Michael Paquier
> <michael.paqu...@gmail.com> wrote:
>>
>> On Thu, Jul 13, 2017 at 7:23 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> >
>> > I think that pg_stat_wal_receiver should be crossreferenced in
>> > https://www.postgresql.org/docs/9.6/static/hot-standby.html, near the
>> > same
>> > place which it crossreferences table 9-79.  That would make it more
>> > discoverable.
>>
>> Hm. Hot standby may not involve streaming replication. What about a
>> paragraph here instead?
>>
>> https://www.postgresql.org/docs/devel/static/warm-standby.html#streaming-replication
>>
>> In the monitoring subsection, we could tell that on a standby the WAL
>> receiver status can be retrieved from this view when changes are
>> streamed. What do you think?
>
>
> That works for me.

What do you think about the patch attached?

 
  You can retrieve a list of WAL sender processes via the
- 
+ 
  pg_stat_replication view. Large differences between
In the previous paragraph I have noticed that the link reference is
incorrect. pg_stat_replication is listed under
monitoring-stats-dynamic-views-table.
-- 
Michael


walreceiver-doc.patch
Description: Binary data

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What is exactly a schema?

2017-07-14 Thread Michael Paquier
On Fri, Jul 14, 2017 at 12:00 PM, marcelo  wrote:
> The question is not trivial. Could I maintain two or three separate/distinct
> "versions" of same database using one schema for every of them?
> Could some tables (in the public schema) be shared among all the schemas?

The documentation gives a good starting point:
https://www.postgresql.org/docs/9.6/static/ddl-schemas.html
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Monitoring of a hot standby with a largely idle master

2017-07-13 Thread Michael Paquier
On Thu, Jul 13, 2017 at 7:23 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Thu, Jul 13, 2017 at 1:15 AM, Michael Paquier <michael.paqu...@gmail.com>
> wrote:
>>
>> On Thu, Jul 13, 2017 at 5:26 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> >
>> > I think that none of the recovery information functions
>> >
>> > (https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE)
>> > can distinguish a hot standby which is connected to an idle master,
>> > versus
>> > one which is disconnected.  For example, because the master has crashed,
>> > or
>> > someone has changed the firewall rules.
>> >
>> > Is there a way to monitor from SQL the last time the standby was able to
>> > contact the master and initiate streaming with it?  Other than trying to
>> > write a function that parses it out of pg_log?
>>
>> Not directly I am afraid. One way I can think about is to poll
>> periodically the state of pg_stat_replication on the primary or
>> pg_stat_wal_receiver on the standby and save it in a custom table. The
>> past information is not persistent as any replication-related data in
>> catalogs is based on the shared memory state of the WAL senders and
>> the WAL receiver, and those are wiped out at reconnection.
>
>
> Thanks, that looks like what I want (or will be, once I get the other side
> to upgrade to 9.6).
>
> I think that pg_stat_wal_receiver should be crossreferenced in
> https://www.postgresql.org/docs/9.6/static/hot-standby.html, near the same
> place which it crossreferences table 9-79.  That would make it more
> discoverable.

Hm. Hot standby may not involve streaming replication. What about a
paragraph here instead?
https://www.postgresql.org/docs/devel/static/warm-standby.html#streaming-replication

In the monitoring subsection, we could tell that on a standby the WAL
receiver status can be retrieved from this view when changes are
streamed. What do you think?
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Associating a basebackup and it's .backup WAL file

2017-07-13 Thread Michael Paquier
On Thu, Jul 13, 2017 at 10:30 AM, cen  wrote:
> Given a basebackup base.tar.gz and an archive of WAL files, is there any way
> to find out which .backup WAL file is associated with the basebackup from
> command line?

Not from what Postgres ships directly. Without any custom meta data
save with each one of your backups, say something that you write after
calling pg_basebackup, you would need to untar base.tar to look for
the backup_label file.

> My use case is for a retention policy bash script which:
> -deletes all basebackups older than X days
> -runs pg_archivecleanup for the oldest basebackup
>
> I just don't know how to find out which WAL to feed to pg_archivecleanup at
> this point.

Recalling something I know about, pg_rman uses its own meta data to do
this decision making with dedicated folder names that use a structure
and names based on timestamps, and this meta data is written and saved
when each backup is taken. This saves future lookups at all tarballs
when doing cleanup of past backups.

I am not sure about the more popular barman and pgBackrest since I
know them less, but I would imagine they handle retention policies
similarly.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Monitoring of a hot standby with a largely idle master

2017-07-13 Thread Michael Paquier
On Thu, Jul 13, 2017 at 5:26 AM, Jeff Janes  wrote:
>
> I think that none of the recovery information functions
> (https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE)
> can distinguish a hot standby which is connected to an idle master, versus
> one which is disconnected.  For example, because the master has crashed, or
> someone has changed the firewall rules.
>
> Is there a way to monitor from SQL the last time the standby was able to
> contact the master and initiate streaming with it?  Other than trying to
> write a function that parses it out of pg_log?

Not directly I am afraid. One way I can think about is to poll
periodically the state of pg_stat_replication on the primary or
pg_stat_wal_receiver on the standby and save it in a custom table. The
past information is not persistent as any replication-related data in
catalogs is based on the shared memory state of the WAL senders and
the WAL receiver, and those are wiped out at reconnection.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] debugging SSL connection problems

2017-07-11 Thread Michael Paquier
On Tue, Jul 11, 2017 at 6:32 AM, Magnus Hagander  wrote:
> On Mon, Jul 10, 2017 at 11:19 PM, Jeff Janes  wrote:
>> Is there a way to get libpq to hand over the certificate it gets from the
>> server, so I can inspect it with other tools that give better diagnostic
>> messages?  I've tried to scrape it out of the output of "strace -s8192", but
>> since it is binary it is difficult to figure out where it begins and ends
>> within the larger server response method.
>>
>
> PQgetssl() or PQsslStruct() should give you the required struct from
> OpenSSL, which you can then use OpenSSL to inspect. You should be able to
> use (I think) SSL_get_peer_certificate() to get at it.

Yes that will work. The SSL context stored in PGconn offers enough
entry point to access all the SSL-related data.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_start/stop_backup non-exclusive scripts to snapshot

2017-07-05 Thread Michael Paquier
On Wed, Jul 5, 2017 at 10:47 PM, Melvin Davidson  wrote:
> Your criticism is noted, however, I have used it many times in the past with 
> absolutely no problem.

Plug off the server on which is stored the backup just after your
script finishes, you have a good chance to be surprised if you try to
restore from this backup later on.

> I submitted that script as a possible solution
> to the op's problem/question. If you have an alternate solution or can make 
> improvements to it, then I am sure the op and I would welcome them.

Stephen has mentioned two of them, with hundreds of man hours spent in
developing those backup tools to be robust solutions, done by
specialists on the matter.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] duplicate key value violates unique constraint and duplicated records

2017-07-02 Thread Michael Paquier
On Mon, Jul 3, 2017 at 10:08 AM, Steven Chang  wrote:
> Hello :

Please avoid top-posting.

>PG  VERSION : PPAS 9.3 , enterprisedb
>os   version :  2.6.32-358.el6.x86_64

This is EnterpriseDB's fork of Postgres. Until it can be proved that a
corruption has happened using the community code, it is going to be
hard to say if the problem comes from PostgreSQL itself or from
something that has been changed there.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Suddenly - LOG: could not open file "postmaster.pid": No such file or directory

2017-06-22 Thread Michael Paquier
On Fri, Jun 23, 2017 at 1:43 PM, Tom Lane  wrote:
> Muhammad Hanif Abdul Hamid  writes:
>> Within this week, I have faced two times "postmaster.pid" went missing in a 
>> sudden. Nobody deleted it or shutdown the server.
>
> I'd lay very long odds that your second statement is false.  *Something*
> removed that file, and it wasn't the postmaster.

Yup, and the postmaster is designed to stop if it finds out that
postmaster.pid is removed. See that:
commit: 7e2a18a9161fee7e67642863f72b51d77d3e996f
author: Tom Lane 
date: Tue, 6 Oct 2015 17:15:52 -0400
Perform an immediate shutdown if the postmaster.pid file is removed
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Error with DefineSavepoint:Unexpected state STARTED

2017-06-19 Thread Michael Paquier
On Tue, Jun 20, 2017 at 11:12 AM, Computer Study
 wrote:
> The UI application is to add/remove some permissions through a listbox  for
> a certain user. For any add/remove, it will first delete all records for
> this certain user in the table, then insert the values chose from UI.
>
> In my code snippet, it first execute the DELETE, then execute multiple
> INSERT to insert selected options into the table. After all done, commit.
>
> The server is in remote and I don't have detailed logs at this moment. Will
> try to ask for that.
>
> My question:  for that error 'DefineSavepoint: unexpected state STARTED',
> not quite sure happens in which step? In the middle of DELETE or INSERT?

This error has been around for a while, and funnily you are the second
reporter of this issue within a couple of weeks:
https://www.postgresql.org/message-id/0A3221C70F24FB45833433255569204D1F6BE40D@G01JPEXMBYT05
I have been thinking a bit about how to fix that, and wondered about
using a new transaction status to track that, but that finished by
being rather intrusive..
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Michael Paquier
On Tue, Jun 6, 2017 at 1:52 PM, Bhattacharyya, Subhro
 wrote:
> Our expectation is that slave will be able to sync with the new master with 
> the help of whatever WALs are present in the new master due to replication 
> slots.
> Can pg_rewind still work without WAL archiving in this scenario.

I see. Yes, the slot on the old primary would keep retaining WAL, and
the promoted standby would stop sending feedback once it has switched
to a new timeline so that should work. Don't forget to drop the drop
on the old primary after pg_rewind has been run, you don't want to
bloat its pg_xlog with useless data.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Replication slot and pg_rewind

2017-06-05 Thread Michael Paquier
On Tue, Jun 6, 2017 at 12:03 PM, Bhattacharyya, Subhro
 wrote:
> We are using the replication slot and pg_rewind feature of postgresql 9.6
> Our cluster consists of 1 master and 1 slave node.
>
> The replication slot feature allows the master to keep as much WAL as is
> required by the slave.
>
> The pg_rewind command uses WALs to bring the slave in sync with the master.
> By using replication slots there are always enough WAL in the pg_xlog.
>
> In this case is it safe to use pg_rewind without WAL archiving?
> Can there be a situation where pg_rewind fails?

When pg_rewind runs it looks at the WAL from the last checkpoint
before WAL diverged on the *target* node, not the source. So retaining
the WAL data on the primary after the standby has been promoted makes
little sense from this point of view. Even worse, once the promoted
standby decides to recycle the past WAL segments you won't be able to
do a rewind of the previous primary because there is no way to know
what are the blocks modified on the standby since the point of
divergence.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_rewind - enable wal_log_hints or data-checksums

2017-06-04 Thread Michael Paquier
On Mon, Jun 5, 2017 at 9:37 AM, Dylan Luong  wrote:
> pg_rewind requires that the target server either has the wal_log_hints
> option enabled in postgresql.conf or data checksums enabled when the cluster
> was initialized with initdb.

Yes, this is to make sure that you don't finish with a corrupted
target server if a hint bit is set on a page after a checkpoint. Any
of those options make sure that a full-page write is generated in this
case.

> What is the difference between the two options?

Data checksums calculate 2 bytes of checksum data and write it to each
page that is evicted from shared buffers. Each page read from disk has
its checksum checked. In some workloads, like a heavy read load where
a lot of page evictions happen, this can induce a couple of percents
of performance lost. In my own experience, that's 1~2%.

> What are the advantages and disadvantages between the two?
> Which one is the the preferred option?

If you care more about performance or if you use a file system that
has its own block-level checksum, wal_log_hints would be preferred.
Data checksums offer more guarantees in terms of integrity though when
looking for corrupted data. Things get found more quickly.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] wal_retrieve_retry_interval

2017-05-29 Thread Michael Paquier
On Sun, May 28, 2017 at 9:41 AM, Ludovic Vaugeois-Pepin
 wrote:
> On Sun, May 28, 2017 at 9:49 AM, Ludovic Vaugeois-Pepin
>  wrote:
>> It really is set at 3s on all servers (master and standbies) earlier in the 
>> "deployment" process at the same time "listen_addresses", "hot_standby", and 
>> others are set. This doesn't seem to happen every time I run tests. I 
>> increased logging to DEBUG1. This is what I got when the problem occurred 
>> again. Note that the empty line is there in the log file.

Aren't those logs coming from a standby that is being reconnected to a
promoted standby? In short you restarted this standby with a fresh
recovery.conf. The WAL receiver would try to fetch history file data
for validity checks when it notices that there is a timeline jump,
when it needs to itself jump to a new timeline, and the logs you are
showing highlight that.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] wal_retrieve_retry_interval

2017-05-27 Thread Michael Paquier
On Sat, May 27, 2017 at 2:40 PM, Ludovic Vaugeois-Pepin
 wrote:
> Say, with 9.6.2,  a hot_standby fails to connect to a replication slot:
> FATAL:  could not start WAL streaming: ERROR:  replication slot "test3"
> does not exist
> or
> FATAL:  could not connect to the primary server: FATAL:  the database
> system is starting up
>
> Is there a way to reduce the time it takes until the next attempt? I
> assumed, wrongly I think, that this would be wal_retrieve_retry_interval,
> but it seems that it won't make a difference. I tried setting it to 3s, but
> it seems to take 15s still. Here are two log samples:

Could you double-check your configuration? If I set
wal_retrieve_retry_interval to 1s on a standby, I am able to see a
connection attempt from a WAL receiver happening with this interval of
time in the case of repetitive failures.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Re: Is there possibility btree_redo with XLOG_BTREE_DELETE done between standby_redo and the end of backup

2017-05-25 Thread Michael Paquier
On Thu, May 25, 2017 at 5:23 AM, y39chen  wrote:
> My doubt is Standby is redoing the records in WAL from master. how accept
> connection in standby side while recovering would trigger
> btree_xlog_delete_get_latestRemovedXid() and panic happen.

You should look at the relationship between the code of postmaster.c
dealing with updates of pmState and how the startup process (xlog.c)
lets the postmaster know when it can accept incoming connections. Once
a set of conditions is met, the startup process will let the
postmaster know if it is safe to accept connections on a hot standby.
That's a good study and the code is well-commented, so I let you guess
what are those conditions and how they are met during recovery.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there possibility btree_redo with XLOG_BTREE_DELETE done between standby_redo and the end of backup

2017-05-24 Thread Michael Paquier
On Wed, May 24, 2017 at 11:47 AM, Tom Lane  wrote:
> Wup, sorry, I quoted the wrong thread mentioning
> btree_xlog_delete_get_latestRemovedXid.  The right one is
>
> https://www.postgresql.org/message-id/flat/00F0B2CEF6D0CEF8A90119D4%40eje.credativ.lan
>
> and the patch is
>
> Author: Tom Lane 
> Branch: master [0e0f43d6f] 2016-08-31 08:52:13 -0400
> Branch: REL9_6_STABLE Release: REL9_6_0 [3fc489cb3] 2016-08-31 08:52:13 -0400
> Branch: REL9_5_STABLE Release: REL9_5_5 [c40bb1155] 2016-08-31 08:52:13 -0400
> Branch: REL9_4_STABLE Release: REL9_4_10 [f4e40537e] 2016-08-31 08:52:13 -0400
> Branch: REL9_3_STABLE Release: REL9_3_15 [baf111d31] 2016-08-31 08:52:13 -0400
> Branch: REL9_2_STABLE Release: REL9_2_19 [823df401d] 2016-08-31 08:52:13 -0400
> Branch: REL9_1_STABLE Release: REL9_1_24 [e3439a455] 2016-08-31 08:52:13 -0400
>
> Prevent starting a standalone backend with standby_mode on.

Ahh. This one... Thanks. That may be related, but no way to say if
that's related with this report as there is no information that a
standalone backend is used. Perhaps that's the case though. So it
could be possible as well that the set of custom patches mentioned is
fishy.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there possibility btree_redo with XLOG_BTREE_DELETE done between standby_redo and the end of backup

2017-05-24 Thread Michael Paquier
On Wed, May 24, 2017 at 11:18 AM, Tom Lane  wrote:
> y39chen  writes:
>> is there possibility btree_redo with XLOG_BTREE_DELETE info done between
>> standby_redo and  the end of backup? I have PostgreSQL 9.3.14 which have
>> some patches to use and easy to happen.
>> ...
>> //inconsistent detected in btree_xlog_delete_get_latestRemovedXid()
>> [26005-59251d35.6595-726087] 2017-05-24 05:42:22.513 GMT < > WARNING:
>> 01000: btree_xlog_delete_get_latestRemovedXid: cannot operate with
>> inconsistent data
>
> This looks a whole lot like a bug we fixed in 9.3.15, cf
>
> https://www.postgresql.org/message-id/flat/DB5PR07MB15416C65687A1EA9AC0D26F8D6E00%40DB5PR07MB1541.eurprd07.prod.outlook.com

To which commit are you referring here? I don't recall any fixes
related to that, and there is nothing in the git history indicating
so.

Still I don't understand how this PANIC code can be reachable with
community code. CountDBBackends() will normally return 0 if
consistency is not reached, and postmaster will block incoming
connections until a consistent state is reached.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pgAdmin4 needs information of v10 SCRAM authentication

2017-04-24 Thread Michael Paquier
On Mon, Apr 24, 2017 at 3:04 PM, Akshay Joshi
 wrote:
>I have gone through this, but still facing issue to encrypt/change the 
> database server password. In pgAdmin4 we have "Change Password" feature where 
> user will enter the old and new password for the database server, now we will 
> have to encrypt it (in Python) as per SCRAM standards and set it to the 
> database.

By using SET password_encryption = 'scram-sha-256' and sending the raw
password you would be able to hash the password correctly. Or you
could just mimic scram_build_password() (routine in Postgres code to
generate that correctly).

>   The example you have given in 
> https://www.postgresql.org/message-id/76ac7e67-4e3a-f4df-e087-fbac90151...@iki.fi
>  I have below questions:
> To encode the password you already have entry from pg_authid table which 
> won't be possible for non superuser to access that table. How we can get that 
> value from pg_authid table or do we have any other solution to this.

I don't understand this question, any user can update this field using
CREATE/ALTER ROLE, and the client has no need to know this value for
the exchange.

> For constructing  the whole client-final-message, we need to calculate  
> ClientSignature and ClientProof, which depend on the nonces, and is  
> therefore different on every authentication exchange. How to calculate 
> ClientSignature and ClientProof?
>   Can you please guide me here, how can we achieve that in python.

You will need a C equivalent of what is proposed in fe-auth-scram.c in
the Postgres code to build the messages that are exchanged from the
server, see particularly calculate_client_proof() which describes step
by step the calculation of the client proof when building the last
message for the client. I didn't check in details, but the routines
are the same as in the message above. The format of the hashed
password has changed a bit since commit 68e61ee though.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_basebackup ----xlog-method=stream

2017-04-21 Thread Michael Paquier
On Sat, Apr 22, 2017 at 3:02 AM, Ian Harding  wrote:
> Am I misunderstanding how this works?  I have WAL archiving set up, so the
> files are available, but I wanted them included in the backup.

Please note that if you have a WAL archive available, you may not even
need to have --xlog-method=stream, which is useful to have fully
safe-contailed backups. For the error regarding the missing WAL
segment, it could happen if Postgres completes two checkpoints when
taking the backup. One thing that you could use is a replication slot
that gives the guarantee that segments are retained. Those have been
integrated in pg_basebackup with 9.6.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pgAdmin4 needs information of v10 SCRAM authentication

2017-04-21 Thread Michael Paquier
On Fri, Apr 21, 2017 at 3:43 PM, Akshay Joshi
 wrote:
>Thanks Michael, will check this.

One thing I forgot to mention... Both StoredKey and ServerKey are now
encoded in hex, but there is still an open item related to the
handling of psql's \password on which I have written a patch to switch
their encoding to base64 for simplicity. Not sure what is Heikki's
take on the matter, but I would recommend to be careful about that. My
last set of patches is here:
https://www.postgresql.org/message-id/CAB7nPqSbsCBCxy8-DtwzRxYgTnbGUtY4uFEkLQhG=R=uo=g...@mail.gmail.com
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pgAdmin4 needs information of v10 SCRAM authentication

2017-04-21 Thread Michael Paquier
On Fri, Apr 21, 2017 at 3:27 PM, Akshay Joshi
 wrote:
> from passlib.hash import scram
> hash = scram.encrypt(data['newPassword']) -- This function provide password 
> for all the supported digest like [md5, sha-1, sha-256, sha-512]. Didn't work 
> I have tried with all the passwords.
> test = scram.extract_digest_info(hash, "sha-256") -- This function extract 
> info for specified digest "sha-256". I have retrieve the password which was 
> in hexadecimal. Didn't work as well.
>
> Now I am stuck here and no clue how to encrypt/decrypt the password for SCRAM 
> authentication. Can someone guide me out here.

Here you go:
https://www.postgresql.org/message-id/76ac7e67-4e3a-f4df-e087-fbac90151...@iki.fi
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] financial formulae

2017-04-13 Thread Michael Paquier
On Thu, Apr 13, 2017 at 8:44 PM, FarjadFarid(ChkNet)
 wrote:
> Is there a package/library, for postgresql server, covering financial
> formula equivalent to excel financial formulae?
>
> Especially considering the quality and precision.

You will need to be more precise regarding what you are looking for.
PostgreSQL integrates in-core many aggregate functions that work with
different level of precisions depending on the data type (see for
example functions for float, numeric, etc.), so my gut feeling is that
what you are looking for is present.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Suggestion to improve select pg_reload_conf()

2017-04-03 Thread Michael Paquier
On Mon, Apr 3, 2017 at 4:39 PM, Achilleas Mantzios
 wrote:
> On 03/04/2017 10:31, Thomas Kellerer wrote:
>>
>> I would like to suggest an improvement to the select pg_reload_conf()
>> function.
>>
>> Currently this will only return true or false indicating if reloading was
>> successful.
>>
>> I think it would be a "nice-to-have" if the function would also return the
>> GUCs that have been changed, similar to what is being written to the
>> logfile.
>>
>> To not break existing code (e.g. scripts that only expect true/false),
>> this could be done through an optional boolean parameter (e.g. named
>> "verbose").
>
> To my understanding pg_reload_conf just SIGHUP's the postmaster. So it might
> not be so trivial.

Yup. But there is a workaround possible at SQL-level to know what are
the parameters that would be updated on SIGHUP by comparing the values
in pg_file_settings that are the values found in the configuration
files with the current settings applied in pg_settings.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index loading methods

2017-03-27 Thread Michael Paquier
On Mon, Mar 27, 2017 at 8:36 PM, pinker  wrote:
> If PostgreSQL decides to use an index, does he every time load the whole
> B-tree into memory? or maybe loads only specific subtree or some chunks of
> index?

src/backend/access/nbtree/README provides details about the algorithm
of Lehman and Yao used in btree indexes. In short, backends share
buffer pages, and those will stay around if they are hot enough. If
there is a high page eviction for those btree pages, you may finish by
needing to reload the tree, but with a proper tuning (enough memory)
that won't happen.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-03-26 Thread Michael Paquier
On Sun, Mar 26, 2017 at 2:50 PM, John R Pierce  wrote:
> On 3/25/2017 10:31 PM, George Neuner wrote:
>>
>> I also dislike having my email full of list posts.
>
> I have a 'postgres' folder in Thunderbird, and all posts "To:" or CC:
> pgsql-*@postgresql.org get moved there, that way they are together, and
> don't clutter my other email.I have similar folders for the various
> other email lists I'm on.

If you have subscribed to more mailing lists than -general, having one
subfolder per list can also help a lot, grouping as well some of those
having a low activity, for example:
- one folder for -hackers and -hackers-cluster.
- one folder for -general.
- one folder for -jdbc and -odbc.
- one for -bugs and -docs.
- one for -jobs and -announce, etc.
Something like that will make your hacking activity way easier to
handle. I would bet that a lot of people around here do that.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


  1   2   3   4   5   6   >