Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Alvaro Herrera
For context: this was first reported in the Barman forum here:
https://groups.google.com/forum/#!msg/pgbarman/3aXWpaKWRFI/weUIZxspDAAJ
They are using Barman for the backups.

Stephen Frost wrote:

> > But at some point in time, slave became corrupt (one of the base
> > files are zero size where it should be 16Mb in size), and IMHO a
> > "red alert" should arise - Slave server shall not even startup at
> > all.
> 
> How do you know it should be 16Mb in size...?  That sounds like you're
> describing a WAL file, but you should be archiving your WAL files during
> a backup, not just using whatever is in pg_xlog/pg_wal..

It's not a WAL file -- it's a file backing a table.

> > Since backups are taken from slave server, all backups are also corrupt.
> 
> If you aren't following the appropriate process to perform a backup
> then, yes, you're going to end up with corrupt and useless/bad backups.

A few guys went over the backup-taking protocol upthread already.

But anyway the backup tool is a moot point.  The problem doesn't
originate in the backup -- it originates in the standby, from where the
backup is taken.  The file can be seen as size 0 in the standby.
Edson's question is: why wasn't the problem detected in the standby?
It seems a valid question to me, to which we currently we don't have any
good answer.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Stephen Frost
Edson,

* Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote:
> Em 27/12/2017 15:02, Stephen Frost escreveu:
> >* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> >>Stephen Frost wrote:
> >>
> >>>* Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote:
> Anyway, instead digging into rsync functionality (or bugs - I doubt,
> but who knows?), I do prefer to have a script I can run to check if
> there is obvious failures in standby servers.
> >>>As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
> >>>have page-level CRCs, if you initialized your database with them (which
> >>>I would strongly recommend).
> >>Page-level checksums would not detect the problem being complained in
> >>this thread, however.
> >It's entirely unclear to me what the problem being complained about in
> >this thread actually is.  The complaint so far was about zero-byte
> >files, but those are entirely valid, so that isn't a problem that anyone
> >can solve..
> >
> >Given the thread subject, if someone actually wanted to do a database
> >integrity check before startup, they could use pgBackRest to perform a
> >backup with a CRC-enabled database and at least verify that all of the
> >checksums are valid.
> >
> >We could possibly look into adding some set of additional checks for
> >files which can't actually be zero-byte, perhaps..  I know we have some
> >other one-off checks already.
> 
> Actually, the problem is:
> 
> Master => Slave => Backup
> 
> In the master server everything is fine.

Alright..

> But at some point in time, slave became corrupt (one of the base
> files are zero size where it should be 16Mb in size), and IMHO a
> "red alert" should arise - Slave server shall not even startup at
> all.

How do you know it should be 16Mb in size...?  That sounds like you're
describing a WAL file, but you should be archiving your WAL files during
a backup, not just using whatever is in pg_xlog/pg_wal..

> Since backups are taken from slave server, all backups are also corrupt.

If you aren't following the appropriate process to perform a backup
then, yes, you're going to end up with corrupt and useless/bad backups.

Backing up from a replica has only been officially supported using the
pg_start/stop_backup methods as of 9.6 and only when doing a
non-exclusive backup.

Note that the wiki page you're talking about (I think, anyway...) is
describing *exclusive* backup, not non-exclusive, and the two are not
the same.

> I've detected the problem just because I've restored a backup
> (excellent practice perhaps - nobody should take backups if not
> testing it with the restore procedure).

Yes, restoring a backup is excellent practice and something that
everyone really should be doing.  In my view, at least, everyone should
also be using well tested backup tools instead of trying to write their
own.

> In slave server there is no indication that the database is corrupt
> (not in logs, it starts normally and show it is applying stream
> changes regularly).

This sounds like what's happening is that you're ending up with PG
thinking that a crash happened because backup_label is missing, which
will happen if you do pg_start/stop_backup on the replica and don't make
sure to take the results from pg_stop_backup and create the backup_label
file before starting PostgreSQL after the restore.

This isn't something that would happen if you used a backup tool that
knew about how to perform a non-exclusive backup or how to properly
backup using a replica instead of trying to write your own without
understanding how all these pieces play together.

> So that is the point: how to detect that a database is corrupt so
> cluster doesn't even start...

I'm not sure that the database is actually corrupt in this specific
case- it sounds like everything is actually fine, but you didn't include
the backup_label file when restoring and therefore PG thinks there is
crash recovery happening when it should be replaying WAL from the start
of the backup, but how is PG going to know that?  Well, it'd know that
from the backup_label file, if it was there..  Otherwise, it seems
pretty difficult for us to know that we're not doing crash recovery.
I'm certainly open to ideas on how to detect that, but nothing springs
to mind off-hand.

Again, using a backup tool instead of trying to roll your own would make
this much less likely to happen.  This isn't something which should
happen when you're using pg_basebackup or the other PG-specific backup
tools, rather it's only happening because you're trying to do your own
with pg_start/stop_backup and rsync and didn't completely read the
documentation on non-exclusive backups from replicas (that said, those
docs could certainly use improvment...).

Of course, perhaps I'm misunderstanding exactly what you're doing or
what file you're referring to, but this is my best guess based on the
information you've provided so far.

Thanks!

Stephen


signature.asc
Description: Digital 

Re: postgresql-10 for ubuntu-17.10 (artful)?

2017-12-27 Thread Stuart McGraw
On 12/27/2017 11:07 AM, Steve Atkins wrote:
>> On Dec 26, 2017, at 6:21 PM, Stuart McGraw  wrote:
>> Is there a repository for Postgresql-10 available at 
>> http://apt.postgresql.org/pub/repos/apt for Ubuntu-17.10 (artful)?
>> When I look at the dist/ subdirectory there, there seem to be repos
>> for all the other Ubuntu releases including an upcoming one
>> (bionic) but not artful.
>> 
>> Am I looking in the wrong place?  (I am new to Ubuntu and Debian
>> packaging.)
> 
> I think that LTS releases (14.04, 16.04, 18.04) are the main releases
> targeted, with intermediate releases only added if they're
> incompatible with the previous LTS release.
> 
> See the FAQ at https://wiki.postgresql.org/wiki/Apt/FAQ  - I think
> you should be able to use the 17.04 package on 17.10.
> 
> Cheers, Steve

Thanks.  I added the 17.04 PGDG repository:
  deb http://apt.postgresql.org/pub/repos/apt/ zesty-pgdg main
and postgresql-10 and friends installed ok.  But later installing
a different package:

   Some packages could not be installed. This may mean that you have
   requested an impossible situation or if you are using the unstable
   distribution that some required packages have not yet been created
   or been moved out of Incoming.
   The following information may help to resolve the situation:
 The following packages have unmet dependencies:
   postgresql-autodoc : Depends: libdbd-pg-perl but it is not going
   to be installed
   E: Unable to correct problems, you have held broken packages.
 
further,

   # apt-get install libdbd-pg-perl
   [... same message as from ansible except: ]
 The following packages have unmet dependencies:
   libdbd-pg-perl : Depends: perlapi-5.24.1 but it is not installable
   # apt-get install perlapi
   E: Unable to locate package perlapi

   # apt show -a 'libdbd-pg-perl'
   [...output abbreviated...]
   Package: libdbd-pg-perl
   Version: 3.7.0-1~pgdg17.04+1
   Depends: perl (>= 5.24.1-2ubuntu1), perl-dbdabi-94, perlapi-5.24.1,
 libc6 (>= 2.14), libpq5 (>= 8.4~), libdbi-perl (>= 1.614)

   Package: libdbd-pg-perl
   Version: 3.6.2-2build1
   Depends: perl (>= 5.26.0-4), perl-dbdabi-94, perlapi-5.26.0,
 libc6 (>= 2.14), libpq5 (>= 8.4~), libdbi-perl (>= 1.614)

As I said, I am new to Ubuntu packaging so whether the problem 
is something I need to research elsewhere and fix, or if it is 
an actual problem with the PGDG repository, is not clear to me.

(Yes, I realize 
  https://www.postgresql.org/download/linux/ubuntu/
says only LTS releases of Ubuntu are "fully" supported, but if 
it just takes a minor fix to make things work with 17.10...)



Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Stephen Frost wrote:
> 
> > It's entirely unclear to me what the problem being complained about in
> > this thread actually is.
> 
> As Edson explained, a relfilenode in the standby server is zero bytes
> long when it is not that size in the primary server, and it corresponds
> to a persistent table.  I don't have any satisfactory explanation for
> that.

Somehow I missed the discussion of it being zero bytes for a long time
after it's non-zero on the primary (I take the to main that it's longer
than whatever checkpoint timeout is set to..?).  That does sound like
there might be an actual issue/bug here.

> > Given the thread subject, if someone actually wanted to do a database
> > integrity check before startup, they could use pgBackRest to perform a
> > backup with a CRC-enabled database and at least verify that all of the
> > checksums are valid.
> 
> That's not a complete solution, because a zero-byte file does not
> contain any CRC.  CRCs may detect some problems, but they will not
> detect this particular kind of corruption.

I agree that we can certainly have bugs which will cause corruption
that's not detectable by our CRCs.  Validating CRCs is just checking for
one kind of corruption; using amcheck would provide another level,
though it's unclear if that would help in this specific case if it's
really just files on the standby (but not the primary) being zero-length
long after they should be written out.

> > We could possibly look into adding some set of additional checks for
> > files which can't actually be zero-byte, perhaps..  I know we have some
> > other one-off checks already.
> 
> Some possibilities, from the realm of haven't-had-coffee-yet-after-lunch:
> 
> * the length of the FSM fork can let you infer something about the
> length that the main fork ought to have.  Maybe the VM fork too? not sure.
> (Easy to check: just some math on the size of the FSM/VM forks)

I was just chatting with David over lunch specifically about checking
that the number of forks was valid and looking at their lengths also
makes sense, but the VM is only created/updated based on VACUUM runs, so
checking that is a bit tricky.  David has ideas about how to handle
various races when it comes to forks (in particular by building a
manifest once and then seeing if anything in the relation has changed
since the manifest was built- if so, assume that WAL has the necessary
info and if not then raise a flag) but it's all a bit tricky when the
system is running.  We might be able to do more checks if we're serious
about supporting pre-startup checks though.  We were also thinking about
having a set of catalog tables that must not be zero and checking for
those.

> * the largest block number in any item pointer in any index of a table
> can tell you what's the latest page that should appear in the table.
> (Expensive: need to scan the indexes completely)

This is something that I'd expect amcheck to handle; I'm not sure it
makes sense to invent something else.  David and I have chatted a little
bit a while ago about trying to incorporate amcheck but it's rather
complicated and not something we plan to do in the near-term.  Still, it
does seem like it'd be a nice capability to have.  I wish it was
possible to do off-line though.  Of course, a user could do a restore
and then run amcheck on the result themselves.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: postgresql-10 for ubuntu-17.10 (artful)?

2017-12-27 Thread Steve Atkins

> On Dec 26, 2017, at 6:21 PM, Stuart McGraw  wrote:
> 
> Is there a repository for Postgresql-10 available at 
>  http://apt.postgresql.org/pub/repos/apt
> for Ubuntu-17.10 (artful)?  When I look at the dist/
> subdirectory there, there seem to be repos for all the
> other Ubuntu releases including an upcoming one (bionic) 
> but not artful.
> 
> Am I looking in the wrong place?  (I am new to Ubuntu 
> and Debian packaging.)
> 
> 

I think that LTS releases (14.04, 16.04, 18.04) are the main releases targeted, 
with intermediate releases only added if they're incompatible with the previous 
LTS release.

See the FAQ at https://wiki.postgresql.org/wiki/Apt/FAQ  - I think you should 
be able to use the 17.04 package on 17.10.

Cheers,
  Steve




Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Edson Carlos Ericksson Richter

Em 27/12/2017 15:02, Stephen Frost escreveu:

Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:

Stephen Frost wrote:


* Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote:

Anyway, instead digging into rsync functionality (or bugs - I doubt,
but who knows?), I do prefer to have a script I can run to check if
there is obvious failures in standby servers.

As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
have page-level CRCs, if you initialized your database with them (which
I would strongly recommend).

Page-level checksums would not detect the problem being complained in
this thread, however.

It's entirely unclear to me what the problem being complained about in
this thread actually is.  The complaint so far was about zero-byte
files, but those are entirely valid, so that isn't a problem that anyone
can solve..

Given the thread subject, if someone actually wanted to do a database
integrity check before startup, they could use pgBackRest to perform a
backup with a CRC-enabled database and at least verify that all of the
checksums are valid.

We could possibly look into adding some set of additional checks for
files which can't actually be zero-byte, perhaps..  I know we have some
other one-off checks already.

Thanks!

Stephen


Actually, the problem is:

Master => Slave => Backup

In the master server everything is fine.
But at some point in time, slave became corrupt (one of the base files 
are zero size where it should be 16Mb in size), and IMHO a "red alert" 
should arise - Slave server shall not even startup at all.


Since backups are taken from slave server, all backups are also corrupt.

I've detected the problem just because I've restored a backup (excellent 
practice perhaps - nobody should take backups if not testing it with the 
restore procedure).


In slave server there is no indication that the database is corrupt (not 
in logs, it starts normally and show it is applying stream changes 
regularly).


So that is the point: how to detect that a database is corrupt so 
cluster doesn't even start...


Regards,

Edson



Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> 
> > * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote:
> 
> > > Anyway, instead digging into rsync functionality (or bugs - I doubt,
> > > but who knows?), I do prefer to have a script I can run to check if
> > > there is obvious failures in standby servers.
> > 
> > As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
> > have page-level CRCs, if you initialized your database with them (which
> > I would strongly recommend).
> 
> Page-level checksums would not detect the problem being complained in
> this thread, however.

It's entirely unclear to me what the problem being complained about in
this thread actually is.  The complaint so far was about zero-byte
files, but those are entirely valid, so that isn't a problem that anyone
can solve..

Given the thread subject, if someone actually wanted to do a database
integrity check before startup, they could use pgBackRest to perform a
backup with a CRC-enabled database and at least verify that all of the
checksums are valid.

We could possibly look into adding some set of additional checks for
files which can't actually be zero-byte, perhaps..  I know we have some
other one-off checks already.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Alvaro Herrera
Stephen Frost wrote:

> * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote:

> > Anyway, instead digging into rsync functionality (or bugs - I doubt,
> > but who knows?), I do prefer to have a script I can run to check if
> > there is obvious failures in standby servers.
> 
> As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
> have page-level CRCs, if you initialized your database with them (which
> I would strongly recommend).

Page-level checksums would not detect the problem being complained in
this thread, however.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: psycopg2 and java gssapi questions

2017-12-27 Thread Mike Feld
What about when the ticket expires? Are there any libraries that manage this 
for the application? Is this common practice by anyone?


Mike



-Original Message-
From: Dave Cramer 
To: Magnus Hagander 
Cc: Mike Feld ; pgsql-general 
Sent: Thu, Dec 21, 2017 6:09 am
Subject: Re: psycopg2 and java gssapi questions




On 21 December 2017 at 05:27, Magnus Hagander  wrote:





On Wed, Dec 20, 2017 at 8:42 PM, Mike Feld  wrote:

Is it possible to authenticate with Postgres from astandalone application using 
gssapi? In other words, I am able to authenticatewith Postgres when a human has 
logged in to either Windows or Linux andgenerated a ticket, but is it possible 
for say a Django site or Javaapplication running on some server somewhere to 
authenticate with Postgresusing gssapi? I realize that psycopg2 has a 
connection parameter for “krbsrvname”,but how does it generate a ticket? Is 
this the only alternative to secure authentication since Postgres does not 
support secure ldap (ldaps)?


Sure it is. 


libpq won't generate the initial ticket, though. The way to do it is to have 
your django or whatever application run "kinit" for the user before it starts. 
This will request a TGT, and the ticket will be present in that users 
environment, and will be used by the libpq client. (it might look slightly 
different for a Java client, but the principle is the same)







JDBC docs on GSSAPI can be found 
https://jdbc.postgresql.org/documentation/head/connect.html





Dave Cramer

da...@postgresintl.com
www.postgresintl.com










Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread Edson Carlos Ericksson Richter

Em 26/12/2017 20:11, rob stone escreveu:

Hello,


On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello,

David Steele wrote:


pgBackRest will validate all page checksums (including indexes,
etc.) in the
cluster during backup.  Full backups check everything,
incr/differential
backups check only the files that have changed.

If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right?  I mean, there is
nothing externally indicating that the file ought to be of a
different
size.  Am I wrong?  So Edson's situation here would not raise any red
flags.



Could the following occur:-
1) Your app. issues a BEGIN followed by an INSERT.
2) Postgres decides to open a new file in order to store the new row.
3) Your app. then does a ROLLBACK.

Wouldn't that leave you with a zero length file on disk?

There's no reason for Postgres to delete the file just because a
rollback was issued. All it has to do is clear the buffer in memory.

My 2 cents.

Rob




You are right, in several databases I've looked there are many files 
with 0 byte size in ./base folder.

This is not a good quest.

Regards,

Edson