Re: Does PostgreSQL check database integrity at startup?
For context: this was first reported in the Barman forum here: https://groups.google.com/forum/#!msg/pgbarman/3aXWpaKWRFI/weUIZxspDAAJ They are using Barman for the backups. Stephen Frost wrote: > > But at some point in time, slave became corrupt (one of the base > > files are zero size where it should be 16Mb in size), and IMHO a > > "red alert" should arise - Slave server shall not even startup at > > all. > > How do you know it should be 16Mb in size...? That sounds like you're > describing a WAL file, but you should be archiving your WAL files during > a backup, not just using whatever is in pg_xlog/pg_wal.. It's not a WAL file -- it's a file backing a table. > > Since backups are taken from slave server, all backups are also corrupt. > > If you aren't following the appropriate process to perform a backup > then, yes, you're going to end up with corrupt and useless/bad backups. A few guys went over the backup-taking protocol upthread already. But anyway the backup tool is a moot point. The problem doesn't originate in the backup -- it originates in the standby, from where the backup is taken. The file can be seen as size 0 in the standby. Edson's question is: why wasn't the problem detected in the standby? It seems a valid question to me, to which we currently we don't have any good answer. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Does PostgreSQL check database integrity at startup?
Edson, * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote: > Em 27/12/2017 15:02, Stephen Frost escreveu: > >* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > >>Stephen Frost wrote: > >> > >>>* Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote: > Anyway, instead digging into rsync functionality (or bugs - I doubt, > but who knows?), I do prefer to have a script I can run to check if > there is obvious failures in standby servers. > >>>As mentioned, zero-byte files can be perfectly valid. PostgreSQL does > >>>have page-level CRCs, if you initialized your database with them (which > >>>I would strongly recommend). > >>Page-level checksums would not detect the problem being complained in > >>this thread, however. > >It's entirely unclear to me what the problem being complained about in > >this thread actually is. The complaint so far was about zero-byte > >files, but those are entirely valid, so that isn't a problem that anyone > >can solve.. > > > >Given the thread subject, if someone actually wanted to do a database > >integrity check before startup, they could use pgBackRest to perform a > >backup with a CRC-enabled database and at least verify that all of the > >checksums are valid. > > > >We could possibly look into adding some set of additional checks for > >files which can't actually be zero-byte, perhaps.. I know we have some > >other one-off checks already. > > Actually, the problem is: > > Master => Slave => Backup > > In the master server everything is fine. Alright.. > But at some point in time, slave became corrupt (one of the base > files are zero size where it should be 16Mb in size), and IMHO a > "red alert" should arise - Slave server shall not even startup at > all. How do you know it should be 16Mb in size...? That sounds like you're describing a WAL file, but you should be archiving your WAL files during a backup, not just using whatever is in pg_xlog/pg_wal.. > Since backups are taken from slave server, all backups are also corrupt. If you aren't following the appropriate process to perform a backup then, yes, you're going to end up with corrupt and useless/bad backups. Backing up from a replica has only been officially supported using the pg_start/stop_backup methods as of 9.6 and only when doing a non-exclusive backup. Note that the wiki page you're talking about (I think, anyway...) is describing *exclusive* backup, not non-exclusive, and the two are not the same. > I've detected the problem just because I've restored a backup > (excellent practice perhaps - nobody should take backups if not > testing it with the restore procedure). Yes, restoring a backup is excellent practice and something that everyone really should be doing. In my view, at least, everyone should also be using well tested backup tools instead of trying to write their own. > In slave server there is no indication that the database is corrupt > (not in logs, it starts normally and show it is applying stream > changes regularly). This sounds like what's happening is that you're ending up with PG thinking that a crash happened because backup_label is missing, which will happen if you do pg_start/stop_backup on the replica and don't make sure to take the results from pg_stop_backup and create the backup_label file before starting PostgreSQL after the restore. This isn't something that would happen if you used a backup tool that knew about how to perform a non-exclusive backup or how to properly backup using a replica instead of trying to write your own without understanding how all these pieces play together. > So that is the point: how to detect that a database is corrupt so > cluster doesn't even start... I'm not sure that the database is actually corrupt in this specific case- it sounds like everything is actually fine, but you didn't include the backup_label file when restoring and therefore PG thinks there is crash recovery happening when it should be replaying WAL from the start of the backup, but how is PG going to know that? Well, it'd know that from the backup_label file, if it was there.. Otherwise, it seems pretty difficult for us to know that we're not doing crash recovery. I'm certainly open to ideas on how to detect that, but nothing springs to mind off-hand. Again, using a backup tool instead of trying to roll your own would make this much less likely to happen. This isn't something which should happen when you're using pg_basebackup or the other PG-specific backup tools, rather it's only happening because you're trying to do your own with pg_start/stop_backup and rsync and didn't completely read the documentation on non-exclusive backups from replicas (that said, those docs could certainly use improvment...). Of course, perhaps I'm misunderstanding exactly what you're doing or what file you're referring to, but this is my best guess based on the information you've provided so far. Thanks! Stephen signature.asc Description: Digital
Re: postgresql-10 for ubuntu-17.10 (artful)?
On 12/27/2017 11:07 AM, Steve Atkins wrote: >> On Dec 26, 2017, at 6:21 PM, Stuart McGrawwrote: >> Is there a repository for Postgresql-10 available at >> http://apt.postgresql.org/pub/repos/apt for Ubuntu-17.10 (artful)? >> When I look at the dist/ subdirectory there, there seem to be repos >> for all the other Ubuntu releases including an upcoming one >> (bionic) but not artful. >> >> Am I looking in the wrong place? (I am new to Ubuntu and Debian >> packaging.) > > I think that LTS releases (14.04, 16.04, 18.04) are the main releases > targeted, with intermediate releases only added if they're > incompatible with the previous LTS release. > > See the FAQ at https://wiki.postgresql.org/wiki/Apt/FAQ - I think > you should be able to use the 17.04 package on 17.10. > > Cheers, Steve Thanks. I added the 17.04 PGDG repository: deb http://apt.postgresql.org/pub/repos/apt/ zesty-pgdg main and postgresql-10 and friends installed ok. But later installing a different package: Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: postgresql-autodoc : Depends: libdbd-pg-perl but it is not going to be installed E: Unable to correct problems, you have held broken packages. further, # apt-get install libdbd-pg-perl [... same message as from ansible except: ] The following packages have unmet dependencies: libdbd-pg-perl : Depends: perlapi-5.24.1 but it is not installable # apt-get install perlapi E: Unable to locate package perlapi # apt show -a 'libdbd-pg-perl' [...output abbreviated...] Package: libdbd-pg-perl Version: 3.7.0-1~pgdg17.04+1 Depends: perl (>= 5.24.1-2ubuntu1), perl-dbdabi-94, perlapi-5.24.1, libc6 (>= 2.14), libpq5 (>= 8.4~), libdbi-perl (>= 1.614) Package: libdbd-pg-perl Version: 3.6.2-2build1 Depends: perl (>= 5.26.0-4), perl-dbdabi-94, perlapi-5.26.0, libc6 (>= 2.14), libpq5 (>= 8.4~), libdbi-perl (>= 1.614) As I said, I am new to Ubuntu packaging so whether the problem is something I need to research elsewhere and fix, or if it is an actual problem with the PGDG repository, is not clear to me. (Yes, I realize https://www.postgresql.org/download/linux/ubuntu/ says only LTS releases of Ubuntu are "fully" supported, but if it just takes a minor fix to make things work with 17.10...)
Re: Does PostgreSQL check database integrity at startup?
Alvaro, * Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote: > Stephen Frost wrote: > > > It's entirely unclear to me what the problem being complained about in > > this thread actually is. > > As Edson explained, a relfilenode in the standby server is zero bytes > long when it is not that size in the primary server, and it corresponds > to a persistent table. I don't have any satisfactory explanation for > that. Somehow I missed the discussion of it being zero bytes for a long time after it's non-zero on the primary (I take the to main that it's longer than whatever checkpoint timeout is set to..?). That does sound like there might be an actual issue/bug here. > > Given the thread subject, if someone actually wanted to do a database > > integrity check before startup, they could use pgBackRest to perform a > > backup with a CRC-enabled database and at least verify that all of the > > checksums are valid. > > That's not a complete solution, because a zero-byte file does not > contain any CRC. CRCs may detect some problems, but they will not > detect this particular kind of corruption. I agree that we can certainly have bugs which will cause corruption that's not detectable by our CRCs. Validating CRCs is just checking for one kind of corruption; using amcheck would provide another level, though it's unclear if that would help in this specific case if it's really just files on the standby (but not the primary) being zero-length long after they should be written out. > > We could possibly look into adding some set of additional checks for > > files which can't actually be zero-byte, perhaps.. I know we have some > > other one-off checks already. > > Some possibilities, from the realm of haven't-had-coffee-yet-after-lunch: > > * the length of the FSM fork can let you infer something about the > length that the main fork ought to have. Maybe the VM fork too? not sure. > (Easy to check: just some math on the size of the FSM/VM forks) I was just chatting with David over lunch specifically about checking that the number of forks was valid and looking at their lengths also makes sense, but the VM is only created/updated based on VACUUM runs, so checking that is a bit tricky. David has ideas about how to handle various races when it comes to forks (in particular by building a manifest once and then seeing if anything in the relation has changed since the manifest was built- if so, assume that WAL has the necessary info and if not then raise a flag) but it's all a bit tricky when the system is running. We might be able to do more checks if we're serious about supporting pre-startup checks though. We were also thinking about having a set of catalog tables that must not be zero and checking for those. > * the largest block number in any item pointer in any index of a table > can tell you what's the latest page that should appear in the table. > (Expensive: need to scan the indexes completely) This is something that I'd expect amcheck to handle; I'm not sure it makes sense to invent something else. David and I have chatted a little bit a while ago about trying to incorporate amcheck but it's rather complicated and not something we plan to do in the near-term. Still, it does seem like it'd be a nice capability to have. I wish it was possible to do off-line though. Of course, a user could do a restore and then run amcheck on the result themselves. Thanks! Stephen signature.asc Description: Digital signature
Re: postgresql-10 for ubuntu-17.10 (artful)?
> On Dec 26, 2017, at 6:21 PM, Stuart McGrawwrote: > > Is there a repository for Postgresql-10 available at > http://apt.postgresql.org/pub/repos/apt > for Ubuntu-17.10 (artful)? When I look at the dist/ > subdirectory there, there seem to be repos for all the > other Ubuntu releases including an upcoming one (bionic) > but not artful. > > Am I looking in the wrong place? (I am new to Ubuntu > and Debian packaging.) > > I think that LTS releases (14.04, 16.04, 18.04) are the main releases targeted, with intermediate releases only added if they're incompatible with the previous LTS release. See the FAQ at https://wiki.postgresql.org/wiki/Apt/FAQ - I think you should be able to use the 17.04 package on 17.10. Cheers, Steve
Re: Does PostgreSQL check database integrity at startup?
Em 27/12/2017 15:02, Stephen Frost escreveu: Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: Stephen Frost wrote: * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote: Anyway, instead digging into rsync functionality (or bugs - I doubt, but who knows?), I do prefer to have a script I can run to check if there is obvious failures in standby servers. As mentioned, zero-byte files can be perfectly valid. PostgreSQL does have page-level CRCs, if you initialized your database with them (which I would strongly recommend). Page-level checksums would not detect the problem being complained in this thread, however. It's entirely unclear to me what the problem being complained about in this thread actually is. The complaint so far was about zero-byte files, but those are entirely valid, so that isn't a problem that anyone can solve.. Given the thread subject, if someone actually wanted to do a database integrity check before startup, they could use pgBackRest to perform a backup with a CRC-enabled database and at least verify that all of the checksums are valid. We could possibly look into adding some set of additional checks for files which can't actually be zero-byte, perhaps.. I know we have some other one-off checks already. Thanks! Stephen Actually, the problem is: Master => Slave => Backup In the master server everything is fine. But at some point in time, slave became corrupt (one of the base files are zero size where it should be 16Mb in size), and IMHO a "red alert" should arise - Slave server shall not even startup at all. Since backups are taken from slave server, all backups are also corrupt. I've detected the problem just because I've restored a backup (excellent practice perhaps - nobody should take backups if not testing it with the restore procedure). In slave server there is no indication that the database is corrupt (not in logs, it starts normally and show it is applying stream changes regularly). So that is the point: how to detect that a database is corrupt so cluster doesn't even start... Regards, Edson
Re: Does PostgreSQL check database integrity at startup?
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > Stephen Frost wrote: > > > * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote: > > > > Anyway, instead digging into rsync functionality (or bugs - I doubt, > > > but who knows?), I do prefer to have a script I can run to check if > > > there is obvious failures in standby servers. > > > > As mentioned, zero-byte files can be perfectly valid. PostgreSQL does > > have page-level CRCs, if you initialized your database with them (which > > I would strongly recommend). > > Page-level checksums would not detect the problem being complained in > this thread, however. It's entirely unclear to me what the problem being complained about in this thread actually is. The complaint so far was about zero-byte files, but those are entirely valid, so that isn't a problem that anyone can solve.. Given the thread subject, if someone actually wanted to do a database integrity check before startup, they could use pgBackRest to perform a backup with a CRC-enabled database and at least verify that all of the checksums are valid. We could possibly look into adding some set of additional checks for files which can't actually be zero-byte, perhaps.. I know we have some other one-off checks already. Thanks! Stephen signature.asc Description: Digital signature
Re: Does PostgreSQL check database integrity at startup?
Stephen Frost wrote: > * Edson Carlos Ericksson Richter (rich...@simkorp.com.br) wrote: > > Anyway, instead digging into rsync functionality (or bugs - I doubt, > > but who knows?), I do prefer to have a script I can run to check if > > there is obvious failures in standby servers. > > As mentioned, zero-byte files can be perfectly valid. PostgreSQL does > have page-level CRCs, if you initialized your database with them (which > I would strongly recommend). Page-level checksums would not detect the problem being complained in this thread, however. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: psycopg2 and java gssapi questions
What about when the ticket expires? Are there any libraries that manage this for the application? Is this common practice by anyone? Mike -Original Message- From: Dave CramerTo: Magnus Hagander Cc: Mike Feld ; pgsql-general Sent: Thu, Dec 21, 2017 6:09 am Subject: Re: psycopg2 and java gssapi questions On 21 December 2017 at 05:27, Magnus Hagander wrote: On Wed, Dec 20, 2017 at 8:42 PM, Mike Feld wrote: Is it possible to authenticate with Postgres from astandalone application using gssapi? In other words, I am able to authenticatewith Postgres when a human has logged in to either Windows or Linux andgenerated a ticket, but is it possible for say a Django site or Javaapplication running on some server somewhere to authenticate with Postgresusing gssapi? I realize that psycopg2 has a connection parameter for “krbsrvname”,but how does it generate a ticket? Is this the only alternative to secure authentication since Postgres does not support secure ldap (ldaps)? Sure it is. libpq won't generate the initial ticket, though. The way to do it is to have your django or whatever application run "kinit" for the user before it starts. This will request a TGT, and the ticket will be present in that users environment, and will be used by the libpq client. (it might look slightly different for a Java client, but the principle is the same) JDBC docs on GSSAPI can be found https://jdbc.postgresql.org/documentation/head/connect.html Dave Cramer da...@postgresintl.com www.postgresintl.com
Re: Does PostgreSQL check database integrity at startup?
Em 26/12/2017 20:11, rob stone escreveu: Hello, On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello, David Steele wrote: pgBackRest will validate all page checksums (including indexes, etc.) in the cluster during backup. Full backups check everything, incr/differential backups check only the files that have changed. If a table or index file is of zero length when backed up, as in the described case, nothing will be checked, right? I mean, there is nothing externally indicating that the file ought to be of a different size. Am I wrong? So Edson's situation here would not raise any red flags. Could the following occur:- 1) Your app. issues a BEGIN followed by an INSERT. 2) Postgres decides to open a new file in order to store the new row. 3) Your app. then does a ROLLBACK. Wouldn't that leave you with a zero length file on disk? There's no reason for Postgres to delete the file just because a rollback was issued. All it has to do is clear the buffer in memory. My 2 cents. Rob You are right, in several databases I've looked there are many files with 0 byte size in ./base folder. This is not a good quest. Regards, Edson