On 2/20/17 11:22 AM, David Christensen wrote:
>> - If an entire cluster is going to be considered as checksummed, then even
>> databases that don't allow connections would need to get enabled.
>
> Yeah, the workaround for now would be to require "datallowconn" to be set to
> 't' for all databases before proceeding, unless there's a way to connect to those
> databases internally that bypasses that check.  Open to ideas, though for a first
> pass it seems like the "datallowconn" approach is the least amount of work.

The problem with ignoring datallowconn is that any database where it's false is still fair game for CREATE DATABASE (as a template). I think just enforcing that everything's connectable is good enough for now.
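
For reference, something like this (just a sketch against the stock pg_database
catalog) would show what's in the way, and ALTER DATABASE can flip the flag for
the duration of the conversion:

    SELECT datname FROM pg_database WHERE NOT datallowconn;
    -- template0 is normally the one that shows up here
    ALTER DATABASE template0 WITH ALLOW_CONNECTIONS true;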

>> I like the idea of revalidation, but I'd suggest leaving that off of the first
>> pass.
>
> Yeah, agreed.

>> It might be easier on a first pass to look at supporting per-database checksums
>> (in this case, essentially treating shared catalogs as their own database). All
>> normal backends do per-database stuff (such as setting current_database) during
>> startup anyway. That doesn't really help for things like recovery and
>> replication though. :/ And there's still the question of SLRUs (or are those
>> not checksum'd today??).
>
> So you're suggesting that the data_checksums GUC get set in a per-database context,
> so that once checksums are fully enabled in a specific database it's treated as being
> in an enforcing state, even if the rest of the cluster hasn't completed?  Hmm, I'll
> have to think on that a bit, but it seems pretty straightforward.

Something like that, yeah.
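
(Purely as illustration of the per-database idea: today data_checksums is a
single read-only, cluster-wide GUC, so any per-database state would have to live
somewhere new. The column name below is made up, not an existing catalog field:

    SHOW data_checksums;   -- cluster-wide today, just 'on' or 'off'
    -- hypothetical per-database progress/enforcement flag:
    SELECT datname, datchecksumstate FROM pg_database;
)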

> What issues do you see affecting replication and recovery specifically (other
> than the entire cluster not being complete)?  Since the checksum changes are
> WAL-logged, it seems you'd be none the worse for wear on a standby if you had to
> change things.

I'm specifically worried about the entire cluster not being complete. That makes it harder for replicas to know what blocks they can and can't verify the checksum on.

That *might* still be simpler than trying to handle converting the entire cluster in one shot. If it's not simpler I certainly wouldn't do it right now.

>> BTW, it occurs to me that this is related to the problem we have with trying to make changes that
>> break page binary compatibility. If we had a method for handling that it would probably be useful
>> for enabling checksums as well. You'd essentially treat an un-checksum'd page as if it was an
>> "old page version". The biggest problem there is dealing with the potential that the new
>> page needs to be larger than the old one was, but maybe there's some useful progress to be had in
>> this area before tackling the "page too small" problem.
>
> I agree it's very similar; my issue is that I don't want to have to postpone
> handling a specific case while waiting for some future infrastructure.

Yeah, I was just mentioning it.
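
(For anyone following along at home, the checksum and the page layout version
are both already visible in the page header, e.g. via pageinspect; 'foo' here is
just a stand-in for any existing table:

    CREATE EXTENSION IF NOT EXISTS pageinspect;
    SELECT checksum, version FROM page_header(get_raw_page('foo', 0));
    -- checksum normally reads 0 on a cluster initdb'd without checksums;
    -- version is the page layout version (4 since 8.3)
)
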
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

