Re: [HACKERS] WIP: Data at rest encryption

Robert Haas Tue, 13 Jun 2017 08:36:04 -0700

On Mon, Jun 12, 2017 at 5:11 PM, Ants Aasma <[email protected]> wrote:
> Fundamentally there doesn't seem to be a big benefit of implementing
> the encryption at PostgreSQL level instead of the filesystem. The
> patch doesn't take any real advantage from the higher level knowledge
> of the system, nor do I see much possibility for it to do that. The
> main benefit for us is that it's much easier to get a PostgreSQL based
> solution deployed.


I agree with all of that, but ease of deployment has some value unto
itself.  I think pretty much every modern operating system has some
way of encrypting a filesystem, but it's different on Linux vs.
Windows vs. macOS vs. BSD, and you probably need to be the system
administrator on any of those systems in order to set it up.
Something built into PostgreSQL could run without administrator
privileges and work the same way on every platform we support.  That
would be useful.

Of course, what would be even more useful is fine-grained encryption -
encrypt these tables (and the corresponding indexes, toast tables, and
WAL records related to any of that) with this key, encrypt these other
tables (and the same list of associated stuff) with this other key,
and leave the rest unencrypted.  The problem with that is that you
probably can't run recovery without all of the keys, and even on a
clean startup there would be a good deal of engineering work involved
in refusing access to tables whose key hadn't been provided yet.  I
don't think we should wait to have this feature until all of those
problems are solved.  In my opinion, something coarse-grained that
just encrypts the whole cluster would be a pretty useful place to
start and would meet the needs of enough people to be worthwhile all
on its own.  Performance is likely to be poor on large databases,
because every time a page transits between shared_buffers and the
buffer cache we've got to en/decrypt, but as long as it's only poor
for the people who opt into the feature I don't see a big problem with
that.

I anticipate that one of the trickier problems here will be handling
encryption of the write-ahead log.  Suppose you encrypt WAL a block at
a time.  In the current system, once you've written and flushed a
block, you can consider it durably committed, but if that block is
encrypted, this is no longer true.  A crash might tear the block,
making it impossible to decrypt.  Replay will therefore stop at the
end of the previous block, not at the last record actually flushed as
would happen today.  So, your synchronous_commit suddenly isn't.  A
similar problem will occur any other page where we choose not to
protect against torn pages using full page writes.  For instance,
unless checksums are enabled or wal_log_hints=on, we'll write a data
page where a single bit has been flipped and assume that the bit will
either make it to disk or not; the page can't really be torn in any
way that hurts us.  But with encryption that's no longer true, because
the hint bit will turn into much more than a single bit flip, and
rereading that page with half old and half new contents will be the
end of the world (TM).  I don't know off-hand whether we're
protecting, say, CLOG page writes with FPWs.: because setting a couple
of bits is idempotent and doesn't depend on the existing page
contents, we might not need it currently, but with encryption, every
bit in the page depends on every other bit in the page, so we
certainly would.  I don't know how many places we've got assumptions
like this baked into the system, but I'm guessing there are a bunch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

Reply via email to