Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Thu, Oct 3, 2019 at 4:40 PM Stephen Frost <sfr...@snowman.net> wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> > > On Mon, Sep 30, 2019 at 5:26 PM Bruce Momjian <br...@momjian.us> wrote:
> > > > For full-cluster Transparent Data Encryption (TDE), the current plan is
> > > > to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
> > > > overflow).  The plan is:
> > > >
> > > > https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption
> > > >
> > > > We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
> > > > other files.  Is that correct?  Do any other PGDATA files contain user
> > > > data?
> > >
> > > As others have said, that sounds wrong to me. I think you need to
> > > encrypt everything.
> >
> > That isn't what other database systems do though and isn't what people
> > actually asking for this feature are expecting to have or deal with.
>
> Do any of said other database even *have* the equivalence of say pg_clog or
> pg_multixact *stored outside their tablespaces*? (Because as long as the
> data is in the tablespace, it's encrypted when using tablespace
> encryption..)
That's a fair question, and while I'm not specifically sure about all of
them, I do believe you're right that for some, the tablespace/database
includes that information (and the WAL) instead of having it external.  I'm
also pretty sure that there's still enough information that isn't encrypted
to at least *start* the database server.  In many ways, we are
unfortunately the oddball when it comes to having these cluster-level
things that we probably do want to encrypt (I'd be thinking more about
pg_authid here than clog, and potentially the WAL).

I've been meaning to write up a wiki page or something on this but I just
haven't found time, so I'm going to give up on that and just share my
thoughts here, and folks can do with them what they wish.

When it comes to use-cases and attack vectors, I feel like there are
really two "big" choices, and I'd like us to support both, ideally, but it
boils down to this: do you trust the database maintenance, et al,
processes, or not?  The same question, put another way, is: do you trust
having unencrypted/sensitive data in shared buffers?  Let's talk through
these for a minute.

Yes, shared_buffers is trusted implies:

- More data (usefully or not) can be encrypted: WAL, clog, multixact, pg
  statistics, et al.

- Various PG processes need to know the decryption keys (autovacuum and
  crash recovery being big ones) ... ideally, we could still *start*,
  which is why I continue to argue that we shouldn't encrypt
  *everything*, because not being able to even start the database system
  really sucks.  What exactly it is that we need, I don't know off-hand;
  maybe we don't need clog, but it seems likely we'll need
  pg_controldata, for example.
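To make the "vault" / startup-key question a bit more concrete, here's a
toy sketch (purely illustrative; the XOR "wrap" is not a real cipher, and
all names here are made up, not a proposal for an actual on-disk format):
the only things that have to live unencrypted on disk are a salt and the
wrapped data keys, and a passphrase-derived key-encryption key unwraps
everything else at startup.

```python
# Toy sketch of a startup "vault": a key-encryption key (KEK) derived
# from an operator passphrase unwraps the cluster's data keys.  Purely
# illustrative -- the XOR "wrap" is NOT a real cipher; a real design
# would use something like AES key wrap (RFC 3394).
import hashlib
import hmac
import os

def derive_kek(passphrase: bytes, salt: bytes) -> bytes:
    # KEK from the passphrase; only the salt (and the wrapped keys
    # below) need to sit unencrypted in PGDATA.
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000)

def wrap_key(kek: bytes, data_key: bytes):
    # "Encrypt" the data key with the KEK (toy XOR) and attach an HMAC
    # so a wrong passphrase is detected instead of yielding garbage keys.
    wrapped = bytes(a ^ b for a, b in zip(data_key, kek))
    tag = hmac.new(kek, wrapped, "sha256").digest()
    return wrapped, tag

def unwrap_key(kek: bytes, wrapped: bytes, tag: bytes) -> bytes:
    expected = hmac.new(kek, wrapped, "sha256").digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad passphrase or corrupted vault")
    return bytes(a ^ b for a, b in zip(wrapped, kek))

# Startup flow: read the salt and wrapped keys (unencrypted), get the
# passphrase, open the vault, and only then bring the rest online.
salt = os.urandom(16)
data_key = os.urandom(32)          # e.g. the heap/index encryption key
kek = derive_kek(b"correct horse", salt)
wrapped, tag = wrap_key(kek, data_key)
assert unwrap_key(kek, wrapped, tag) == data_key
```

The point of the sketch is just the layering: everything needed to reach
the "prompt for the key" step stays unencrypted, and everything past it
can be behind the vault.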
  My gut feeling on this is really that we need enough to start and open
  up the vault, which probably means that the vault needs to look more
  like what I describe below for the situation where you don't trust
  shared_buffers, to the point where we might have separate WAL/clog/et
  al for the vault itself.

- Fewer limitations (indexes can work more-or-less as-is, for example).

- Attack vectors:
  - Anything that can access shared buffers can get a ton of data.
  - Bugs in PG that expose memory can be leveraged to gain access to
    data and keys.
  - root on the system can pretty trivially gain access to everything.
  - If someone steals the disks/backups, they can't get access to much.
  - Or, if your cloud/storage vendor decides to snoop around, they
    can't see much.

No, shared_buffers is NOT trusted implies:

- We need enough unencrypted data to bring the system up, online, and
  working (crash recovery and autovacuum need to work).  This likely
  implies that things like WAL, clog, et al, have to be mostly
  unencrypted to allow those processes to work.

- Limitations on indexes (we can't have the index contain unencrypted
  data, but we also have to have autovacuum able to work ... I actually
  wonder if this might be something we could solve by encrypting the
  internal pages, leaving the TIDs exposed so that they can be cleaned
  up, but leaf pages have their own ordering, so that's not great ...
  I suspect something like this is the reason for the index limitation
  in other database systems that support column-level encryption).

- Sensitive data in the WAL is already encrypted.

- All decryption happens in a given backend when it's sending data to
  the client.

- Attack vectors:
  - root can watch network traffic or individual sessions, and possibly
    gain access to keys (certainly with more difficulty, though).
  - Bugs in PG shouldn't make it very easy for an external attacker to
    gain access to anything except what they already had access to
    (sure, they could see shared buffers and see what's in their
    backend, but everything in shared buffers that's sensitive should
    be encrypted, and for the most part what's in their backend should
    only be things they're allowed to access anyway).
  - If someone steals the disks/backups, they could potentially figure
    out more information about what was happening on the system.
  - Or, if your cloud/storage vendor decides to snoop around, they
    could possibly figure things out.

And then, of course, you can get into the fun of: well, maybe we should
support both options at the same time.

Looking at it from an attack-vector standpoint, if the concern is
primarily about external attackers coming in through SQL injection and
database bugs, not trusting shared buffers is pretty clearly the way to
go.  If the concern is about stolen hard drives or backups, well, FDE is
a great solution there, along with encrypted backups, but, sure, if we
rule those out for some reason then we can say that, yes, this will be
helpful for that kind of an attack.

In either case, we do need a vaulting system, and I think we need to be
able to start up PG, get the vault open, and accept connections.

Thanks,

Stephen