Dear Masahiko Sawada.

> -----Original Message-----
> From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> Sent: Monday, June 11, 2018 6:22 PM
> To: Moon, Insung
> Cc: PostgreSQL-development; Joe Conway
> Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key
> Management Service (KMS)
>
> On Fri, May 25, 2018 at 8:41 PM, Moon, Insung <moon_insung...@lab.ntt.co.jp>
> wrote:
> > Hello Hackers,
> >
> > This proposes a way to develop "Table-level" Transparent Data
> > Encryption (TDE) and Key Management Service (KMS) support in PostgreSQL.
> >
> >
> > Issues on data encryption of PostgreSQL
> > ==========
> > Currently, in PostgreSQL, data can be encrypted using the pgcrypto tool.
> > However, it is inconvenient to use pgcrypto to encrypt data in some cases.
> >
> > There are two significant inconveniences.
> >
> > First, if we use pgcrypto to encrypt/decrypt data, we must call pgcrypto
> > functions everywhere we encrypt/decrypt.
> > Second, we must modify much of the application program code if we want to
> > migrate to PostgreSQL from other databases that use TDE.
> >
> > To resolve these inconveniences, many users want TDE support.
> > There have also been a few proposals, comments, and questions about
> > supporting TDE in the PostgreSQL community.
> >
> > However, PostgreSQL currently does not support TDE, so the development
> > community is discussing whether it is necessary to support TDE or not.
> >
> > In these discussions, the following requirements for supporting TDE in
> > PostgreSQL were raised.
> >
> > 1) The performance overhead of encrypting and decrypting database data
> > must be minimized.
> > 2) Need to support WAL encryption.
> > 3) Need to support Key Management Service.
> >
> > Therefore, I'd like to propose a new design of TDE that deals with the
> > above requirements.
> > Since this feature will become very large, I'd like to hear opinions from
> > the community before starting to make the patch.
> >
> > First, my proposal is table-level TDE, in which the user can specify which
> > tables are encrypted.
> > Indexes, TOAST tables and WAL associated with a table that enables TDE are
> > also encrypted.
> >
> > Moreover, I want to support encryption for large objects as well.
> > But I haven't found a good way to do it so far, so I'd like to leave it as
> > a future TODO.
> >
> > My proposal has five characteristic features of "table-level TDE".
> >
> > 1) Buffer-level data encryption and decryption
> > 2) Per-table encryption
> > 3) 2-tier encryption key management
> > 4) Working with external key management services (KMS)
> > 5) WAL encryption
> >
> > Here are more details for each item.
> >
> >
> > 1. Buffer-level data encryption and decryption
> > ==================
> > Transparent data encryption and decryption accompany storage operations.
> > With the ordinary way, such as using pgcrypto, the biggest problem
> > with encrypted data is the performance overhead of decrypting the data each
> > time a query runs.
> >
> > My proposal is to encrypt and decrypt data when performing disk I/O
> > operations to minimize performance overhead.
> > Therefore, the data in the shared memory layer is unencrypted so that
> > performance overhead can be minimized.
> >
> > With this design, data encryption/decryption can be implemented
> > by modifying the code of the storage and buffer manager
> > modules, which are responsible for performing disk I/O operations.
> >
> >
> > 2. Per-table encryption
> > ==================
> > Users can enable TDE per table as they want.
> > I introduce a new storage parameter "encryption_enabled" which enables TDE
> > at table-level.
> >
> > -- Generate the encryption table
> > CREATE TABLE foo WITH ( ENCRYPTION_ENABLED = ON );
> >
> > -- Change to the non-encryption table
> > ALTER TABLE foo SET ( ENCRYPTION_ENABLED = OFF );
> >
> > This approach minimizes the overhead for tables that do not require
> > encryption options.
> > For tables that enable TDE, the corresponding table key will be
> > generated with random values, and it is stored in a new system catalog
> > after being encrypted by the master key.
> >
> > BTW, I want to support CBC mode encryption[3]. However, I'm not sure how to
> > use the IV in CBC mode for this proposal.
> > I'd like to hear opinions from security engineers.
> >
> >
> > 3. 2-tier encryption key management
> > ==================
> > When it comes time to change cryptographic keys, there is a performance
> > overhead to decrypt and re-encrypt all data.
> >
> > To solve this problem we employ 2-tier encryption.
> > With 2-tier encryption, all table keys are stored in the database
> > cluster after being encrypted by the master key, and master keys must be
> > stored outside of PostgreSQL.
> >
> > Therefore, without the master key, it is impossible to decrypt the table
> > key. Thus, it is impossible to decrypt the database data.
> >
> > When changing the key, it's not necessary to re-encrypt all data.
> > We only decrypt the table keys with the old master key and re-encrypt them
> > with the new master key, which minimizes the performance overhead.
> >
> > For table keys, all TDE-enabled tables have different table keys.
> > And for master keys, all databases have different master keys. Table keys
> > are encrypted by the master key of their own database.
> > For WAL encryption, we have another cryptographic key. The WAL key is also
> > encrypted by a master key, but it is shared across the database cluster.
> >
> >
> > 4. Working with external key management services (KMS)
> > ==================
> > A key management service is an integrated approach
> > for generating, fetching and managing encryption keys for key control.
> > They may cover all aspects of security from the secure generation of
> > keys, secure storing of keys, and secure fetching of keys up to encryption
> > key handling.
> > Also, various types of KMSs are provided by many companies, and users can
> > choose among them.
> >
> > Therefore I would like to manage the master key using KMS.
> > Also, my proposal is to create callback APIs (generate_key, fetch_key,
> > store_key) in the form of a plug-in so that users can use many types of KMS
> > as they want.
> >
> > The KMIP protocol and most KMSs manage keys by string IDs. We can get keys
> > by key ID from the KMS.
> > So in my proposal, all master keys are distinguished by their ID, called
> > "master key ID".
> > The master key ID is made, for example, using the database oid and a
> > sequence number, like <OID>_<SeqNo>. And they are managed in PostgreSQL.
> >
> > When the database starts up, all master key IDs are loaded into shared
> > memory, and they are protected by an LWLock.
> >
> > When it comes time to rotate the master keys, run this query.
> >
> > ALTER SYSTEM ROTATION MASTER KEY;
> >
> > In this query, the master key is rotated with the following steps.
> > 1. Generate a new master key
> > 2. Change master key IDs and emit corresponding WAL
> > 3. Re-encrypt all table keys on its database
> >
> > Also, during a checkpoint, the master key IDs in shared memory are made
> > persistent.
> >
> >
> > 5. WAL encryption
> > ==================
> > If we encrypt all WAL records, performance overhead can be significant.
> > Therefore, this proposes a method to encrypt only the WAL record data,
> > excluding the WAL header, when writing WAL on the WAL buffer, instead of
> > encrypting a whole WAL record.
> > The WAL encryption key is generated separately when a TDE-enabled table
> > is created for the first time. We use 2-tier encryption for WAL encryption
> > as well.
> > So, when it comes time to rotate the WAL encryption key, run this query.
> >
> > ALTER SYSTEM ROTATION WAL KEY;
> >
> > Next, I will explain how to encrypt WAL.
> >
> > To do this operation, I add a flag to the WAL header which indicates
> > whether the subsequent WAL data is encrypted or not.
> >
> > Then, when we write WAL for an encrypted table, we write "encrypted" WAL on
> > the WAL buffer layer.
> >
> > In recovery, we read the WAL header and check the encryption flag, and
> > judge whether the WAL must be decrypted.
> > In the case of PITR, we use the WAL key ID in the backup file.
> >
> > With this approach, the performance overhead of writing and reading
> > the WAL for unencrypted tables would be almost the same as before.
> >
> >
> > ==================
> > I'd like to discuss the design before starting to make any code changes.
> > After more discussion I want to make a PoC.
> > Feedback and suggestions are very welcome.
> >
> > Finally, thank you to Masahiko Sawada for the initial design input.
> >
> > Thank you.
> >
> > [1] What does TDE mean?
> > https://en.wikipedia.org/wiki/Transparent_Data_Encryption
> >
> > [2] What does KMS mean?
> > https://en.wikipedia.org/wiki/Key_management#Key_Management_System
> >
> > [3] What does CBC-Mode mean?
> > https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation
> >
> > [4] Recently discussed mail
> > https://www.postgresql.org/message-id/CA%2BCSw_tb3bk5i7if6inZFc3yyf%2B9HEVNTy51QFBoeUk7UE_V%3Dw%40mail.gmail.com
> >
>
> As per discussion at PGCon unconference, I think that firstly we need to
> discuss what threats we want to defend database data against. If the user
> wants to defend against a threat in which a malicious user who has logged in
> to the OS or database steals important data in the database, this TDE design
> would not help, because such a user can steal the data by getting a memory
> dump or by SQL. That of course differs depending on system requirements or
> security compliance, but what threats do you want to defend database data
> against? And why?

Yes. I'm checking requirement 3.4 of PCI-DSS.
This requirement refers to encrypting stored data.
This idea does not protect data against memory dumps (including core dumps).
If encryption of the memory layer is required, I'll recheck this idea.
I will also do a little more research on enterprise requirements for data encryption.

> Also, if I understand correctly, at unconference session there also were two
> suggestions about the design other than the suggestion by Alexander:
> implementing TDE at column level using POLICY, and implementing
> TDE at table-space level. The former was suggested by Joe but I'm not sure
> the detail of that suggestion. I'd love to hear the detail of that
> suggestion. The latter was suggested by Tsunakawa-san.
> Have you considered that?

First, thank you to Joe and Tsunakawa-san.
I'm thinking of table-level encryption, but I'll try to find the best way through this discussion.

> You mentioned that encryption of temporary data for query processing and
> large objects are still under the consideration.
> But other than them you should consider the temporary data generated by other
> subsystems such as reorderbuffer and transition table as well.

Yes. Encryption of temporary data, large objects, and so on is considered essential.
However, I have not yet decided how to encrypt temporary data.
I'll make a PoC patch and find out how to encrypt temporary data.

Thank you and Best regards.
Moon.

>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
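
To make the 2-tier key management in the quoted proposal easier to discuss, here is a minimal Python sketch of the idea, not the actual design. It assumes a toy XOR keystream in place of a real cipher (it is NOT secure), an in-memory dict standing in for the external KMS, and hypothetical names (`toy_cipher`, `Database`, `rotate_master_key`) that do not exist in PostgreSQL. The only details taken from the proposal are the `<OID>_<SeqNo>` master key ID format and the rule that rotation re-encrypts table keys only.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher (NOT secure). XOR is symmetric,
    so the same call both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Database:
    """Hypothetical model of one database with its own master key."""

    def __init__(self, oid: int):
        self.oid = oid
        self.seq_no = 0
        self.kms = {}                 # stand-in for an external KMS: key ID -> key
        self.wrapped_table_keys = {}  # table name -> table key encrypted by master key
        self.master_key_id = self._new_master_key()

    def _new_master_key(self) -> str:
        # Master key ID in the proposal's example format: <OID>_<SeqNo>
        self.seq_no += 1
        key_id = f"{self.oid}_{self.seq_no}"
        self.kms[key_id] = os.urandom(32)
        return key_id

    def create_encrypted_table(self, name: str) -> None:
        # Each TDE-enabled table gets its own random table key,
        # stored only in wrapped (encrypted) form.
        table_key = os.urandom(32)
        master_key = self.kms[self.master_key_id]
        self.wrapped_table_keys[name] = toy_cipher(master_key, table_key)

    def table_key(self, name: str) -> bytes:
        master_key = self.kms[self.master_key_id]
        return toy_cipher(master_key, self.wrapped_table_keys[name])

    def rotate_master_key(self) -> None:
        # Unwrap each table key with the old master key and re-wrap it
        # with the new one. Table data itself is never touched.
        old_key = self.kms[self.master_key_id]
        new_id = self._new_master_key()
        new_key = self.kms[new_id]
        for name in list(self.wrapped_table_keys):
            table_key = toy_cipher(old_key, self.wrapped_table_keys[name])
            self.wrapped_table_keys[name] = toy_cipher(new_key, table_key)
        self.master_key_id = new_id
```

In this sketch a page encrypted with `toy_cipher(db.table_key("foo"), page)` still decrypts after `db.rotate_master_key()`, because only the wrapped table key changes. That is the property the proposal relies on to keep key rotation cheap.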