[
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867785#comment-16867785
]
Gidon Gershinsky edited comment on HIVE-21848 at 6/19/19 4:26 PM:
------------------------------------------------------------------
[[email protected]], a few comments:
* For either footer or columns, key metadata should not be passed as a
property. Instead, it should be derived from the properties (such as key names,
wrapping method, KMS type, etc.).
* On the other hand, a few substantial properties are missing from your list
(like key names, token, etc.).
* Actually, we have a draft that already defines the Parquet encryption
properties; please have a look at
[https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit?usp=sharing]
It has not been reviewed by the community yet, so it's a bit early to try to
unify ORC and Parquet properties. We might find in the end that the differences
outweigh the commonalities. In any case, I think this exercise of finding
common ground is helpful; it's just a bit early at this point.
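
As a rough illustration of the first point, here is a minimal Python sketch of deriving key metadata from other properties instead of accepting it as a property. Everything in it is an assumption made for illustration: the property names (encrypt_footer_key_name, encrypt_wrapping_method, encrypt_kms_type), the helper name derive_key_metadata, and the JSON-then-base64 encoding are hypothetical and are not taken from the Parquet draft or from ORC.

```python
import base64
import json


def derive_key_metadata(props, key_name_property):
    """Build a key-metadata string from table properties (illustrative only).

    The property names and the JSON/base64 layout are hypothetical; a real
    KMS integration would define its own metadata format.
    """
    fields = {
        "key_name": props[key_name_property],
        "wrapping_method": props["encrypt_wrapping_method"],
        "kms_type": props["encrypt_kms_type"],
    }
    # sort_keys makes the serialized form deterministic for a given input.
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


# Hypothetical table properties; none of these names come from the proposal.
example_props = {
    "encrypt_footer_key_name": "footer_master_key",
    "encrypt_wrapping_method": "single",
    "encrypt_kms_type": "demo_kms",
}
footer_key_metadata = derive_key_metadata(example_props, "encrypt_footer_key_name")
```

With this shape, a user only sets human-readable properties, and the opaque metadata is produced by the writer rather than supplied by hand.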
was (Author: gershinsky):
[[email protected]], a few comments:
* for either footer or columns, key metadata should not be passed as a
property. Instead, it should be derived from the properties (such as key names,
wrapping method, KMS type, etc).
* on the other hand, a few substantial properties are missing in your list
(like KMS client type, token, etc)
* actually, we have a draft that already defines the Parquet encryption
properties, please have a look at
> Table property name definition between ORC and Parquet encryption
> -----------------------------------------------------------------
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
> Issue Type: Task
> Components: Metastore
> Affects Versions: 3.0.0
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names
> that can be used for both Parquet and ORC column encryption. There is no code
> change needed for this Jira.
> *Background:*
> ORC-14 and PARQUET-1178 introduced column encryption to ORC and Parquet. To
> configure the encryption (e.g. which columns are sensitive, which master key
> to use, which algorithm, etc.), table properties can be used. It is important
> that both Parquet and ORC can use unified names.
> According to the slides at
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
> ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The
> Parquet community is still discussing several ways to configure encryption;
> table properties are one of the options, but there is no detailed design of
> the table property names yet.
> So it is a good time for the two communities to discuss and agree on unified
> table property names as a superset.
> *Proposal:*
> There are several encryption properties that need to be specified for a
> table. Here is the list. This is the superset of Parquet and ORC. Some of
> them might not apply to both.
> # PII columns, including nested columns
> # Column key metadata, master key metadata
> # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR.
> ORC might support AES_CTR.
> # Footer encryption: Parquet allows the footer to be encrypted or plaintext
> # Footer key metadata
> Here is the table properties proposal.
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports both plaintext and
> encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to
> the KMS to define what key metadata is. The metadata should contain enough
> information for the KMS to identify the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column
> name, for example ‘address.zipcode’.
>
> It is up to the KMS to define what key metadata is. The metadata should
> contain enough information for the KMS to identify the corresponding key.|
>
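
To make the proposed table concrete, the following Python sketch shows one way a reader or writer might interpret these properties. It is only a sketch under stated assumptions: the function name parse_encryption_properties and the returned dictionary layout are hypothetical, and the key metadata is treated as opaque bytes because the proposal leaves its content to the KMS. The property names, the allowed algorithm values, and the encrypted-footer default are taken from the table above.

```python
import base64

ALGORITHMS = {"aes_ctr", "aes_gcm"}  # values listed in the proposal
COLUMN_PREFIX = "encrypt_col_"       # 'encrypt_col_xxx', xxx = column name


def parse_encryption_properties(props):
    """Turn proposed table properties into a plain dict (hypothetical layout)."""
    algorithm = props.get("encrypt_algorithm")
    if algorithm is not None and algorithm not in ALGORITHMS:
        raise ValueError("unsupported encrypt_algorithm: %r" % algorithm)

    # The proposal says the footer is encrypted by default, i.e. plaintext=false.
    plaintext_flag = props.get("encrypt_footer_plaintext", "false")
    footer_plaintext = plaintext_flag.strip().lower() == "true"

    footer_meta = props.get("encrypt_footer_key_metadata")
    if footer_meta is not None:
        footer_meta = base64.b64decode(footer_meta, validate=True)

    # Column names may be nested paths such as 'address.zipcode'.
    columns = {}
    for name, value in props.items():
        if name.startswith(COLUMN_PREFIX):
            column = name[len(COLUMN_PREFIX):]
            columns[column] = base64.b64decode(value, validate=True)

    return {
        "algorithm": algorithm,
        "footer_plaintext": footer_plaintext,
        "footer_key_metadata": footer_meta,
        "columns": columns,
    }


# Example input; the metadata bytes are placeholders, not a real KMS format.
example = {
    "encrypt_algorithm": "aes_gcm",
    "encrypt_col_address.zipcode": base64.b64encode(b"zip-key").decode("ascii"),
}
config = parse_encryption_properties(example)
```

Because the column name is embedded in the property name, a prefix scan is needed to find encrypted columns; a format that does not support a given property (for example footer encryption) could simply ignore it.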
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)