[GitHub] [parquet-format] ggershinsky commented on pull request #110: PARQUET-1232: Encryption docs

GitHub Sun, 07 Oct 2018 23:46:06 -0700

Two reasons:
1. All column-specific metadata is specified in RowGroups today. Settings such 
as the compression algorithm (which, I believe, could in theory be 
column-specific, but is usually file-wide) are set repeatedly, in every column 
of every row group.
2. No less important (and probably more so): in the future, we'll likely add 
support for using different encryption keys for different row groups. This 
scenario comes up from time to time in discussions. One example is time series 
data, where a user has access to a certain time span but not to all the data 
(e.g. to enable revoking user access at some point in time). Another is using 
row groups for other kinds of data grouping in large files, e.g. by geography; 
different keys would allow for corresponding access control.
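
To make the two points concrete, here is a rough Python sketch. It is not the 
actual Thrift definitions from parquet-format: the class names, the `key_id` 
field, the row group labels and the key identifiers are all hypothetical, 
chosen only to illustrate per-column metadata repeated in each row group and 
how per-row-group keys could map to access control.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ColumnChunkMeta:
    """Per-column metadata, repeated in every row group (hypothetical layout)."""
    codec: str   # compression codec; usually the same across the whole file
    key_id: str  # hypothetical identifier of the key that encrypts this column


@dataclass
class RowGroupMeta:
    """One row group; the label is only for illustration (e.g. a time span)."""
    label: str
    columns: Dict[str, ColumnChunkMeta] = field(default_factory=dict)


def readable_row_groups(row_groups: List[RowGroupMeta], held_keys: Set[str]) -> List[str]:
    """Return labels of row groups whose columns all use keys the user holds."""
    return [
        rg.label
        for rg in row_groups
        if all(col.key_id in held_keys for col in rg.columns.values())
    ]


# Time series example: each quarter is a row group with its own key.
row_groups = [
    RowGroupMeta("2018-Q1", {
        "ts": ColumnChunkMeta("SNAPPY", "key-q1"),
        "value": ColumnChunkMeta("SNAPPY", "key-q1"),
    }),
    RowGroupMeta("2018-Q2", {
        "ts": ColumnChunkMeta("SNAPPY", "key-q2"),
        "value": ColumnChunkMeta("SNAPPY", "key-q2"),
    }),
    RowGroupMeta("2018-Q3", {
        "ts": ColumnChunkMeta("SNAPPY", "key-q3"),
        "value": ColumnChunkMeta("SNAPPY", "key-q3"),
    }),
]

# A user whose access was revoked after Q2 holds only the first two keys.
print(readable_row_groups(row_groups, {"key-q1", "key-q2"}))
```

Since the codec is already repeated per column per row group, carrying the 
encryption metadata the same way adds no new pattern, and it leaves room for 
the keys to differ between row groups later.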


[ Full content available at: https://github.com/apache/parquet-format/pull/110 ]
This message was relayed via gitbox.apache.org for [email protected]
