[ https://issues.apache.org/jira/browse/PARQUET-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gidon Gershinsky updated PARQUET-1376:
--------------------------------------
Description:
Data obfuscation in sensitive columns - for users without access to column
encryption keys.
# Implement on top of [basic Parquet
encryption|https://github.com/apache/parquet-format/blob/encryption/Encryption.md]
# Built-in support for multiple masking mechanisms, with different trade-offs
between data utility, leakage, and size/throughput overhead
# Provide an interface for plugging in custom masking mechanisms
# Enable storing multiple masked versions of the same column in a file
# Provide readers with an explicit list of a column’s masked versions in a file
# Enable readers to select a masked version of a column
# Stretch: Implement tools for analysis of file data privacy properties and
information leakage
# Stretch: Leverage privacy analysis tools for tuning file data anonymity
# Optional: Support aggregated obfuscation
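The following is a minimal sketch of how the goals above could fit together, not the proposed design or any existing Parquet API. Every name in it (`Masker`, `mask_null`, `mask_sha256`, `mask_keep_last4`, `MaskedColumn`) is hypothetical and chosen only for illustration. It shows three masking mechanisms with different utility/leakage trade-offs, a plug-in point for custom ones, several masked versions stored for one column, an explicit listing of those versions, and reader-side selection.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical plug-in interface: a masker maps one cell value to its masked form.
Masker = Callable[[Optional[str]], Optional[str]]


def mask_null(value: Optional[str]) -> Optional[str]:
    """Lowest leakage, lowest utility: every cell becomes null."""
    return None


def mask_sha256(value: Optional[str]) -> Optional[str]:
    """Preserves equality (joins, group-by) while hiding the plaintext.

    An unsalted hash of low-entropy data is open to dictionary attacks,
    so this is an illustration of the trade-off, not a recommendation.
    """
    if value is None:
        return None
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_keep_last4(value: Optional[str]) -> Optional[str]:
    """Higher utility, higher leakage: only the last four characters survive."""
    if value is None:
        return None
    return "*" * max(len(value) - 4, 0) + value[-4:]


class MaskedColumn:
    """Assumed in-memory stand-in for one column and its masked versions.

    In a real file the original values would be encrypted; here they are
    kept in memory only so the masked versions can be derived.
    """

    def __init__(self, name: str, values: Iterable[Optional[str]]) -> None:
        self.name = name
        self._values: List[Optional[str]] = list(values)
        self._versions: Dict[str, List[Optional[str]]] = {}

    def add_version(self, mechanism: str, masker: Masker) -> None:
        """Store one more masked version; any callable can be plugged in."""
        self._versions[mechanism] = [masker(v) for v in self._values]

    def list_versions(self) -> List[str]:
        """Explicit list of the masked versions available for this column."""
        return sorted(self._versions)

    def read(self, mechanism: str) -> List[Optional[str]]:
        """Return the masked version the reader selected."""
        return list(self._versions[mechanism])


column = MaskedColumn("card_number", ["4111111111111111", None])
column.add_version("null", mask_null)
column.add_version("sha256", mask_sha256)
column.add_version("keep_last4", mask_keep_last4)
print(column.list_versions())
print(column.read("keep_last4"))
```

A reader without the column key would first call `list_versions()` and then pick the version whose leakage and utility fit its use case. Storing more versions increases file size, which is the size/throughput overhead mentioned in the list above.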
was:
Anonymity layer for hidden columns
# Different data masking options
** per-cell
** aggregated (average, etc)
# Reader notification on data access status
# Providing readers with a choice of masking options (if available)
> Data obfuscation layer for encryption
> -------------------------------------
>
> Key: PARQUET-1376
> URL: https://issues.apache.org/jira/browse/PARQUET-1376
> Project: Parquet
> Issue Type: New Feature
> Reporter: Gidon Gershinsky
> Assignee: Gidon Gershinsky
> Priority: Major