[
https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677533#comment-17677533
]
ASF GitHub Bot commented on PARQUET-2223:
-----------------------------------------
zhangjiashen commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1384639397
> I found the doc. Could you provide me with a "comment" access, so we'll
> discuss the goals and design there? Thanks.
@ggershinsky thanks for looking at this. I have added permission for you,
so feel free to add questions/comments!
> Parquet Data Masking for Column Encryption
> ------------------------------------------
>
> Key: PARQUET-2223
> URL: https://issues.apache.org/jira/browse/PARQUET-2223
> Project: Parquet
> Issue Type: Task
> Reporter: Jiashen Zhang
> Priority: Minor
>
> h1. Background
> h2. What is Data Masking?
> Data masking is the process of obfuscating sensitive data. Instead of
> revealing PII data, masking allows us to return NULLs, hashes or redacted
> data in its place. With data masking, users who are in the correct permission
> groups can retrieve the original data and users without permissions will
> receive masked data.
> h2. Why do we need it?
> * Fine-Grained Access Control
> h2. Why do we want to enhance data masking?
>
> Users might not have permission to access every column, and the existing code
> offers no way to skip the columns they cannot access. This enhancement adds
> that support, so users can choose to skip such columns and avoid decryption
> errors.
> h1. Design Requirements
> # Users can skip some columns with a configuration
> h1. Proposed solution
> The key idea is to modify the requested schema by removing the skipped columns
> from it.
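
As a rough illustration of the proposed solution, the sketch below prunes
skipped columns from a requested schema before any read would happen. It is
not the code from PR #1016: the schema is modelled as a flat list of column
names rather than a parquet-mr schema object, and the configuration key
"example.parquet.skip.columns" and the class and method names are
hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch: the schema is a flat list of column names (an assumption;
// a real Parquet schema is a typed, possibly nested structure).
final class SkipColumnsSketch {

    // Hypothetical configuration key; the issue does not name the real one.
    static final String SKIP_COLUMNS_KEY = "example.parquet.skip.columns";

    /** Parses a comma-separated list of column names, ignoring blank entries. */
    static Set<String> parseSkippedColumns(String configValue) {
        Set<String> skipped = new LinkedHashSet<>();
        if (configValue == null) {
            return skipped;
        }
        for (String name : configValue.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                skipped.add(trimmed);
            }
        }
        return skipped;
    }

    /** Returns the requested columns minus the skipped ones, preserving order. */
    static List<String> pruneRequestedSchema(List<String> requestedColumns,
                                             Set<String> skipped) {
        List<String> pruned = new ArrayList<>();
        for (String column : requestedColumns) {
            if (!skipped.contains(column)) {
                pruned.add(column);
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        // The value a user might set under SKIP_COLUMNS_KEY.
        Set<String> skipped = parseSkippedColumns("ssn, email");
        List<String> requested = Arrays.asList("id", "name", "ssn", "email");
        // prints [id, name]
        System.out.println(pruneRequestedSchema(requested, skipped));
    }
}
```

Because the skipped columns never appear in the schema handed to the reader,
the reader has no reason to touch their encrypted data, which is how this
approach would avoid the decryption error described above.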
--
This message was sent by Atlassian Jira
(v8.20.10#820010)