[ https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17676963#comment-17676963 ]
ASF GitHub Bot commented on PARQUET-2223:
-----------------------------------------

shangxinli commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383006808

   @ggershinsky Do you have time to have a look?


> Parquet Data Masking for Column Encryption
> ------------------------------------------
>
>                 Key: PARQUET-2223
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2223
>             Project: Parquet
>          Issue Type: Task
>            Reporter: Jiashen Zhang
>            Priority: Minor
>
> h1. Background
> h2. What is Data Masking?
> Data masking is the process of obfuscating sensitive data. Instead of
> revealing PII data, masking allows us to return NULLs, hashes or redacted
> data in its place. With data masking, users who are in the correct permission
> groups can retrieve the original data, and users without permissions will
> receive masked data.
> h2. Why do we need it?
> * Fine-Grained Access Control
> h2. Why do we want to enhance data masking?
>
> Users might not have permissions for all columns, and the existing code does
> not let us skip columns that they are not permitted to access. This
> enhancement adds that support, so that users can choose to skip some columns
> and avoid decryption errors.
> h1. Design Requirements
> # Users can skip some columns with a configuration
> h1. Proposed solution
> The key idea is to modify the request schema by removing the skipped columns
> from it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)