ggershinsky commented on PR #1016: URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383552737
As far as I understand, _data masking_ replaces the content of sensitive columns; it does not remove the columns themselves (schema and content). Removing them is the job of _column pruning_, which happens when a file is rewritten. Neither of these is related to _column encryption_. So I'm not sure what the goal of the mechanism in this PR is. Maybe we can start with a Google Doc that describes the problem, the goals, and the solution design?