Hi all,

Now that the encryption mechanism is mostly complete, we are starting a
long-term project on a new security feature built on top of encryption. Called
"data obfuscation", it combines masking and anonymization of sensitive
data.
https://issues.apache.org/jira/browse/PARQUET-1376

On the one hand, basic masking can easily be implemented on top of
Parquet by simply adding columns with masked (hashed, redacted, etc.)
versions of the original column data. On the other hand, if done
improperly, data masking can leak the sensitive information. For these
two reasons, we have decided not to rush it, and this feature is not planned
for the upcoming Parquet versions. Following an initial discussion, we have
produced a write-up on the goals, challenges and possible approaches.
Before drafting the design, we are starting with a call to the community to
provide feedback on this write-up (e.g. via comments inside the doc). Any
real-life examples, use cases and requirements are very welcome.
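
To make the "extra masked columns" idea and the leakage risk concrete, here is a rough sketch in plain Python. It is not a proposed design and uses no Parquet API: the table is just a dict of column name to list of strings, and all function names (redact, plain_hash, keyed_hash, add_masked_columns, guess_by_dictionary) are made up for illustration. It shows how an unkeyed hash of a low-entropy column can be reversed by hashing candidate values, while a keyed hash (HMAC) resists that as long as the key stays secret.

```python
import hashlib
import hmac


def redact(value: str, keep_last: int = 4) -> str:
    """Mask everything except the last `keep_last` characters."""
    if keep_last <= 0 or len(value) <= keep_last:
        return "*" * len(value)
    return "*" * (len(value) - keep_last) + value[-keep_last:]


def plain_hash(value: str) -> str:
    """Unkeyed SHA-256: deterministic, so guessable for low-entropy data."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256: cannot be recomputed without the secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def add_masked_columns(table: dict, column: str, key: bytes) -> dict:
    """Return a copy of `table` with redacted and keyed-hash columns added.

    `table` is a hypothetical in-memory stand-in for a columnar file:
    a dict mapping column names to lists of string values.
    """
    values = table[column]
    out = dict(table)
    out[column + "_redacted"] = [redact(v) for v in values]
    out[column + "_hashed"] = [keyed_hash(v, key) for v in values]
    return out


def guess_by_dictionary(hashed: list, candidates: list) -> list:
    """Try to reverse unkeyed hashes by hashing every candidate value."""
    lookup = {plain_hash(c): c for c in candidates}
    return [lookup.get(h) for h in hashed]
```

For example, guess_by_dictionary([plain_hash("123-45-6789")], all_possible_ids) recovers the original value whenever it is among the candidates, whereas the same attack on the keyed_hash column returns None for every row. Note that even the keyed version still reveals which rows hold equal values, which is one of the subtleties the write-up is meant to discuss.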

https://docs.google.com/document/d/1LMs74uhqvMNJacBySPnWq6tM8qIpgcIZz444c7vfibM/edit?usp=sharing


Cheers,
Gidon, Xinli, Shri
