Hello,

As mentioned in several places [1], from a data analyst's point of view,
getting null values for encrypted columns when one has no key to decrypt
them is better than getting exceptions. It also eases data exploration
by allowing select * instead of listing each permitted column.

I have been digging through the crypto source code to find an easy way
to catch crypto exceptions and turn values into nulls from the
DecryptionPropertiesFactory that can be passed to the query engine
through Hadoop configs.

I might be missing something, but I haven't found a way to tell the
ParquetReader to put nulls and go on reading the unencrypted columns
when something goes wrong with the KMS.
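
To make the desired behavior concrete, here is a minimal sketch in plain
Java. It is not parquet-mr code and does not use any Parquet API: every
name in it (NullOnMissingKeySketch, KeyUnavailableException, KeyLookup,
readRow) is hypothetical, and the "decryption" is only simulated. It just
shows the semantics I am after: a column whose key cannot be fetched
comes back as null, while the other columns are read normally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class NullOnMissingKeySketch {

    /** Hypothetical stand-in for the error raised when a column key cannot be fetched. */
    static class KeyUnavailableException extends RuntimeException {
        KeyUnavailableException(String column) {
            super("no key available for column " + column);
        }
    }

    /** Hypothetical key lookup, standing in for whatever talks to the KMS. */
    interface KeyLookup {
        byte[] keyFor(String column);
    }

    /**
     * Returns a copy of the row in which every encrypted column whose key
     * cannot be obtained is replaced by null. Real decryption is out of
     * scope here, so a successful key lookup simply passes the value through.
     */
    static Map<String, Object> readRow(Map<String, Object> storedRow,
                                       Set<String> encryptedColumns,
                                       KeyLookup keys) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : storedRow.entrySet()) {
            String column = entry.getKey();
            if (!encryptedColumns.contains(column)) {
                // Plaintext column: always readable.
                result.put(column, entry.getValue());
                continue;
            }
            try {
                keys.keyFor(column);                  // throws if the key is unavailable
                result.put(column, entry.getValue()); // simulated decryption
            } catch (KeyUnavailableException e) {
                result.put(column, null);             // desired behavior: null, not a failure
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("ssn", "123-45-6789");

        // This lookup has no key for any column, as for a user without access.
        KeyLookup noKeys = column -> { throw new KeyUnavailableException(column); };

        System.out.println(readRow(row, Set.of("ssn"), noKeys));
    }
}
```

With something equivalent inside the reader, a select * would succeed for
a user without keys and simply show nulls in the protected columns.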

Is such behavior available, or would you be willing to add such a
feature at the Parquet level in the future?

Thanks


[1]
https://www.uber.com/en-FR/blog/one-stone-three-birds-finer-grained-encryption-apache-parquet/
