Hello,

Apparently, modular encryption does not yet support **array** types.

```scala
val hadoopConf = spark.sparkContext.hadoopConfiguration

// Properties-driven crypto factory with the in-memory mock KMS
hadoopConf.set("parquet.crypto.factory.class", "org.apache.parquet.crypto.keytools.PropertiesDrivenCryptoFactory")
hadoopConf.set("parquet.encryption.kms.client.class", "org.apache.parquet.crypto.keytools.mocks.InMemoryKMS")
hadoopConf.set("parquet.encryption.key.list", "k1:AAECAwQFBgcICQoLDA0ODw==, k2:AAECAAECAAECAAECAAECAA==")
hadoopConf.set("parquet.encryption.plaintext.footer", "true")
hadoopConf.set("parquet.encryption.footer.key", "k1")
// Encrypt the array column "rider" with key k2
hadoopConf.set("parquet.encryption.column.keys", "k2:rider")

val df = spark.sql("select 1 as foo, array(named_struct('foo',2, 'bar',3)) as rider, 3 as ts, uuid() as uuid")
df.write.format("parquet").mode("overwrite").save("/tmp/enc")

// The write fails with:
// Caused by: org.apache.parquet.crypto.ParquetCryptoRuntimeException: Encrypted column [rider] not in file schema

```

Also, the dotted column path does not seem to support encrypting fields inside 
nested structures mixed with arrays. For example, there is no way I am aware of 
to target "all foo in rider".

```
root
 |-- foo: integer (nullable = false)
 |-- rider: array (nullable = false)
 |    |-- element: struct (containsNull = false)
 |    |    |-- foo: integer (nullable = false)
 |    |    |-- bar: integer (nullable = false)
 |-- ts: integer (nullable = false)
 |-- uuid: string (nullable = false)
```
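
One untested idea, sketched here only as a configuration fragment: assuming Spark 
writes the array with the standard three-level Parquet list layout (a repeated 
group named `list` wrapping a field named `element`), the leaf columns in the 
Parquet file schema would be `rider.list.element.foo` and 
`rider.list.element.bar` rather than `rider`. The paths below are that 
assumption, not something confirmed against the written file.

```scala
// Assumption: the Parquet file schema exposes the array leaves as
// rider.list.element.<field> (standard three-level list layout).
// Each leaf column is listed explicitly under the key k2.
spark.sparkContext.hadoopConfiguration.set(
  "parquet.encryption.column.keys",
  "k2:rider.list.element.foo,rider.list.element.bar")
```

If that assumption holds, "all foo in rider" would correspond to the single leaf 
path `rider.list.element.foo`.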

So far, these two issues make arrays of confidential information impossible to 
encrypt, or am I missing something?

Thanks, 
