[ 
https://issues.apache.org/jira/browse/PARQUET-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Or Ozeri updated PARQUET-1702:
------------------------------
    Description: 
The newly added parquet encryption feature currently works only with 
SerializedRowGroupWriter.
There are several issues preventing the use of BufferedRowGroupWriter with 
encryption enabled:

1. The meta encryptor is not passed on to ColumnChunkMetaDataBuilder::Finish. 
This can trigger a null-pointer dereference (reported as a segmentation fault).
2. UpdateEncryption is not called on Close, resulting in an incorrect AAD 
string when encrypting the column chunk metadata.
3. The column ordinal passed on to PageWriter::Open is always zero, resulting 
in a wrong AAD string when encrypting the column data (for every column except 
the first).
4. When decrypting a column chunk with no dictionary pages, PARQUET-1706 
leads the decryptor to assume it is decrypting a dictionary page, so a wrong 
AAD string is again used for decryption.

We propose a patch (a few dozen lines) to fix the above issues.
We also extend the current parquet-encryption-test unit test, which tests 
SerializedRowGroupWriter, to also cover BufferedRowGroupWriter.
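
For readers unfamiliar with why a wrong ordinal or page type breaks 
decryption, here is a minimal sketch of how a per-page AAD string can be 
assembled. It is not the parquet-cpp code: BuildPageAad, AppendOrdinal and 
ModuleType are hypothetical names, and the byte layout (file AAD, one module 
type byte, then row group, column and page ordinals as 2-byte little-endian 
values) is an assumption based on the Parquet modular encryption 
specification.

```cpp
#include <cstdint>
#include <string>

// Module type codes, assumed to match the Parquet modular encryption spec.
enum class ModuleType : std::uint8_t {
  kFooter = 0,
  kColumnMetaData = 1,
  kDataPage = 2,
  kDictionaryPage = 3,
  kDataPageHeader = 4,
  kDictionaryPageHeader = 5,
  kColumnIndex = 6,
  kOffsetIndex = 7
};

// Hypothetical helper: append a 16-bit ordinal in little-endian byte order.
inline void AppendOrdinal(std::string* out, std::uint16_t ordinal) {
  out->push_back(static_cast<char>(ordinal & 0xFF));
  out->push_back(static_cast<char>((ordinal >> 8) & 0xFF));
}

// Hypothetical helper: build the AAD for a page-level module.
// Every input is part of the result, so a writer that always passes
// column == 0, or a reader that passes the wrong module type, ends up
// with an AAD that differs from the one used on the other side.
inline std::string BuildPageAad(const std::string& file_aad, ModuleType type,
                                std::uint16_t row_group, std::uint16_t column,
                                std::uint16_t page) {
  std::string aad = file_aad;
  aad.push_back(static_cast<char>(type));
  AppendOrdinal(&aad, row_group);
  AppendOrdinal(&aad, column);
  AppendOrdinal(&aad, page);
  return aad;
}
```

Under these assumptions, issue 3 corresponds to calling BuildPageAad with 
column fixed at 0 on the write path, and issue 4 to passing kDictionaryPage 
where kDataPage was used when writing. In both cases the AAD on the read side 
no longer matches, and authenticated decryption fails.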

  was:
When working with buffered row group writer, the column ordinal, required by 
encryption, is not updated.

[The column ordinal does get updated on the more common flow of non-buffered 
row group writer (using NextColumn() interface instead of column(i)).]

        Summary: [C++] Make BufferedRowGroupWriter compatible with parquet 
encryption  (was: [C++] Missing column ordinal update when using buffered row 
group writer)

> [C++] Make BufferedRowGroupWriter compatible with parquet encryption
> --------------------------------------------------------------------
>
>                 Key: PARQUET-1702
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1702
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>    Affects Versions: cpp-1.6.0
>            Reporter: Or Ozeri
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
