[
https://issues.apache.org/jira/browse/PARQUET-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Or Ozeri updated PARQUET-1702:
------------------------------
Description:
The newly added Parquet encryption feature currently works only with
SerializedRowGroupWriter.
Several issues prevent the use of BufferedRowGroupWriter with encryption
enabled:
1. The meta encryptor is not passed to ColumnChunkMetaDataBuilder::Finish. This
can trigger a null-pointer dereference (reported as a segmentation fault).
2. UpdateEncryption is not called on Close, resulting in an incorrect AAD string
when encrypting the column chunk metadata.
3. The column ordinal passed to PageWriter::Open is always zero, resulting
in a wrong AAD string when encrypting the column data (for every column except
the first).
4. When decrypting a column chunk with no dictionary pages, PARQUET-1706
causes the decryptor to assume it is decrypting a dictionary page, which again
causes a wrong AAD string to be used when decrypting.
We propose a patch (a few dozen lines) to fix the above issues.
We also extend the current parquet-encryption-test unit test, which tests
SerializedRowGroupWriter, to also test BufferedRowGroupWriter.
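To illustrate why issues 3 and 4 produce a wrong AAD string, here is a minimal
sketch of how a module AAD is assembled, assuming the layout described in the
Parquet modular encryption specification (file AAD, one module-type byte, then
little-endian 16-bit row group, column and page ordinals). BuildModuleAad and
AppendOrdinal are hypothetical names used only for illustration; this is not
the parquet-cpp code touched by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Module type codes, assumed to match the modular encryption specification.
enum class ModuleType : int8_t {
  kFooter = 0,
  kColumnMetaData = 1,
  kDataPage = 2,
  kDictionaryPage = 3,
  kDataPageHeader = 4,
  kDictionaryPageHeader = 5,
  kColumnIndex = 6,
  kOffsetIndex = 7
};

// Appends a non-negative 16-bit ordinal in little-endian byte order.
void AppendOrdinal(std::string* out, int16_t ordinal) {
  assert(ordinal >= 0);
  out->push_back(static_cast<char>(ordinal & 0xFF));
  out->push_back(static_cast<char>((ordinal >> 8) & 0xFF));
}

// Hypothetical helper: file AAD || module type || row group ordinal ||
// column ordinal, plus the page ordinal for data pages and their headers.
std::string BuildModuleAad(const std::string& file_aad, ModuleType type,
                           int16_t row_group, int16_t column, int16_t page) {
  std::string aad = file_aad;
  aad.push_back(static_cast<char>(type));
  if (type == ModuleType::kFooter) return aad;  // footer carries no ordinals
  AppendOrdinal(&aad, row_group);
  AppendOrdinal(&aad, column);
  if (type == ModuleType::kDataPage || type == ModuleType::kDataPageHeader) {
    AppendOrdinal(&aad, page);
  }
  return aad;
}
```

Under these assumptions, a writer that always passes column ordinal 0 produces
the AAD a reader expects only for the first column, and a reader that assumes
kDictionaryPage where the writer used kDataPage computes a different AAD, so
authenticated decryption fails in both cases.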
was:
When working with buffered row group writer, the column ordinal, required by
encryption, is not updated.
[The column ordinal does get updated on the more common flow of non-buffered
row group writer (using NextColumn() interface instead of column(i)).]
Summary: [C++] Make BufferedRowGroupWriter compatible with parquet
encryption (was: [C++] Missing column ordinal update when using buffered row
group writer)
> [C++] Make BufferedRowGroupWriter compatible with parquet encryption
> --------------------------------------------------------------------
>
> Key: PARQUET-1702
> URL: https://issues.apache.org/jira/browse/PARQUET-1702
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Affects Versions: cpp-1.6.0
> Reporter: Or Ozeri
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>