11/24/2020
Hi all,
Attendees:
1.
To address the Avro version issue when upgrading Parquet, should we ship
Parquet Avro as a separate release?
1.
For upgrading Avro from 1.8 to 1.9, Parquet only needs changes in unit
tests and parquet-cli, and users can exclude Avro from Parquet.
2.
Separating them would still be beneficial in the long term, but it is
not easy, and for now it is not required.
2.
Column Encryption
1.
The C++ version has had several PRs (improvements) recently.
3.
Data masking
1.
Upper layers can easily develop their own data masking.
2.
We might consider providing some simple tools rather than performing the
masking inside Parquet.
3.
Null data masking has been developed in Parquet and works now. The next
step is to open a Google doc and continue the discussion there.
4.
Adoption of Parquet 1.11.x in Presto
1.
A PR <https://github.com/prestodb/presto/pull/14960> has been created, but
it has a unit test failure.
5.
Adoption of Parquet 1.11.x features in Iceberg
1.
Iceberg meeting notes discussing this issue:
<https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg/edit#>
2.
Issue summary and proposals
<https://docs.google.com/document/d/1f8erGSnhVcdD0UokGx2opjmGvCU69g7fsiPXCJhP3MA/edit#>
3.
If we add a Parquet V2 API to support Iceberg, it makes sense to include a
vectorized API with it. Let’s bring in other PMC members/committers to
discuss this at the next community meeting.
6.
Parquet 1.12.0
a. An RC release will be cut soon.
Please let me know if you have any questions.
Xinli Shang | Tech Lead Manager @ Uber Data Infra