Attendees: Gidon, Gabor, Fokko, Xu, Sri, Xinli


1. Column Encryption
   1. PR 800 <https://github.com/apache/parquet-mr/pull/800> - This PR
      merges the work into master and is under review.
      1. One comment is about CRC. Since the encryption algorithm AES-GCM
         already has an integrity check, doing CRC is redundant.
      2. The behavior “CRC is enabled by default in the write path” will
         not be changed even when AES-GCM is used. This is because the CRC
         calculation overhead is very small according to our earlier
         tests, and changing the behavior may break something.
   2. The PR <https://github.com/apache/parquet-mr/pull/801> for
      PARQUET-1396 will be moved to the master branch after PR 800
      <https://github.com/apache/parquet-mr/pull/800> is done.
2. Parquet 1.11.1 release
   1. Should an additional fix (PARQUET-1684
      <https://issues.apache.org/jira/browse/PARQUET-1684>) be added?
      The conclusion after discussion is no. This is not a regression in
      Parquet 1.11, and the change itself is not low risk.
   2. The rollout in Spark is still blocked, but downgrading the Avro
      version in Parquet is not an option.
3. Parquet 1.12 release
   1. After encryption is done, Gabor will create a Jira to start the
      process.
4. Proposal for CompressionCodec Provider-aware Compression Codec (doc
   <https://docs.google.com/document/d/1ueSYq2FIzaom23cpHXppig93ylOxe8CU6EwS82dov2E/edit#heading=h.5b2qz2ba32wm>)
   1. PR-803 <https://github.com/apache/parquet-mr/pull/803> needs to be
      reviewed.
5. Data masking
   1. After column encryption is done in master, we (Xinli, Gidon, and
      Sri) will start the conversation.


Please let me know if you have any questions.

---

Xinli | Uber Data Infra Team
