Hi Iceberg Community,

Here are the minutes and recording from our Iceberg Sync that took
place on *December 21*.

As always, anyone can join the discussion, so feel free to share the
Iceberg-Sync <https://groups.google.com/g/iceberg-sync> Google group with
anyone seeking an invite.
The notes and the agenda are posted in the Iceberg Sync doc
<https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg/edit?usp=drive_web>
that's also attached to the meeting invitation. It's an excellent place to
add items as you see fit so we can discuss them in the following community
sync.

Meeting Recording
<https://drive.google.com/file/d/1MICV3d8Xa1c17WI6qU_Nni1pS_fJQBAG/view?usp=sharing>
⭕

Meeting Transcript
<https://docs.google.com/document/d/1FmL9qfQ_MZPwANfn-Hte0RO8Y9J2ADPDuTTvnU2Gc_0/edit?usp=sharing>

   - *Highlights*
      - Added basic commit metrics (Thanks, Eduard!)
      - Added time range query for Spark changelog tables (Thanks, Yufei!)
      - Added Azure FileIO in Python (Thanks, Eric!)
      - Added support for FileIO to ORC (Thanks, Pavan!)
      - Python 0.2.0 release is out! (Thanks, Fokko and Jun!)
      - Added view interfaces (Thanks, John!)
      - Added remaining transforms as Spark UDFs (Thanks, Anton!)
   - *Releases*
      - Python 0.2.1
         - Partition date comparison bug
         - PyArrow hard dependency
         - Fixing the comparison bug right now, maybe end of week
      - Python 0.3.0
         - ID-based column projection (support renames, etc.)
         - Probably January
      - 1.2.0 – January/February
         - Szehon’s delete metadata table for delete file compaction
         - Branch commits for operations other than append and delete
         - Vectorized Arrow read path fix for dictionary-encoded values
   - *Discussion*
      - The discussion on table location ownership seems to have stalled
        without reaching consensus:
        https://github.com/apache/iceberg/issues/4159
        Is anyone actively working on it? If there is interest in working
        on it, who could give guidance?
         - Not directly related, but some parts of the relative paths design
           here
           <https://docs.google.com/document/u/0/d/1RDEjJAVEXg1csRzyzTuM634L88vvI0iDHNQQK3kOVR0/edit>
           introduced the notion of locations owned by the table. I will
           work on this early next year; happy to collaborate with you.
      - Multi-table transactions
         - Isolation levels - snapshot and serializable
         - Need 3 levels
      - SPJ results


Thanks everyone!
