Thank you for looking into it. This sounds very similar to Arrow's usage of field_id (passing it through but making no business decisions). I appreciate the advice and I have reached out to the Iceberg team.
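
For concreteness, here is a minimal sketch of what "passing it through" could look like for a few of the transformations from my original message. It is plain Python, not Arrow or Parquet code: the `Field` class and the three helper functions are hypothetical names made up for illustration. Whether a cast should keep the field_id is still an open question, so the sketch leaves that decision to the caller rather than picking an answer.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical schema field; field_id=None means "no field_id"."""
    name: str
    dtype: str
    field_id: Optional[int] = None


def select_columns(schema: List[Field], names: List[str]) -> List[Field]:
    # Filtering by column: surviving fields are carried over unchanged,
    # so their field_id is preserved.
    wanted = set(names)
    return [f for f in schema if f.name in wanted]


def cast_field(field: Field, new_dtype: str, keep_field_id: bool) -> Field:
    # Casting: whether the field_id should survive is undecided,
    # so the caller states the policy explicitly.
    new_id = field.field_id if keep_field_id else None
    return replace(field, dtype=new_dtype, field_id=new_id)


def combine_fields(left: Field, right: Field, name: str, dtype: str) -> Field:
    # Combining two fields into a third: the derived field gets no field_id.
    return Field(name=name, dtype=dtype)


schema = [Field("a", "int32", 1), Field("b", "int32", 2), Field("c", "string", 3)]
selected = select_columns(schema, ["a", "c"])
combined = combine_fields(schema[0], schema[1], "a_plus_b", "int64")
```

In this sketch the field_id is only ever copied or dropped; nothing looks fields up by it, which is the "no business decisions" behaviour described above.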

On Tue, May 18, 2021 at 12:42 AM Gabor Szadovszky <[email protected]> wrote:
>
> Hi Weston,
>
> I've quickly searched our source code and the format about field_id because
> I have no experience with it. It seems that we write/read field_id between
> the file footer and our internal schema structure. You are also able to set
> field id when you either use our schema builder or the schema parser. We
> also support field_id during the thrift and parquet schema conversion.
> Meanwhile, it seems we do not make any business decision based on field_id.
> For example neither schema merge nor filtering support field ids.
>
> So, I would not say Parquet is the best example (for now) to help you with
> field ids. Maybe, Apache Iceberg <https://iceberg.apache.org/spec/> would
> be a better one since their specification explains it (unlike Parquet).
>
> Regards,
> Gabor
>
> On Tue, May 18, 2021 at 6:20 AM Weston Pace <[email protected]> wrote:
>
> > Hi dev,
> >
> > I'm Weston, I've been working on the Arrow project lately. As the
> > Arrow project implements more transformations of data I've been
> > wondering how we should treat the field_id property. For some
> > concrete examples:
> >
> > * Filtering a table by column (it seems the field_id should remain
> > unchanged)
> > * Filtering a table by rows (it seems the field_id should remain
> > unchanged)
> > * Filling in null values with a placeholder value (the data is changed so
> > ???)
> > * Casting a field to a different data type (the meaning of the data
> > has changed so ???)
> > * Combining two fields into a third field (it seems the third field
> > should have no field_id)
> >
> > I'm reaching out to the Parquet community to solicit input as you have
> > expertise/experience around the motivation behind the field_id
> > property and its uses.
> >
> > Thanks,
> >
> > -Weston Pace
> >
