Just to follow up: the PR ended up getting merged. In theory there perhaps should have been a vote, but unless someone feels strongly or would like to fine-tune the language further, can we consider the topic resolved?
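
To make the write-default concern discussed below concrete, here is a
hypothetical sketch (made-up names, not the Iceberg API): if a column's
write-default changes between schema versions, a stale writer and an
up-to-date writer will fill the same omitted column differently.

    // Hypothetical sketch -- each writer fills an omitted column from
    // the write-default in the schema version it has cached.
    class WriteDefaultExample {
        record ColumnSchema(String name, long writeDefault) {}

        static long fillOmitted(ColumnSchema schema) {
            return schema.writeDefault();
        }

        public static void main(String[] args) {
            ColumnSchema v1 = new ColumnSchema("measurement", 0L);  // old default
            ColumnSchema v2 = new ColumnSchema("measurement", -1L); // new default
            // Rows committed by the stale writer hold 0, rows from the
            // up-to-date writer hold -1, in the same table column.
            System.out.println("stale writer fills:   " + fillOmitted(v1));
            System.out.println("current writer fills: " + fillOmitted(v2));
        }
    }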
On Wed, Aug 27, 2025 at 2:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> I opened https://github.com/apache/iceberg/pull/13936 as a draft proposal
> to capture the conversation.
>
> BTW, I think one area this brings up that the specification does not
> handle is changing between nullable and not-nullable fields. Outdated
> schemas have some implications in these cases as well.
>
> Cheers,
> Micah
>
> On Tue, Aug 26, 2025 at 10:13 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
>> I think the original question is ambiguous. We should probably split it
>> into two questions:
>>
>> 1. Is it OK to write out an "int" instead of a "long" if the writer's
>> schema says the value is a long?
>>
>> I think the answer here is that we recommend not doing so, even though
>> it would likely work.
>>
>> 2. Is it OK to use an older schema for writing?
>>
>> The consensus on the thread seems to be yes. I'll note that this can
>> cause confusing results when the "write-default" [1] value for a column
>> changes. We should probably have an implementation note to clarify that:
>> a. Using a stale schema is allowed.
>> b. It might cause inconsistent results in the face of multiple writers
>> when default values are used.
>>
>> Thoughts?
>>
>> Thanks,
>> Micah
>>
>> On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:
>>
>>> I agree with Dan that type promotion should be well-defined. If it's a
>>> grey area then we should clarify it in the spec.
>>>
>>> How it works today is that schema evolution always produces a schema
>>> that can read files written with any older schema. When a type is
>>> promoted, the new schema can read any older data file, but readers may
>>> need to promote values like the [int-to-long reader](
>>> https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560)
>>> does. You aren't guaranteed to be able to read new data using an older
>>> schema, so the latest schema should always be used, or you should use
>>> the schema attached to a snapshot.
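>>>
>>> To illustrate, a simplified sketch of that promotion logic (made-up
>>> interfaces; see the linked reader for the real code):
>>>
>>>     // The file stores 32-bit ints but the read schema asks for long,
>>>     // so the reader widens each value as it is materialized.
>>>     class IntAsLongSketch {
>>>         // Stand-in for a Parquet column of physical type int32.
>>>         interface IntColumn {
>>>             int nextInteger();
>>>         }
>>>
>>>         // Widening int -> long is lossless, which is why reading old
>>>         // int files through a promoted long schema is always safe.
>>>         static long readAsLong(IntColumn column) {
>>>             return (long) column.nextInteger();
>>>         }
>>>     }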
>>>
>>> Because files with older schemas can always be read, it is safe to write
>>> files with an older schema. This happens fairly regularly, as Steven
>>> noted, in cases where a writer has a fixed schema and is long-running.
>>>
>>> Ryan
>>>
>>> On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>> > This means that you can have writers using different schemas to write
>>>> > (use cases include different partitioning or "out-of-date" writers),
>>>> > but the data is still valid.
>>>>
>>>> +1 on Dan's point. Both batch and streaming writers can have stale
>>>> schemas; long-running streaming jobs may stay stale for extended
>>>> periods before picking up the new schema during a restart.
>>>>
>>>> On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>>
>>>>> I think I'm going to disagree and argue that it's not really a gray
>>>>> area.
>>>>>
>>>>> The strict schema evolution rules, together with how schemas are
>>>>> tracked, mean that writer and reader schemas are independent and
>>>>> remain compatible due to the evolution rules.
>>>>>
>>>>> This means that you can have writers using different schemas to write
>>>>> (use cases include different partitioning or "out-of-date" writers),
>>>>> but the data is still valid.
>>>>>
>>>>> How you promote the physical representation during a read/scan
>>>>> operation results in a consistent presentation with the read schema.
>>>>>
>>>>> All of the representations are technically valid.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer <
>>>>> russell.spit...@gmail.com> wrote:
>>>>>
>>>>>> +1 to what Micah said :) sorry about the typo
>>>>>>
>>>>>> On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer <
>>>>>> russell.spit...@gmail.com> wrote:
>>>>>>
>>>>>>> +1 to what Micaah , We have never really written rules about what is
>>>>>>> "allowed" in this particular context, but since a reader needs to be
>>>>>>> able to handle both int and long values for the column, there isn't
>>>>>>> really any danger in writing new files with the narrower type. If a
>>>>>>> reader couldn't handle this, then type promotion would be impossible.
>>>>>>>
>>>>>>> I would include all columns in the file; the space requirements for
>>>>>>> an all-null column (or an all-constant column) should be very small.
>>>>>>> I believe the reason we originally wrote those rules was to avoid
>>>>>>> folks doing the Hive-style implicit columns from the partition tuple
>>>>>>> (although we also have handling for this).
>>>>>>>
>>>>>>> On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield <
>>>>>>> emkornfi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Nic,
>>>>>>>>
>>>>>>>> This is IMO a gray area.
>>>>>>>>
>>>>>>>>> However, is it allowed to write *new* parquet files with the old
>>>>>>>>> type (int) and commit them to the table with a table schema where
>>>>>>>>> the type is promoted (long)?
>>>>>>>>
>>>>>>>> IMO I would expect writers to write files that are consistent with
>>>>>>>> the current metadata, so ideally they would not be written with int
>>>>>>>> if the type is now long. In general, though, in these cases I think
>>>>>>>> most readers are robust to reading type-promoted files. We should
>>>>>>>> probably clarify this in the specification.
>>>>>>>>
>>>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>>> contain only a subset of the columns of the table schema? I.e., if
>>>>>>>>> I know a column is all NULLs, can I just skip writing it?
>>>>>>>>
>>>>>>>> As currently worded, the spec on writing data files (
>>>>>>>> https://iceberg.apache.org/spec/#writing-data-files) says files
>>>>>>>> should include all columns. Based on the column projection rules
>>>>>>>> <https://iceberg.apache.org/spec/#column-projection>, however,
>>>>>>>> failing to do so should also not cause problems.
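>>>>>>>>
>>>>>>>> As a rough sketch of what those projection rules imply for a reader
>>>>>>>> (made-up code, not the Iceberg implementation):
>>>>>>>>
>>>>>>>>     import java.util.Map;
>>>>>>>>
>>>>>>>>     class ProjectionSketch {
>>>>>>>>         // A field id that is absent from a data file is read as
>>>>>>>>         // null (or as the field's initial-default, when defined).
>>>>>>>>         static Object project(Map<Integer, Object> fileColumns,
>>>>>>>>                               int fieldId, Object initialDefault) {
>>>>>>>>             return fileColumns.containsKey(fieldId)
>>>>>>>>                 ? fileColumns.get(fieldId)
>>>>>>>>                 : initialDefault; // null when no default exists
>>>>>>>>         }
>>>>>>>>     }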
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Micah
>>>>>>>>
>>>>>>>> On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei
>>>>>>>> <n...@nvartolomei.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm implementing an Iceberg writer[^1] and have a question about
>>>>>>>>> what type promotion actually means as part of the schema evolution
>>>>>>>>> rules.
>>>>>>>>>
>>>>>>>>> The Iceberg spec [specifies][spec-evo] which type promotions are
>>>>>>>>> allowed. No confusion there.
>>>>>>>>>
>>>>>>>>> The confusion on my end arises when it comes to actually writing,
>>>>>>>>> i.e., parquet data. Let's take for example the int-to-long
>>>>>>>>> promotion. What is actually allowed under this promotion rule? Let
>>>>>>>>> me try to show what I mean.
>>>>>>>>>
>>>>>>>>> Obviously, if I have a schema-id N with field A of type int and
>>>>>>>>> table snapshots with this schema, then it is possible to update the
>>>>>>>>> table to a schema-id > N where field A now has type long, and this
>>>>>>>>> new schema can read parquet files written with the old type.
>>>>>>>>>
>>>>>>>>> However, is it allowed to write *new* parquet files with the old
>>>>>>>>> type (int) and commit them to the table with a table schema where
>>>>>>>>> the type is promoted (long)?
>>>>>>>>>
>>>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>>> contain only a subset of the columns of the table schema? I.e., if
>>>>>>>>> I know a column is all NULLs, can I just skip writing it?
>>>>>>>>>
>>>>>>>>> Appreciate you taking the time to look at this,
>>>>>>>>> Nic
>>>>>>>>>
>>>>>>>>> [spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
>>>>>>>>> [^1]: This is for Redpanda's native Iceberg integration
>>>>>>>>> (https://github.com/redpanda-data/redpanda).