I opened https://github.com/apache/iceberg/pull/13936 as a draft proposal
to capture the conversation.

BTW, one area this brings up that I don't think the specification
handles is changing a field between nullable and non-nullable. Outdated
schemas have implications in these cases as well.

Cheers,
Micah

On Tue, Aug 26, 2025 at 10:13 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> I think the original question is ambiguous.  We should probably split
> this into two questions:
>
> 1.  Is it OK to write out an "int" instead of a "long" if the writer's
> schema says the value is a long?
>
> I think the answer here is that we recommend not doing so, even though it
> would likely work.
>
> 2. Is it OK to use an older schema for writing?
>
> The consensus on the thread seems to be yes.  I'll note that this can
> cause confusing results when the "write-default" [1] value for a column
> changes.  We should probably have an implementation note to clarify:
> a.  Using a stale schema is allowed
> b.  It might cause inconsistent results in the face of multiple writers
> when default values are used.
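To make point (b) concrete, here is a minimal Python sketch of how differing "write-default" values across schema versions can produce inconsistent data. The schema dicts and the `write_row` helper are hypothetical illustrations, not Iceberg's actual API.

```python
# Hypothetical sketch (not Iceberg's API): two writers holding different
# schema versions each fill in a missing column using the write-default
# from the schema they happen to have. Both writes are individually
# valid, but the materialized values differ.

def write_row(row, schema):
    """Fill in any column the writer did not supply using the
    write-default from the writer's copy of the schema."""
    out = dict(row)
    for column, default in schema["write-defaults"].items():
        out.setdefault(column, default)
    return out

# Schema v1 set the default for "status" to "unknown"; a later schema
# update changed it to "pending".
schema_v1 = {"write-defaults": {"status": "unknown"}}
schema_v2 = {"write-defaults": {"status": "pending"}}

stale_writer_row = write_row({"id": 1}, schema_v1)
fresh_writer_row = write_row({"id": 2}, schema_v2)

# Two rows committed around the same time carry different defaults,
# which can surprise readers expecting one consistent value.
print(stale_writer_row["status"])  # unknown
print(fresh_writer_row["status"])  # pending
```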
>
> Thoughts?
>
> Thanks,
> Micah
>
> On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:
>
>> I agree with Dan that type promotion should be well-defined. If it's a
>> grey area then we should clarify it in the spec.
>>
>> How it works today is that schema evolution always produces a schema that
>> can read files written with any older schema. When a type is promoted, the
>> new schema can read any older data file, but readers may need to promote
>> values like the [int-to-long reader](
>> https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560)
>> does. You aren't guaranteed to be able to read new data using an older
>> schema, so the latest schema should always be used or you should use the
>> schema attached to a snapshot.
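As a rough illustration of what a promoting reader does, here is a small Python sketch: the file's physical type is int, the read schema says long, and the reader decodes with the file type while handing back values the engine treats as the wider type. This is a conceptual stand-in for Iceberg's Java int-to-long reader, not its actual implementation.

```python
import struct

def read_promoted(raw, file_type, read_type):
    """Decode raw column bytes using the file's physical type, allowing
    only promotions the spec permits (here, just int -> long)."""
    widths = {"int": ("<i", 4), "long": ("<q", 8)}
    allowed = {("int", "long"), ("int", "int"), ("long", "long")}
    if (file_type, read_type) not in allowed:
        raise ValueError(f"promotion {file_type} -> {read_type} not allowed")
    fmt, size = widths[file_type]
    # Python ints are arbitrary precision, so the "widening" is purely
    # logical here; in Java this is the int -> long conversion.
    return [struct.unpack_from(fmt, raw, off)[0]
            for off in range(0, len(raw), size)]

# A data file written with the older schema stores 4-byte ints...
raw = struct.pack("<3i", 7, -1, 2**31 - 1)
# ...but a reader using the promoted (long) schema can still consume it.
print(read_promoted(raw, "int", "long"))  # [7, -1, 2147483647]
```

Note that the reverse direction (long -> int) is rejected, which mirrors why reading new data with an older schema is not guaranteed to work.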
>>
>> Because files with older schemas can always be read, it is safe to write
>> files with an older schema. This happens fairly regularly, as Steven noted,
>> in cases where a writer has a fixed schema and is long-running.
>>
>> Ryan
>>
>> On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> > This means that you can have writers using different schema to write
>>> (use cases include different partitioning or "out-of-date" writers), but
>>> the data is still valid.
>>>
>>> +1 on Dan's point. Both batch and streaming writers can have stale
>>> schemas. Long-running streaming jobs may stay stale for extended periods
>>> before picking up the new schema during a restart.
>>>
>>> On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>
>>>> I think I'm going to disagree and argue that it's not really a gray
>>>> area.
>>>>
>>>> Having strict schema evolution rules, combined with how schemas are
>>>> tracked, means that writer and reader schemas are independent while
>>>> remaining compatible due to the evolution rules.
>>>>
>>>> This means that you can have writers using different schema to write
>>>> (use cases include different partitioning or "out-of-date" writers), but
>>>> the data is still valid.
>>>>
>>>> Promoting the physical representation during a read/scan operation
>>>> results in a presentation consistent with the read schema.
>>>>
>>>> All of the representations are technically valid.
>>>>
>>>> -Dan
>>>>
>>>> On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> +1 to what Micah said :) sorry about the typo
>>>>>
>>>>> On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer <
>>>>> russell.spit...@gmail.com> wrote:
>>>>>
>>>>>> +1 to what Micaah , We have never really written rules about what is
>>>>>> "allowed" in this particular context but since
>>>>>> a reader needs to be able to handle both int/long values for the
>>>>>> column, there isn't really any danger in writing
>>>>>> new files with the narrower type. If a reader couldn't handle this,
>>>>>> then type promotion would be impossible.
>>>>>>
>>>>>> I would include all columns in the file; the space requirements for
>>>>>> an all-null column (or all-constant column) should
>>>>>> be very small. I believe the reason we originally wrote those rules
>>>>>> was to avoid folks doing the Hive-style
>>>>>> implicit columns from the partition tuple (although we also have
>>>>>> handling for this).
>>>>>>
>>>>>> On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield <
>>>>>> emkornfi...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi Nic,
>>>>>>>
>>>>>>> This is, IMO, a gray area.
>>>>>>>
>>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>>> types (int) and commit them to the table with a table schema where
>>>>>>>> types are promoted (long)?
>>>>>>>
>>>>>>>
>>>>>>> IMO, I would expect writers to write files that are consistent
>>>>>>> with the current metadata, so ideally they would not be written
>>>>>>> with int if it is now long.  In general, though, in these cases I
>>>>>>> think most readers are robust to reading type-promoted files.  We
>>>>>>> should probably clarify this in the specification.
>>>>>>>
>>>>>>>
>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>> contain
>>>>>>>> only a subset of columns of table schema? I.e. if I know a column is
>>>>>>>> all NULLs, can we just skip writing it?
>>>>>>>
>>>>>>>
>>>>>>> As currently worded, the spec on writing data files (
>>>>>>> https://iceberg.apache.org/spec/#writing-data-files) says they
>>>>>>> should include all columns. Based on the column projection rules
>>>>>>> <https://iceberg.apache.org/spec/#column-projection>, however,
>>>>>>> failing to do so should also not cause problems.
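To sketch why omitting a column is tolerable, here is a small Python illustration of the column projection idea: columns in the table schema that are missing from a data file are filled on read. The `project` helper is hypothetical, not Iceberg's implementation.

```python
# Conceptual sketch (not Iceberg's code) of the column projection rule:
# a column present in the table schema but absent from a data file is
# resolved on read, to its default if one is defined, else to null.

def project(file_rows, table_columns, defaults=None):
    """Project each file row onto the table schema, filling columns the
    file never wrote with their default (None when no default exists)."""
    defaults = defaults or {}
    return [
        {col: row.get(col, defaults.get(col)) for col in table_columns}
        for row in file_rows
    ]

# The data file was written without the all-null "comment" column.
file_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
projected = project(file_rows, ["id", "name", "comment"])
print(projected[0])  # {'id': 1, 'name': 'a', 'comment': None}
```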
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Micah
>>>>>>>
>>>>>>> On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei
>>>>>>> <n...@nvartolomei.com.invalid> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm implementing an Iceberg writer[^1] and have a question about
>>>>>>>> what
>>>>>>>> type promotion actually means as part of schema evolution rules.
>>>>>>>>
>>>>>>>> Iceberg spec [specifies][spec-evo] which type promotions are
>>>>>>>> allowed.
>>>>>>>> No confusion there.
>>>>>>>>
>>>>>>>> The confusion on my end arises when it comes to actually writing
>>>>>>>> the data, e.g. Parquet files. Let's take, for example, the
>>>>>>>> int-to-long promotion. What is actually allowed under this
>>>>>>>> promotion rule? Let me try to show what I mean.
>>>>>>>>
>>>>>>>> Obviously, if I have schema-id N with field A of type int, and
>>>>>>>> table snapshots with this schema, then it is possible to update
>>>>>>>> the table to a schema-id > N where field A now has type long, and
>>>>>>>> this new schema can read Parquet files with the old type.
>>>>>>>>
>>>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>>> types (int) and commit them to the table with a table schema where
>>>>>>>> types are promoted (long)?
>>>>>>>>
>>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>> contain
>>>>>>>> only a subset of columns of table schema? I.e. if I know a column is
>>>>>>>> all NULLs, can we just skip writing it?
>>>>>>>>
>>>>>>>> Appreciate taking the time to look at this,
>>>>>>>> Nic
>>>>>>>>
>>>>>>>> [spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
>>>>>>>> [^1]: This is for Redpanda to Iceberg native integration
>>>>>>>> (https://github.com/redpanda-data/redpanda).
>>>>>>>>
>>>>>>>
