I think the original question is ambiguous.  We should probably split this
into two questions:

1.  Is it OK to write out an "int" instead of a "long" if the writer's
schema says the value is a long?

I think the answer here is that we recommend not doing so, even though it
would likely work.

2. Is it OK to use an older schema for writing?

The consensus on the thread seems to be yes.  I'll note that this can cause
confusing results when the "write-default" [1] value for a column changes.
We should probably add an implementation note clarifying that:
a.  Using a stale schema is allowed.
b.  It might cause inconsistent results in the face of multiple writers
when default values are used.
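
To illustrate (b), here is a plain-Python sketch (not the Iceberg API; the
column name and default values are invented).  A stale writer omits a
column added after its schema was captured, so readers fall back to the
column's fixed initial-default, while an up-to-date writer materializes
whatever the write-default happens to be at write time:

```python
# Hypothetical illustration, not the Iceberg API.
INITIAL_DEFAULT = 0    # fixed when the column was added; never changes
write_default = 100    # current write-default; can change via schema evolution

def stale_writer_record():
    # a writer on the pre-column schema omits the column entirely
    return {}

def current_writer_record():
    # an up-to-date writer materializes the write-default into the file
    return {"col": write_default}

def read(record):
    # readers fill a column missing from the file with the initial-default
    return record.get("col", INITIAL_DEFAULT)

# Two "identical" rows written concurrently by the two writers
# disagree when read back: 0 vs. 100.
values = [read(stale_writer_record()), read(current_writer_record())]
```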

Thoughts?

Thanks,
Micah

On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:

> I agree with Dan that type promotion should be well-defined. If it's a
> grey area then we should clarify it in the spec.
>
> How it works today is that schema evolution always produces a schema that
> can read files written with any older schema. When a type is promoted, the
> new schema can read any older data file, but readers may need to promote
> values like the [int-to-long reader](
> https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560)
> does. You aren't guaranteed to be able to read new data using an older
> schema, so the latest schema should always be used or you should use the
> schema attached to a snapshot.
>
> Because files with older schemas can always be read, it is safe to write
> files with an older schema. This happens fairly regularly, as Steven noted,
> in cases where a writer has a fixed schema and is long-running.
>
> Ryan
>
> On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:
>
>> > This means that you can have writers using different schema to write
>> (use cases include different partitioning or "out-of-date" writers), but
>> the data is still valid.
>>
>> +1 on Dan's point. Both batch and streaming writers can have a stale
>> schema. Long-running streaming jobs may stay stale for extended periods
>> before picking up the new schema during a restart.
>>
>> On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:
>>
>>> I think I'm going to disagree and argue that it's not really a gray area.
>>>
>>> Having strict schema evolution rules, and tracking which schema each
>>> file was written with, means that writer and reader schemas are
>>> independent yet remain compatible due to the evolution rules.
>>>
>>> This means that you can have writers using different schemas to write
>>> (use cases include different partitioning or "out-of-date" writers), but
>>> the data is still valid.
>>>
>>> Promoting the physical representation during a read/scan operation
>>> results in a presentation consistent with the read schema.
>>>
>>> All of the representations are technically valid.
>>>
>>> -Dan
>>>
>>> On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer <
>>> russell.spit...@gmail.com> wrote:
>>>
>>>> +1 to what Micah said :) sorry about the typo
>>>>
>>>> On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> +1 to what Micaah , We have never really written rules about what is
>>>>> "allowed" in this particular context but since
>>>>> a reader needs to be able to handle both int/long values for the
>>>>> column, there isn't really any danger in writing
>>>>> new files with the narrower type. If a reader couldn't handle this,
>>>>> then type promotion would be impossible.
>>>>>
>>>>> I would include all columns in the file; the space requirements for an
>>>>> all-null column (or all-constant column) should be very small. I believe
>>>>> the reason we originally wrote those rules in was to avoid folks doing
>>>>> Hive-style implicit columns from the partition tuple (although we also
>>>>> have handling for this).
>>>>>
>>>>> On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield <
>>>>> emkornfi...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> Hi Nic,
>>>>>> This is IMO a gray area.
>>>>>>
>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>> types (int) and commit them to the table with a table schema where
>>>>>>> types are promoted (long)?
>>>>>>
>>>>>>
>>>>>> IMO I would expect writers to write files that are consistent with
>>>>>> the current metadata, so ideally they would not be written with int
>>>>>> if it is now long. In general, though, I think most readers are
>>>>>> robust to reading type-promoted files. We should probably clarify
>>>>>> this in the specification.
>>>>>>
>>>>>>
>>>>>> Also, is it allowed to commit parquet files, in general, which contain
>>>>>>> only a subset of columns of table schema? I.e. if I know a column is
>>>>>>> all NULLs, can we just skip writing it?
>>>>>>
>>>>>>
>>>>>> As currently worded, the spec on writing data files (
>>>>>> https://iceberg.apache.org/spec/#writing-data-files) says files
>>>>>> should include all columns. Based on column projection rules
>>>>>> <https://iceberg.apache.org/spec/#column-projection>, however,
>>>>>> failing to do so should also not cause problems.
>>>>>>
>>>>>> Cheers,
>>>>>> Micah
>>>>>>
>>>>>> On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei
>>>>>> <n...@nvartolomei.com.invalid> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm implementing an Iceberg writer[^1] and have a question about what
>>>>>>> type promotion actually means as part of schema evolution rules.
>>>>>>>
>>>>>>> Iceberg spec [specifies][spec-evo] which type promotions are allowed.
>>>>>>> No confusion there.
>>>>>>>
>>>>>>> The confusion on my end arises when it comes to actually writing,
>>>>>>> e.g., Parquet data. Let's take for example the int-to-long
>>>>>>> promotion. What is actually allowed under this promotion rule? Let
>>>>>>> me try to show what I mean.
>>>>>>>
>>>>>>> Obviously if I have a schema-id N with field A of type int and table
>>>>>>> snapshots with this schema then it is possible to update the table
>>>>>>> schema-id to > N where field A now has type long and this new schema
>>>>>>> can read parquet files with the old type.
>>>>>>>
>>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>> types (int) and commit them to the table with a table schema where
>>>>>>> types are promoted (long)?
>>>>>>>
>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>> contain
>>>>>>> only a subset of columns of table schema? I.e. if I know a column is
>>>>>>> all NULLs, can we just skip writing it?
>>>>>>>
>>>>>>> Appreciate taking the time to look at this,
>>>>>>> Nic
>>>>>>>
>>>>>>> [spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
>>>>>>> [^1]: This is for Redpanda to Iceberg native integration
>>>>>>> (https://github.com/redpanda-data/redpanda).
>>>>>>>
>>>>>>
