Quoting the newly added text in the spec:

> Writers must write out all fields with the types specified from a schema
> present in table metadata. Writers should use the latest schema for
> writing. Not writing out all columns or not using the latest schema can
> change the semantics of the data written. The following are possible
> inconsistencies that can be introduced:

Interpreting this for the Iceberg REST Catalog interaction: writers should
include the assert-current-schema-id requirement when committing to a table.
Otherwise, inconsistencies may be introduced.

I might be on a slightly outdated Iceberg build, but from what I can see,
Spark with Iceberg doesn't include this requirement during table commits
(example: https://gist.github.com/nvartolomei/2fd6e994d1d9b5d61597c16339100518).
Do we consider this an implementation bug, then?

Also, what should a sensible implementation do when the requirement fails?
Reload the table schema and check whether the new schema is compatible with
the writer's schema (i.e. that it wouldn't be subject to the possible
inconsistencies described by the spec), and otherwise rewrite the data? For
example, rewrite parquet files by adding new all-null fields where needed.
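For concreteness, here is a minimal sketch of that assertion using Iceberg's
Java classes. This assumes the UpdateRequirement API as found in recent
releases (the exact package and nesting have moved between versions), so
treat it as illustrative rather than authoritative:

    // Sketch only: assert that the table's current schema is still the one
    // this writer planned against before the commit is applied.
    import org.apache.iceberg.TableMetadata;
    import org.apache.iceberg.UpdateRequirement;

    class AssertSchemaExample {
        static void check(TableMetadata base, int writerSchemaId) {
            UpdateRequirement req =
                new UpdateRequirement.AssertCurrentSchemaID(writerSchemaId);
            // Expected to throw CommitFailedException when the table's
            // current schema id no longer matches, i.e. the schema changed
            // out from under the writer.
            req.validate(base);
        }
    }

On the REST wire, the same requirement serializes to
{"type": "assert-current-schema-id", "current-schema-id": <id>} inside the
commit request's requirements list.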
On Thu, Sep 4, 2025 at 10:19 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Just to follow up, the PR ended up getting merged. In theory, there maybe
> should have been a vote, but unless someone feels strongly or would like
> to fine-tune the language further, perhaps we can consider the topic
> resolved?
>
> On Wed, Aug 27, 2025 at 2:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>> I opened https://github.com/apache/iceberg/pull/13936 as a draft proposal
>> to capture the conversation.
>>
>> BTW, one area this brings up that I don't think the specification handles
>> is changing between nullable and not-nullable fields. Outdated schemas
>> have some implications in these cases as well.
>>
>> Cheers,
>> Micah
>>
>> On Tue, Aug 26, 2025 at 10:13 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>
>>> I think the original question is ambiguous. We should probably split it
>>> into two questions:
>>>
>>> 1. Is it OK to write out an "int" instead of a "long" if the writer's
>>> schema says the value is a long?
>>>
>>> I think the answer here is that we recommend not doing so, even though
>>> it would likely work.
>>>
>>> 2. Is it OK to use an older schema for writing?
>>>
>>> The consensus on the thread seems to be yes. I'll note that this can
>>> cause confusing results when the "write-default" [1] value for a column
>>> changes. We should probably have an implementation note to clarify:
>>> a. Using a stale schema is allowed.
>>> b. It might cause inconsistent results in the face of multiple writers
>>> when default values are used.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> Micah
>>>
>>> On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:
>>>>
>>>> I agree with Dan that type promotion should be well-defined. If it's a
>>>> grey area then we should clarify it in the spec.
>>>>
>>>> How it works today is that schema evolution always produces a schema
>>>> that can read files written with any older schema. When a type is
>>>> promoted, the new schema can read any older data file, but readers may
>>>> need to promote values like the [int-to-long
>>>> reader](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560)
>>>> does. You aren't guaranteed to be able to read new data using an older
>>>> schema, so the latest schema should always be used, or you should use
>>>> the schema attached to a snapshot.
>>>>
>>>> Because files with older schemas can always be read, it is safe to
>>>> write files with an older schema. This happens fairly regularly, as
>>>> Steven noted, in cases where a writer has a fixed schema and is
>>>> long-running.
>>>>
>>>> Ryan
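To illustrate the read-time promotion Ryan describes, a stripped-down sketch
in the spirit of the linked int-to-long reader (the names here are invented
for the example, not the real ParquetValueReaders types): the file stores
32-bit ints, the read schema says long, and the reader widens each value on
the fly.

    // Illustrative promoting reader: wraps a reader of the file's physical
    // int values and presents them as longs, matching the read schema.
    interface ValueReader<T> {
        T read();
    }

    class IntAsLongReader implements ValueReader<Long> {
        private final ValueReader<Integer> ints; // physical values in the file

        IntAsLongReader(ValueReader<Integer> ints) {
            this.ints = ints;
        }

        @Override
        public Long read() {
            return ints.read().longValue(); // lossless int -> long widening
        }
    }

Because the widening is lossless, this is why old int files stay readable
forever under the promoted schema, while the reverse (reading longs with an
int schema) is not guaranteed.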
>>>> On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>
>>>>> > This means that you can have writers using different schema to write
>>>>> > (use cases include different partitioning or "out-of-date" writers),
>>>>> > but the data is still valid.
>>>>>
>>>>> +1 on Dan's point. Both batch and streaming writers can have stale
>>>>> schemas. Long-running streaming jobs may stay stale for extended
>>>>> periods before picking up the new schema during restart.
>>>>>
>>>>> On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>>>>
>>>>>> I think I'm going to disagree and argue that it's not really a gray
>>>>>> area.
>>>>>>
>>>>>> Having strict schema evolution rules, and tracking how schemas
>>>>>> change, means that writer and reader schemas are independent yet
>>>>>> remain compatible due to the evolution rules.
>>>>>>
>>>>>> This means that you can have writers using different schemas to write
>>>>>> (use cases include different partitioning or "out-of-date" writers),
>>>>>> but the data is still valid.
>>>>>>
>>>>>> Promoting the physical representation during a read/scan operation
>>>>>> results in a presentation consistent with the read schema.
>>>>>>
>>>>>> All of the representations are technically valid.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer
>>>>>> <russell.spit...@gmail.com> wrote:
>>>>>>>
>>>>>>> +1 to what Micah said :) sorry about the typo
>>>>>>>
>>>>>>> On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer
>>>>>>> <russell.spit...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> +1 to what Micaah said. We have never really written rules about
>>>>>>>> what is "allowed" in this particular context, but since a reader
>>>>>>>> needs to be able to handle both int/long values for the column,
>>>>>>>> there isn't really any danger in writing new files with the
>>>>>>>> narrower type. If a reader couldn't handle this, then type
>>>>>>>> promotion would be impossible.
>>>>>>>>
>>>>>>>> I would include all columns in the file; the space requirements for
>>>>>>>> an all-null column (or all-constant column) should be very small. I
>>>>>>>> believe the reason we originally wrote those rules in was to avoid
>>>>>>>> folks doing the Hive-style implicit columns from the partition
>>>>>>>> tuple (although we also have handling for this).
>>>>>>>>
>>>>>>>> On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield
>>>>>>>> <emkornfi...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Nic,
>>>>>>>>> This is IMO a gray area.
>>>>>>>>>
>>>>>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>>>>> types (int) and commit them to the table with a table schema
>>>>>>>>>> where types are promoted (long)?
>>>>>>>>>
>>>>>>>>> IMO I would expect writers to be writing files that are consistent
>>>>>>>>> with the current metadata, so ideally they would not be written
>>>>>>>>> with int if it is now long. In general, though, in these cases I
>>>>>>>>> think most readers are robust to reading type-promoted files. We
>>>>>>>>> should probably clarify this in the specification.
>>>>>>>>>
>>>>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>>>> contain only a subset of columns of the table schema? I.e. if I
>>>>>>>>>> know a column is all NULLs, can we just skip writing it?
>>>>>>>>>
>>>>>>>>> As currently worded, the spec on writing data files
>>>>>>>>> (https://iceberg.apache.org/spec/#writing-data-files) says to
>>>>>>>>> include all columns. Based on column projection rules, however,
>>>>>>>>> failing to do so should also not cause problems.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Micah
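To make the column projection point concrete, a hypothetical sketch (not
Iceberg's internal API) of how a reader resolves a field id that has no
column in the data file:

    import java.util.Map;

    // Hypothetical projection helper: fields are resolved by field id, and
    // a field absent from the file falls back to the schema's
    // initial-default (null for an optional field with no default).
    class ProjectedRow {
        private final Map<Integer, Object> fileValues; // columns present in the file

        ProjectedRow(Map<Integer, Object> fileValues) {
            this.fileValues = fileValues;
        }

        Object get(int fieldId, Object initialDefault) {
            return fileValues.getOrDefault(fieldId, initialDefault);
        }
    }

Because resolution is by field id with a default fallback, a file that omits
an all-null optional column reads the same as one that wrote it explicitly,
which is why skipping it "should not cause problems" even though the spec
says to write it.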
>>>>>>>>> On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei
>>>>>>>>> <n...@nvartolomei.com.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm implementing an Iceberg writer[^1] and have a question about
>>>>>>>>>> what type promotion actually means as part of the schema
>>>>>>>>>> evolution rules.
>>>>>>>>>>
>>>>>>>>>> The Iceberg spec [specifies][spec-evo] which type promotions are
>>>>>>>>>> allowed. No confusion there.
>>>>>>>>>>
>>>>>>>>>> The confusion on my end arises when it comes to actually writing
>>>>>>>>>> data, e.g. parquet files. Let's take for example the int-to-long
>>>>>>>>>> promotion. What is actually allowed under this promotion rule?
>>>>>>>>>> Let me try to show what I mean.
>>>>>>>>>>
>>>>>>>>>> Obviously, if I have a schema-id N with field A of type int and
>>>>>>>>>> table snapshots with this schema, then it is possible to update
>>>>>>>>>> the table schema-id to > N where field A now has type long, and
>>>>>>>>>> this new schema can read parquet files with the old type.
>>>>>>>>>>
>>>>>>>>>> However, is it allowed to commit *new* parquet files with the old
>>>>>>>>>> types (int) and commit them to the table with a table schema
>>>>>>>>>> where types are promoted (long)?
>>>>>>>>>>
>>>>>>>>>> Also, is it allowed to commit parquet files, in general, which
>>>>>>>>>> contain only a subset of columns of the table schema? I.e. if I
>>>>>>>>>> know a column is all NULLs, can we just skip writing it?
>>>>>>>>>>
>>>>>>>>>> I appreciate you taking the time to look at this,
>>>>>>>>>> Nic
>>>>>>>>>>
>>>>>>>>>> [spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
>>>>>>>>>> [^1]: This is for the Redpanda-to-Iceberg native integration
>>>>>>>>>> (https://github.com/redpanda-data/redpanda).
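For reference, the int-to-long promotion described above corresponds to the
following call in Iceberg's Java API; the `table` handle and the field name
"a" are assumed from the example, while the updateColumn call itself is the
public UpdateSchema API:

    import org.apache.iceberg.Table;
    import org.apache.iceberg.types.Types;

    class PromoteExample {
        static void promote(Table table) {
            // Widen field "a" from int to long; existing data files keep
            // their int columns and are promoted at read time.
            table.updateSchema()
                .updateColumn("a", Types.LongType.get())
                .commit();
        }
    }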