Hi Anoop,

Thanks a lot for the initial review.

Data correctness guards:
1. I will add support for the Remove action soon; work on the PR is in
progress.
2. Sure, let's reject tables that use the `column mapping` feature for now,
for safety. Later I will try to add support for this feature as well.
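
To make the two guards concrete, here is a rough sketch of the checks. It is
only an illustration under stated assumptions, not the code from the PR: the
class and method names are hypothetical, and it deliberately works on plain
maps and strings instead of delta-kernel types. The only Delta-specific detail
it relies on is the `delta.columnMapping.mode` table property from the Delta
protocol spec.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical guard helpers; names are illustrative and not taken from the PR. */
final class DeltaMigrationGuards {

  // Delta table property that controls column mapping ("none", "name" or "id").
  static final String COLUMN_MAPPING_MODE = "delta.columnMapping.mode";

  private DeltaMigrationGuards() {}

  /** Rejects tables whose column mapping mode is anything other than "none". */
  static void checkColumnMappingDisabled(Map<String, String> tableProperties) {
    String mode = tableProperties.getOrDefault(COLUMN_MAPPING_MODE, "none");
    if (!"none".equalsIgnoreCase(mode)) {
      throw new UnsupportedOperationException(
          "Column mapping mode '" + mode + "' is not supported yet");
    }
  }

  /**
   * Fails fast if the commit contains a "remove" action.
   * Assumption: action types are passed as their Delta log names ("add", "remove", ...).
   */
  static void checkNoRemoveActions(List<String> actionTypes) {
    for (String type : actionTypes) {
      if ("remove".equals(type)) {
        throw new UnsupportedOperationException(
            "Remove actions are not supported yet; refusing to migrate");
      }
    }
  }
}
```

Both checks throw instead of skipping, so an unsupported table stops the
migration before any data files are registered in Iceberg.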


Yes, the PR depends on the `*internal*` API of the delta-kernel. I do not see
a simple way to replace it with the public API. As an option, I can replace
these classes with our own `in-house` classes that rely on the Delta
protocol spec. That would be safe at runtime, but it is additional code that
we will need to maintain.

What do you think if I continue working with the `*internal*` Delta API for
now and refactor this logic before merging the PR, once we agree on a
solution?


On Tue, Feb 24, 2026 at 5:29 AM Anoop Johnson <[email protected]> wrote:

> Hi, Vladislav -
>
> I've done an initial review of the PR
> <https://github.com/apache/iceberg/pull/15407>. Moving to the Delta
> kernel is the right direction, so thank you for doing this. Here's a
> summary of my initial feedback (full details are in the PR):
>
> Data correctness guards:
> 1. If we encounter `Remove` actions, the conversion should fail fast rather
> than silently skip them. Otherwise tables with DML will produce duplicate
> rows in the Iceberg table.
> 2. Tables with column mapping enabled will produce silent data corruption
> because the Parquet files will have physical column names that don't match
> the logical schema. We should validate this and reject until column mapping
> support is added (which can be done as a separate PR).
>
> The PR relies heavily on io.delta.kernel.internal.* classes, which can be
> fragile. We should consider replacing them with the public kernel APIs.
>
> Best,
> Anoop
>
>
> On Mon, Feb 23, 2026 at 12:29 AM Vladislav Sidorovich via dev <
> [email protected]> wrote:
>
>> Hi Iceberg Community,
>>
>> I recently opened a PR to update the existing Delta Lake to Iceberg
>> migration functionality to support recent Delta Lake table versions (read:
>> 3, write: 7). I would appreciate it if anyone could take a look and share
>> thoughts on the architecture and initial implementation.
>>
>> *PR Link:* https://github.com/apache/iceberg/pull/15407
>>
>> The main motivation for sharing this now is to get some early feedback
>> from the community on the approach and the initial implementation.
>>
>> To make reviewing easier, this PR doesn't remove or overwrite the old
>> logic. Instead, I’ve added a new interface implementation utilizing the
>> *Delta Lake Kernel library* (replacing the deprecated Delta Lake standalone
>> library). This side-by-side approach allows for easier comparison and
>> shouldn't introduce any issues with current usage scenarios.
>>
>>
>> *Current PR Scope:*
>>
>>    - Maintains support for the existing migration interface.
>>    - Migrates the underlying engine to the Delta Lake Kernel library.
>>    - Contains the basic migration flow.
>>    - Successfully converts all data types, table schemas, and partition
>>    specs.
>>    - Currently supports INSERT operations only (Delta Lake Add action).
>>    - *Testing:* Includes unit tests for all supported data types
>>    (including complex arrays and structures) and integration tests for
>>    insert-only scenarios using Spark 3.5.
>>
>> *Future Steps (Next PRs):*
>>
>> Once we align on this foundation, I plan to follow up with:
>>
>>    - Adding support for UPDATE and DELETE (Delta Lake Remove action).
>>    - Supporting all remaining Delta Lake actions.
>>    - Handling edge cases for partitions and generated columns.
>>    - Adding Schema Evolution support.
>>    - Adding Deletion Vector (DV) support.
>>    - Enabling Incremental Conversion (from/to specific Delta versions).
>>    - Adding all tables from the Delta golden tables for robust testing.
>>    *(Note: The current integration test will be updated for newer Delta Lake
>>    versions once the old standalone solution is fully deprecated/deleted).*
>>
>>
>> --
>> Best regards,
>> Vladislav Sidorovich
>>
>> Feedback: *go/feedback-for-vladislav
>> <https://goto.google.com/feedback-for-vladislav> *
>>
>>
>>

-- 
Best regards,
Vladislav Sidorovich

Feedback: *go/feedback-for-vladislav *
