haroldjimenez opened a new issue, #16217:
URL: https://github.com/apache/iceberg/issues/16217

   ### Apache Iceberg version
   
   1.10.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Environment
   
   Iceberg version: 1.10.1
   Spark version: 3.5.8
   Catalog: spark_catalog with Hive Metastore (default)
   
   Describe the bug
   
   After performing the correct sequence of DROP PARTITION FIELD followed by 
DROP COLUMN on an identity partition field, the table enters an unrecoverable 
state with three cascading failures:
   
   * Querying the .partitions metadata table throws a ValidationException.
   * Re-adding the dropped column with the same name throws "Cannot create 
identity partition sourced from different field in schema".
   * CALL system.rewrite_manifests() also fails, so there is no procedure-based 
recovery path.
   
   The table can only be recovered by dropping and recreating it entirely.
   Note: DROP PARTITION FIELD alone works correctly. The issue only occurs when 
DROP COLUMN follows.
   
   
   Steps to reproduce
   
   ```sql
   -- 1. Create table with identity partition
   CREATE TABLE spark_catalog.default.test_table (
       event_id    BIGINT,
       event_date  DATE,
       event_hour  INT,
       user_id     STRING
   )
   USING iceberg
   PARTITIONED BY (event_date);
   
   -- 2. Insert initial data under spec 0 (event_date only)
   INSERT INTO spark_catalog.default.test_table VALUES
       (1, DATE '2024-03-14', 9,  'user_A'),
       (2, DATE '2024-03-15', 10, 'user_B');
   
   -- 3. Add event_hour as identity partition field (spec 1)
   ALTER TABLE spark_catalog.default.test_table
   ADD PARTITION FIELD event_hour;
   
   -- 4. Insert data under spec 1 (event_date + event_hour)
   INSERT INTO spark_catalog.default.test_table VALUES
       (3, DATE '2024-03-16', 14, 'user_C'),
       (4, DATE '2024-03-16', 20, 'user_D');
   
   -- 5. Drop the partition field (works fine)
   ALTER TABLE spark_catalog.default.test_table
   DROP PARTITION FIELD event_hour;
   
   -- 6. Drop the source column (succeeds with no error)
   ALTER TABLE spark_catalog.default.test_table
   DROP COLUMN event_hour;
   
   -- 7. Query partitions metadata → CRASH #1
   SELECT * FROM spark_catalog.default.test_table.partitions;
   
   -- 8. Try to re-add the column → CRASH #2
   ALTER TABLE spark_catalog.default.test_table
   ADD COLUMN event_hour INT;
   
   -- 9. Try to recover via rewrite_manifests → CRASH #3
   CALL spark_catalog.system.rewrite_manifests(
       table => 'spark_catalog.default.test_table'
   );
   ``` 
   
   Expected behavior
   
   * DROP COLUMN should either be blocked with a clear error if old partition 
specs still reference the column in manifest history, OR the metadata cleanup 
should handle the column removal gracefully so that .partitions remains 
queryable and the column can be re-added later.
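
The first option above (blocking the drop) could be sketched roughly as follows. This is a simplified Python model, not Iceberg code: `PartitionField`, `specs_referencing`, and `check_drop_column` are hypothetical names, and the column ids (1 to 4) and the layout of specs 0 and 1 are assumptions based on the repro steps.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionField:
    source_id: int  # id of the schema column this partition field is derived from
    field_id: int   # partition field id (e.g. 1001 in the stack trace)
    name: str


def specs_referencing(column_id: int, specs: dict[int, list[PartitionField]]) -> list[int]:
    """Return ids of every spec, current or historical, sourced from column_id."""
    return sorted(
        spec_id
        for spec_id, fields in specs.items()
        if any(f.source_id == column_id for f in fields)
    )


def check_drop_column(name: str, schema: dict[str, int], specs: dict[int, list[PartitionField]]) -> None:
    """Refuse to drop a column while any retained spec still references it."""
    blocking = specs_referencing(schema[name], specs)
    if blocking:
        raise ValueError(
            f"Cannot drop column {name}: referenced by partition spec(s) {blocking}"
        )


# Assumed state after step 5 of the repro: the old spec 1 is still kept in
# table metadata because manifests written under it still exist.
schema = {"event_id": 1, "event_date": 2, "event_hour": 3, "user_id": 4}
specs = {
    0: [PartitionField(2, 1000, "event_date")],
    1: [PartitionField(2, 1000, "event_date"), PartitionField(3, 1001, "event_hour")],
}

check_drop_column("user_id", schema, specs)  # no spec references it, so this passes
try:
    check_drop_column("event_hour", schema, specs)
except ValueError as err:
    print(err)
```

The model deliberately ignores which spec is current: the point is that a historical spec alone is enough to make the drop unsafe.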
   
   Related PR
   * https://github.com/apache/iceberg/pull/14261
   
   
   Actual behavior
   Crash #1 — querying .partitions after DROP COLUMN:
   ```java
   org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1001: event_hour: identity(1001)
       at org.apache.iceberg.PartitionSpec.checkCompatibility(PartitionSpec.java:661)
       at org.apache.iceberg.PartitionSpec$Builder.build(PartitionSpec.java:633)
       at org.apache.iceberg.BaseMetadataTable.transformSpec(BaseMetadataTable.java:83)
       at org.apache.iceberg.PartitionsTable.lambda$filteredManifests$4(PartitionsTable.java:230)
   ``` 
   
   Crash #2 — re-adding the column:
   ```java
   java.lang.IllegalArgumentException: Cannot create identity partition sourced from different field in schema: event_hour
       at org.apache.iceberg.PartitionSpec$Builder.checkAndAddPartitionName(PartitionSpec.java:413)
       at org.apache.iceberg.TableMetadata.updateSpecSchema(TableMetadata.java:759)
   ``` 
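
The Crash #2 message appears consistent with a name check that compares field ids rather than column names. A minimal Python sketch of that reading follows; it is a model under stated assumptions, not Iceberg's implementation. `check_partition_name` is a hypothetical helper, and the ids 3 (original `event_hour`) and 5 (re-added `event_hour`) are assumed. Iceberg does not reuse field ids, so the re-added column gets a fresh one.

```python
from __future__ import annotations


def check_partition_name(name: str, source_id: int, schema: dict[str, int]) -> None:
    """Allow an identity partition to share a column's name only if it is
    sourced from that very column (same field id)."""
    schema_id = schema.get(name)
    if schema_id is not None and schema_id != source_id:
        raise ValueError(
            f"Cannot create identity partition sourced from different field in schema: {name}"
        )


# Historical spec 1 still holds an identity field named event_hour sourced from id 3.
old_partition = ("event_hour", 3)

# Schema after DROP COLUMN: no column named event_hour, so the name check passes.
schema_after_drop = {"event_id": 1, "event_date": 2, "user_id": 4}
check_partition_name(*old_partition, schema_after_drop)

# Schema after ADD COLUMN event_hour INT: same name, new id (assumed to be 5).
schema_after_readd = {**schema_after_drop, "event_hour": 5}
try:
    check_partition_name(*old_partition, schema_after_readd)
except ValueError as err:
    print(err)
```

Under this reading, the re-added column collides by name with the old partition field while pointing at a different field id, which is why the historical spec can no longer be rebound to the new schema.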
   
   Crash #3 — rewrite_manifests also fails with the same ValidationException as 
Crash #1, leaving no procedure-based recovery path.
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
