[ https://issues.apache.org/jira/browse/HIVE-29190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18019592#comment-18019592 ]
Denys Kuzmenko commented on HIVE-29190:
---------------------------------------

Merged to master. Thanks for the fix, [~ayushtkn]!

> Iceberg: [V3] Fix handling of Delete/Update with DV's
> -----------------------------------------------------
>
>                 Key: HIVE-29190
>                 URL: https://issues.apache.org/jira/browse/HIVE-29190
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, if we run a DELETE or UPDATE on a V3 table and the DataFile being operated on already has a Deletion Vector (DV), subsequent queries fail:
> {noformat}
> org.apache.hadoop.hive.ql.exec.tez.TezRuntimeException: Vertex failed, vertexName=Map 1, vertexId=vertex_1757524880780_0001_5_00, diagnostics=[Vertex vertex_1757524880780_0001_5_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ice01 initializer failed, vertex=vertex_1757524880780_0001_5_00 [Map 1], org.apache.iceberg.exceptions.ValidationException: Can't index multiple DVs for hdfs://localhost:51198/build/ql/test/data/warehouse/ice01/data/00000-0-ayushsaxena_20250910102137_22c067b7-d899-4ecf-9832-c167f7d402a6-job_17575248807800_0001-1-00001.orc:
> DV{location=hdfs://localhost:51198/build/ql/test/data/warehouse/ice01/data/00000-0-ayushsaxena_20250910102141_dc808c4f-746c-4a38-b2dc-ba6b8d719f44-job_17575248807800_0001-2-00001-pos-deletes.orc, offset=4, length=42, referencedDataFile=hdfs://localhost:51198/build/ql/test/data/warehouse/ice01/data/00000-0-ayushsaxena_20250910102137_22c067b7-d899-4ecf-9832-c167f7d402a6-job_17575248807800_0001-1-00001.orc}
> and
> DV{location=hdfs://localhost:51198/build/ql/test/data/warehouse/ice01/data/00000-0-ayushsaxena_20250910102142_300e55da-c854-415e-854b-6a0b9ac641da-job_17575248807800_0001-3-00001-pos-deletes.orc, offset=4, length=44, referencedDataFile=hdfs://localhost:51198/build/ql/test/data/warehouse/ice01/data/00000-0-ayushsaxena_20250910102137_22c067b7-d899-4ecf-9832-c167f7d402a6-job_17575248807800_0001-1-00001.orc}
> 	at org.apache.iceberg.DeleteFileIndex$Builder.add(DeleteFileIndex.java:509)
> 	at org.apache.iceberg.DeleteFileIndex$Builder.build(DeleteFileIndex.java:481)
> 	at org.apache.iceberg.ManifestGroup.plan(ManifestGroup.java:185)
> 	at org.apache.iceberg.ManifestGroup.planFiles(ManifestGroup.java:172)
> 	at org.apache.iceberg.DataTableScan.doPlanFiles(DataTableScan.java:90)
> 	at org.apache.iceberg.SnapshotScan.planFiles(SnapshotScan.java:139)
> 	at org.apache.iceberg.BaseTableScan.planTasks(BaseTableScan.java:44)
> 	at org.apache.iceberg.DataTableScan.planTasks(DataTableScan.java:26)
> 	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat.generateInputSplits(IcebergInputFormat.java:230)
> 	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat.planInputSplits(IcebergInputFormat.java:199)
> 	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat.getSplits(IcebergInputFormat.java:172)
> 	at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat.getSplits(MapredIcebergInputFormat.java:69)
> 	at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getSplits(HiveIcebergInputFormat.java:167)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:585)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:880)
> 	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:363){noformat}
> The reason being: Iceberg V3 allows only one DV per DataFile, so indexing two DVs that reference the same data file fails validation at scan planning.
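> A minimal sketch of the failing sequence (assumed repro; the table name ice01 comes from the stack-trace paths, and the exact DDL and properties are illustrative):
> {noformat}
> -- assumed V3 Iceberg table
> CREATE TABLE ice01 (id INT, name STRING)
>   STORED BY ICEBERG
>   TBLPROPERTIES ('format-version'='3');
>
> INSERT INTO ice01 VALUES (1, 'a'), (2, 'b'), (3, 'c');
>
> DELETE FROM ice01 WHERE id = 1;  -- writes a DV for the data file
> DELETE FROM ice01 WHERE id = 2;  -- same data file: a second DV ends up referencing it
> SELECT * FROM ice01;             -- scan planning fails: "Can't index multiple DVs for ..."
> {noformat}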
> Related Iceberg code:
> https://github.com/apache/iceberg/blob/720ef99720a1c59e4670db983c951243dffc4f3e/core/src/main/java/org/apache/iceberg/DeleteFileIndex.java#L507-L509



--
This message was sent by Atlassian Jira
(v8.20.10#820010)