fallintoplace opened a new issue, #1082: URL: https://github.com/apache/iceberg-go/issues/1082
### Apache Iceberg version

main (development)

### Please describe the bug

`positionDeletePartitionedFanoutWriter` appears to derive partition paths using the positional-delete schema rather than the table schema.

In `positionDeletePartitionedFanoutWriter.partitionPath`, the partition type and path are currently built from `p.schema`:

```go
data := newPartitionRecord(partitionContext.partitionData, spec.PartitionType(p.schema))
return spec.PartitionToPath(data, p.schema), nil
```

The writer initializes `p.schema` to `iceberg.PositionalDeleteSchema`. For partition specs based on table data columns, those source fields are not present in the positional-delete schema. `PartitionSpec.PartitionType` skips missing source fields, so the derived partition type can be empty and `PartitionToPath` can return an empty path.

That means different target data-file partitions can collapse to the same rolling writer key. Since `writerFactory.getOrCreateRollingDataWriter` reuses writers by partition path, the delete file can be opened with the first partition's metadata and then receive position-delete rows for later target files from different partitions.

For example, for a table partitioned by `day`:

| Target data file | Real partition |
| --- | --- |
| `file-a.parquet` | `day=2026-05-01` |
| `file-b.parquet` | `day=2026-05-02` |

If the position-delete writer derives the partition path from `iceberg.PositionalDeleteSchema`, both can produce the same empty partition path. The rolling writer can then reuse the writer opened for `day=2026-05-01` while writing deletes that target `day=2026-05-02`.

The expected behavior is that the positional-delete file schema is still used for the delete file contents, but the table schema is used when deriving the table partition path from the target data file's partition data.

This looks related to, but distinct from, #767, which fixed nondeterministic partition path ordering.
This issue is about using the delete schema instead of the table schema when deriving the path.

I have contributed to iceberg-python before, but this would be my first iceberg-go PR. I would like to work on a fix if this diagnosis sounds right.
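To make the suspected collapse concrete, here is a minimal, self-contained Go sketch. It does not call the iceberg-go API: `partitionField`, `schema`, `partitionPath`, and `writerCache` are hypothetical stand-ins that only model the two behaviors described in this issue (partition fields with a missing source column are skipped, and writers are reused by partition path). The field IDs are placeholders, not the real reserved IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// partitionField is a simplified stand-in for a partition field:
// a partition column name plus the ID of the source column it reads.
type partitionField struct {
	name     string
	sourceID int
}

// schema is a toy schema: just the set of field IDs it contains.
type schema map[int]bool

// partitionPath models the behavior described in the issue: partition
// fields whose source column is absent from the schema are skipped,
// so the resulting path can be empty.
func partitionPath(spec []partitionField, s schema, values []string) string {
	var parts []string
	for i, f := range spec {
		if !s[f.sourceID] {
			continue // source field missing from this schema
		}
		parts = append(parts, f.name+"="+values[i])
	}
	return strings.Join(parts, "/")
}

// writerCache models reuse of rolling writers keyed by partition path.
// The stored value is the partition the writer was first opened for.
type writerCache map[string]string

// getOrCreate returns the partition of the writer registered under path,
// registering a new one for openedFor if none exists yet.
func (c writerCache) getOrCreate(path, openedFor string) string {
	if owner, ok := c[path]; ok {
		return owner
	}
	c[path] = openedFor
	return openedFor
}

func main() {
	spec := []partitionField{{name: "day", sourceID: 1}}
	tableSchema := schema{1: true, 2: true}      // contains the "day" source column
	deleteSchema := schema{101: true, 102: true} // placeholder IDs for file_path, pos

	days := []string{"2026-05-01", "2026-05-02"}

	for _, tc := range []struct {
		label string
		s     schema
	}{
		{"delete schema", deleteSchema},
		{"table schema", tableSchema},
	} {
		cache := writerCache{}
		for _, day := range days {
			path := partitionPath(spec, tc.s, []string{day})
			owner := cache.getOrCreate(path, "day="+day)
			fmt.Printf("%s: target day=%s path=%q writer opened for %s\n",
				tc.label, day, path, owner)
		}
	}
}
```

With the delete schema, both target partitions map to the empty path and the second lookup returns the writer opened for `day=2026-05-01`. With the table schema, each partition gets its own path and its own writer, which is the expected behavior described above.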
