aokolnychyi commented on code in PR #5665:
URL: https://github.com/apache/iceberg/pull/5665#discussion_r960111505


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkDataFile.java:
##########
@@ -84,10 +93,31 @@ public SparkDataFile(Types.StructType type, StructType sparkType) {
     sortOrderIdPosition = positions.get("sort_order_id");
   }
 
+  private void wrapPartitionSpec(GenericRowWithSchema specRow) {
+    // We get all the partition fields, but want to project to the current one
+    StructType wrappedPartitionStruct = specRow.schema();
+
+    if (!wrappedPartitionStruct.equals(currentWrappedPartitionStruct)) {
+      this.currentWrappedPartitionStruct = wrappedPartitionStruct;
+
+      // The original IDs are lost in translation, therefore we apply the ones that we know

Review Comment:
   Yeah, we need something closer to what we have in `SparkPositionDeltaWrite`, 
where we essentially solve the same problem (the incoming partition tuple may 
have more columns than the spec we are trying to write).
   
   We can pass `Broadcast<Table> table` to `writeManifest`, which would allow 
us to construct the union partition type via `Partitioning.partitionType(table)` 
as well as have access to the spec we are trying to write to.
   
   I am not yet sure where the projection should be built. One option is to 
change the constructor of `SparkDataFile` and initialize it there:
   
   ```
   public SparkDataFile(PartitionSpec spec, Types.StructType partitionType, StructType sparkType) {
     ...
     this.partitionWrapper = new SparkStructLike(partitionType);
     this.partitionProjection = StructProjection.create(partitionType, spec.partitionType());
     ...
   }
   ```
   
   Or pass a precomputed `StructProjection partitionProjection`, which we would 
build in `writeManifest`.
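
To make the projection idea concrete, here is a minimal, self-contained sketch of projecting a union partition tuple down to the fields of a single spec by matching field IDs. It is not Iceberg's `StructProjection` and uses no Iceberg or Spark classes; `PartitionProjectionSketch`, `Field`, `positionMap`, `project`, and the field IDs and names are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionProjectionSketch {

  /** Hypothetical stand-in for a partition field: a stable field ID plus a name. */
  static final class Field {
    final int id;
    final String name;

    Field(int id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /**
   * For each field of the spec's partition type, finds the position of the
   * field with the same ID in the union partition type. Computed once and
   * reused for every incoming tuple.
   */
  static int[] positionMap(List<Field> unionType, List<Field> specType) {
    int[] map = new int[specType.size()];
    for (int i = 0; i < specType.size(); i++) {
      map[i] = -1;
      for (int j = 0; j < unionType.size(); j++) {
        if (unionType.get(j).id == specType.get(i).id) {
          map[i] = j;
          break;
        }
      }
      if (map[i] < 0) {
        throw new IllegalArgumentException(
            "Field not found in union type: " + specType.get(i).name);
      }
    }
    return map;
  }

  /** Picks only the spec's columns out of a tuple laid out in union-type order. */
  static List<Object> project(int[] positionMap, List<Object> unionTuple) {
    List<Object> projected = new ArrayList<>(positionMap.length);
    for (int pos : positionMap) {
      projected.add(unionTuple.get(pos));
    }
    return projected;
  }

  public static void main(String[] args) {
    // Union of the partition fields across all specs (assumed IDs and names).
    List<Field> unionType = Arrays.asList(
        new Field(1000, "date"), new Field(1001, "bucket"), new Field(1002, "region"));
    // The spec being written only has two of those fields, in a different order.
    List<Field> specType = Arrays.asList(
        new Field(1002, "region"), new Field(1000, "date"));

    int[] map = positionMap(unionType, specType);
    // The incoming tuple has more columns than the spec; unused ones may be null.
    List<Object> unionTuple = Arrays.<Object>asList("2022-09-01", null, "eu");

    System.out.println(project(map, unionTuple)); // prints [eu, 2022-09-01]
  }
}
```

Building the position map once corresponds to either of the two options above: it can live in the constructor or be computed by the caller and passed in. The per-row work is then only the index lookups in `project`.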



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

