hililiwei commented on code in PR #3991:
URL: https://github.com/apache/iceberg/pull/3991#discussion_r849480029


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/data/RowDataProjection.java:
##########
@@ -63,16 +109,48 @@ public static RowDataProjection create(RowType rowType, Types.StructType schema,
     return new RowDataProjection(rowType, schema, projectedSchema);
   }
 
-  private final RowData.FieldGetter[] getters;
-  private RowData rowData;
+  private RowDataProjection(RowType rowType, Types.StructType rowStruct, Types.StructType projectType,

Review Comment:
   That's true in general. However, we also need a mapping between the 
projected Iceberg schema (converted from `int[][] projectedFieldIndexes`) and 
the Flink source's output schema. RowDataProjection uses this mapping to pull 
fields out of the Iceberg row in the order the Flink source must emit them.
   
   Let's reuse the example from above.
   
   original schema
   
   ```
   data: string
   id: bigint
   dt: string
   st: struct {
       f0: bigint,
       f1: bigint,
       f2: bigint
   }
   ```
   
   sql:
   
   ```
   select st.f0, st.f1, id
   ```
   
   `projectedFieldIndexes` :
   
   ```
   {
       {3,0},
       {3,1},
       {1}
   }
   ```
   
   projected flink schema:
   
   ```
   st.f0: bigint
   st.f1: bigint
   id: bigint
   ```
   
   projected iceberg schema (converted from `projectedFieldIndexes`):
   
   ```
   id: bigint
   st: struct {
    f0: bigint,
    f1: bigint
   }
   ```
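
As a rough illustration of that conversion, here is a minimal sketch that groups the index paths by top-level field. It assumes the pruned schema keeps fields in their original table order, which is what the example shows; `PrunedSchemaSketch` and `prune` are hypothetical names, not code from this PR, and only two nesting levels are handled.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: not the PR's implementation.
final class PrunedSchemaSketch {

  // Groups index paths by top-level field index. TreeMap and TreeSet keep both
  // levels sorted by original position, mirroring the pruned schema's order.
  static Map<Integer, TreeSet<Integer>> prune(int[][] projectedFieldIndexes) {
    Map<Integer, TreeSet<Integer>> pruned = new TreeMap<>();
    for (int[] path : projectedFieldIndexes) {
      TreeSet<Integer> children = pruned.computeIfAbsent(path[0], k -> new TreeSet<>());
      if (path.length > 1) {
        // Nested selection: remember which child of the struct was requested.
        children.add(path[1]);
      }
    }
    return pruned;
  }

  public static void main(String[] args) {
    int[][] projectedFieldIndexes = {{3, 0}, {3, 1}, {1}};
    // Field 1 (id) has no nested selection; field 3 (st) keeps children 0 and 1.
    System.out.println(prune(projectedFieldIndexes)); // {1=[], 3=[0, 1]}
  }
}
```

The struct appears only once in the result, with both requested children under it, which is why the output order of the query cannot be read directly off this schema.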
   
   I originally wanted to generate the following schema, but Iceberg does not 
allow it because the same field would appear twice:
   
   ```
   st: struct {
    f0: bigint
   }
   st: struct {
    f1: bigint
   }
   id: bigint
   ```
   
   So in RowDataProjection, we need to know the mapping between the `projected 
flink schema` and the `projected iceberg schema`:
   
   ```
   {
       {1,0},
       {1,1},
       {0}
   }
   ```
   Each entry tells Flink where to find the corresponding output field in the 
projected Iceberg row when it is retrieved.
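
To make the mapping concrete, here is a minimal sketch of how it could be derived from `projectedFieldIndexes` and then used to read values. Plain `Object[]` arrays stand in for Flink's `RowData`, the pruned row is assumed to keep fields in original table order, and `ProjectionMappingSketch`, `remap`, and `project` are hypothetical names rather than code from this PR. Only two nesting levels are handled.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch: Object[] stands in for RowData.
final class ProjectionMappingSketch {

  // Rewrites each original index path into a position path inside the pruned row.
  static int[][] remap(int[][] projectedFieldIndexes) {
    TreeSet<Integer> topLevel = new TreeSet<>();
    for (int[] path : projectedFieldIndexes) {
      topLevel.add(path[0]);
    }
    int[][] mapping = new int[projectedFieldIndexes.length][];
    for (int i = 0; i < projectedFieldIndexes.length; i++) {
      int[] path = projectedFieldIndexes[i];
      // Position of the top-level field = number of selected fields before it.
      int topPos = topLevel.headSet(path[0]).size();
      if (path.length == 1) {
        mapping[i] = new int[] {topPos};
      } else {
        // Position of the nested field among the selected children of its parent.
        TreeSet<Integer> siblings = new TreeSet<>();
        for (int[] other : projectedFieldIndexes) {
          if (other.length > 1 && other[0] == path[0]) {
            siblings.add(other[1]);
          }
        }
        mapping[i] = new int[] {topPos, siblings.headSet(path[1]).size()};
      }
    }
    return mapping;
  }

  // Follows each position path through the pruned row to build the output row.
  static Object[] project(Object[] prunedRow, int[][] mapping) {
    Object[] out = new Object[mapping.length];
    for (int i = 0; i < mapping.length; i++) {
      Object value = prunedRow;
      for (int pos : mapping[i]) {
        value = ((Object[]) value)[pos];
      }
      out[i] = value;
    }
    return out;
  }

  public static void main(String[] args) {
    int[][] mapping = remap(new int[][] {{3, 0}, {3, 1}, {1}});
    System.out.println(Arrays.deepToString(mapping)); // [[1, 0], [1, 1], [0]]

    // Pruned row laid out as: id, st{f0, f1}
    Object[] prunedRow = {42L, new Object[] {10L, 11L}};
    System.out.println(Arrays.toString(project(prunedRow, mapping))); // [10, 11, 42]
  }
}
```

The output row comes back in query order (st.f0, st.f1, id) even though the pruned row stores `id` first.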



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

