aokolnychyi commented on code in PR #6898:
URL: https://github.com/apache/iceberg/pull/6898#discussion_r1113722252


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/ChangelogIterator.java:
##########
@@ -83,12 +85,16 @@ private ChangelogIterator(
    * @param changeTypeIndex the index of the change type column
    * @param identifierFieldIdx the indices of the identifier columns, which determine if rows are
    *     the same
+   * @param columnSize the number of columns in the table
    * @return a new {@link ChangelogIterator} instance concatenated with the null-removal iterator
    */
-  public static Iterator<Row> iterator(
-      Iterator<Row> rowIterator, int changeTypeIndex, List<Integer> identifierFieldIdx) {
+  public static Iterator<Row> create(

Review Comment:
   What about making the signature as follows?
   
   ```
   public static Iterator<Row> create(
       Iterator<Row> rowIterator,
       StructType rowType,
       List<String> identifierFields) {
   
     // compute indices for the change and identifier columns here
   }
   ```
   
   I feel like this method accepts too many low-level details. Instead, I'd accept an iterator, a schema, and the identifier column names, so that the index computation moves out of the procedure and into this method.
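
   For illustration, here is a rough sketch of the index computation that would move into `create`. It is not the PR's code: `ChangelogIndexResolver` is a hypothetical helper, a plain `List<String>` of column names stands in for the `StructType` so the snippet runs without Spark, and the `_change_type` column name is an assumption. With a real `StructType`, `rowType.fieldIndex(name)` could do the same lookup.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical helper (not part of Iceberg): resolves column names to positions.
   final class ChangelogIndexResolver {
     // Assumed name of the change type column.
     static final String CHANGE_TYPE = "_change_type";

     private ChangelogIndexResolver() {}

     // Returns the position of a column in the schema, failing loudly if it is missing.
     static int fieldIndex(List<String> fieldNames, String name) {
       int idx = fieldNames.indexOf(name);
       if (idx < 0) {
         throw new IllegalArgumentException("Column not found: " + name);
       }
       return idx;
     }

     // Index of the change type column.
     static int changeTypeIndex(List<String> fieldNames) {
       return fieldIndex(fieldNames, CHANGE_TYPE);
     }

     // Indices of the identifier columns, in the order the names were given.
     static List<Integer> identifierIndices(
         List<String> fieldNames, List<String> identifierFields) {
       List<Integer> indices = new ArrayList<>();
       for (String field : identifierFields) {
         indices.add(fieldIndex(fieldNames, field));
       }
       return indices;
     }
   }
   ```

   The procedure would then pass only the iterator, the schema, and the identifier column names, and the index arithmetic would stay private to the iterator.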



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

