JNSimba commented on PR #3767:
URL: https://github.com/apache/flink-cdc/pull/3767#issuecomment-2795676279

   > @JNSimba In the backfill task, there may be one problem for projection. See `RecordUtils.upsertBinlog` in `SnapshotSplitReader#pollSplitRecords`.
   > 
   > For tables with PK, we use all PK columns to deduplicate the records. For tables with no PK, we use all columns to deduplicate the records.
   > 
   > So the limit for the projection is as follows. For tables with PK, we need to keep all PK columns. For tables with no PK, we need to keep all columns.
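   For illustration, the deduplication rule quoted above can be sketched as a plain helper (a hypothetical sketch, not the actual `RecordUtils.upsertBinlog` logic):

   ```java
   import java.util.List;

   public class DedupKeySketch {
       // Tables with a primary key dedupe on the PK columns; tables without
       // one must dedupe on all columns, since only the full column set can
       // distinguish two records. Hypothetical helper, not Flink CDC code.
       static List<String> dedupKeyColumns(List<String> allColumns, List<String> pkColumns) {
           return pkColumns.isEmpty() ? allColumns : pkColumns;
       }

       public static void main(String[] args) {
           List<String> cols = List.of("id", "date_c", "time_c");
           System.out.println(dedupKeyColumns(cols, List.of("id"))); // [id]
           System.out.println(dedupKeyColumns(cols, List.of()));     // [id, date_c, time_c]
       }
   }
   ```

   This is why the projection must keep the PK columns (or all columns when there is no PK): dropping them would break deduplication in the backfill task.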
   
   @ruanhang1993 Thanks for your suggestions. I modified a version accordingly, but found some new problems:
   
   Like this case: **MySqlTimezoneITCase.testMySqlServerInBerlin**
   
   Here `id` is the primary key. The inconsistency between **the actual projected fields (date_c, time_c, datetime3_c, datetime6_c, timestamp_c, id)** and the fields requested in the **SELECT (date_c, time_c, datetime3_c, datetime6_c, timestamp_c)** may cause problems.
   
   In non-incremental snapshot mode, **RowDataSerializer.copy** strictly checks that the row arity matches the serializer's field types, so an error is thrown:
   
https://github.com/apache/flink/blob/release-1.20.1/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/typeutils/RowDataSerializer.java#L122-L125
   ```java
   // from.getArity() = 6 while types.length = 5
   if (from.getArity() != types.length) {
       throw new IllegalArgumentException(
               "Row arity: " + from.getArity() + ", but serializer arity: " + types.length);
   }
   ```
   For this case, the test passes when `incrementalSnapshot=true`, but fails when it is `false`.
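   A standalone sketch of why this throws (a plain-Java mock of the check above, not Flink's actual `RowDataSerializer`):

   ```java
   public class AritySketch {
       // Mimics RowDataSerializer.copy's arity check: the serializer was built
       // for the 5 projected SELECT columns, but the backfill row still carries
       // 6 fields because the PK column id was kept for deduplication.
       static void copyCheck(int rowArity, int serializerArity) {
           if (rowArity != serializerArity) {
               throw new IllegalArgumentException(
                       "Row arity: " + rowArity + ", but serializer arity: " + serializerArity);
           }
       }

       public static void main(String[] args) {
           try {
               copyCheck(6, 5); // 6 actual fields vs 5 declared types
           } catch (IllegalArgumentException e) {
               System.out.println(e.getMessage()); // Row arity: 6, but serializer arity: 5
           }
       }
   }
   ```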


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
