q8webmaster opened a new pull request, #8222:
URL: https://github.com/apache/paimon/pull/8222

   ## Problem
   
   When syncing a PostgreSQL table that contains a `timestamp(n)` column with `n <= 3` (i.e. at most millisecond precision), the Paimon CDC job crashes at the transition from the snapshot phase to WAL streaming with:
   
   ```
   java.lang.UnsupportedOperationException:
     Cannot convert field <col> from type TIMESTAMP(3) NOT NULL to BIGINT NOT NULL
   ```
   
   The job survives the snapshot phase (which can run for hours) and always fails at the point where the connector switches from snapshot to streaming.
   
   ## Root cause
   
   There are two code paths that derive a Paimon `DataType` for a PostgreSQL 
column:
   
   | Path | Input | Result |
   |---|---|---|
   | JDBC (startup / table creation via `retrieveSchema`) | `DatabaseMetaData` type `"timestamp"`, scale `3` | `TIMESTAMP(3)` ✅ |
   | `PostgresRecordParser.extractFieldType` (CDC record parsing) | Debezium `int64` / `io.debezium.time.Timestamp` | `BIGINT` ❌ |
   
   PostgreSQL `timestamp(n)` with `n <= 3` is encoded by Debezium using the `io.debezium.time.Timestamp` logical type (epoch-millis, int64). The `extractFieldType` `int64` case only handles `MicroTimestamp` and `MicroTime`, so `io.debezium.time.Timestamp` silently falls through to `return DataTypes.BIGINT()`.
   
   The table is created with `TIMESTAMP(3)` from the JDBC path. When the first 
WAL record arrives, `extractFieldType` returns `BIGINT`, schema evolution fires 
an `UpdateColumnType(TIMESTAMP(3) → BIGINT)` change, `canConvert` returns 
`EXCEPTION`, and the job crashes.
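
The mismatch between the two paths can be modelled in a few lines. This is only a simplified sketch: it does not use the Paimon or Debezium classes, type names are plain strings, and `RootCauseSketch`, `jdbcType`, and `cdcTypeBeforeFix` are hypothetical names introduced for illustration. The `MicroTime` branch is left out for brevity.

```java
import java.util.Objects;

class RootCauseSketch {
    // Values of the Debezium logical type names referenced in this PR.
    static final String TIMESTAMP = "io.debezium.time.Timestamp";
    static final String MICRO_TIMESTAMP = "io.debezium.time.MicroTimestamp";

    // Stand-in for the JDBC path: type "timestamp" with scale n gives TIMESTAMP(n).
    static String jdbcType(String typeName, int scale) {
        if ("timestamp".equals(typeName)) {
            return "TIMESTAMP(" + scale + ")";
        }
        throw new IllegalArgumentException("not modelled in this sketch: " + typeName);
    }

    // Stand-in for the int64 case of extractFieldType before the fix.
    static String cdcTypeBeforeFix(String logicalName) {
        if (MICRO_TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(6)";
        }
        // io.debezium.time.Timestamp is not matched above, so it falls through.
        return "BIGINT";
    }

    public static void main(String[] args) {
        String tableType = jdbcType("timestamp", 3);      // type used at table creation
        String recordType = cdcTypeBeforeFix(TIMESTAMP);  // type seen on the first WAL record
        System.out.println(tableType + " vs " + recordType);
        // A difference here is what triggers the column type change request.
        System.out.println("type change requested: " + !Objects.equals(tableType, recordType));
    }
}
```

In this model the two paths disagree for every `timestamp(n)` column with `n <= 3`, which is why the failure only appears once the first streamed record is parsed.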
   
   ## Fix
   
   Add the missing branch to the `int64` case in `extractFieldType`:
   
   ```java
   } else if (Timestamp.SCHEMA_NAME.equals(field.name())) {
       return DataTypes.TIMESTAMP(3);
   }
   ```
   
   `io.debezium.time.Timestamp` represents epoch-milliseconds, so 
`TIMESTAMP(3)` is the correct Paimon type — matching the JDBC path exactly. 
With both paths in agreement, `parseSchemaChange` sees no field type difference 
and no schema evolution fires.
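
The precision argument can be checked with plain `java.time`. The sketch below is illustrative only and is not code from this PR: `EpochMillisSketch`, `fromEpochMillis`, and `fromEpochMicros` are hypothetical names, and interpreting the values in UTC is an assumption of the example. It decodes an epoch-millis value (the `io.debezium.time.Timestamp` encoding) and an epoch-micros value (the `MicroTimestamp` encoding) to show that the former never carries more than three fractional digits.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class EpochMillisSketch {
    // Decode milliseconds since the epoch; the result has millisecond resolution.
    static LocalDateTime fromEpochMillis(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    // Decode microseconds since the epoch; the result has microsecond resolution.
    static LocalDateTime fromEpochMicros(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return LocalDateTime.ofEpochSecond(seconds, (int) nanos, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime ms = fromEpochMillis(1_700_000_000_123L);
        LocalDateTime us = fromEpochMicros(1_700_000_000_123_456L);
        // Sub-millisecond digits exist only in the microsecond-encoded value.
        System.out.println(ms + " nanos=" + ms.getNano());
        System.out.println(us + " nanos=" + us.getNano());
    }
}
```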
   
   ## Prior art
   
   PR #6239 fixed a structurally identical bug for `DECIMAL` (`bytes` / `org.apache.kafka.connect.data.Decimal` falling through to `BYTES`). This PR applies the same pattern to `Timestamp`.
   
   ## Changes
   
   - `PostgresRecordParser.java`: add `Timestamp.SCHEMA_NAME` branch to `int64` case in `extractFieldType`
   - `PostgresRecordParserTest.java`: three unit tests covering `io.debezium.time.Timestamp` → `TIMESTAMP(3)`, `MicroTimestamp` → `TIMESTAMP(6)` (regression), and plain `int64` → `BIGINT` (regression)

