q8webmaster opened a new pull request, #8222:
URL: https://github.com/apache/paimon/pull/8222
## Problem
When syncing a PostgreSQL table that contains a `timestamp(n)` column with
`n <= 3` (millisecond precision or coarser), the Paimon CDC job crashes at the
transition from the snapshot phase to WAL streaming with:
```
java.lang.UnsupportedOperationException: Cannot convert field <col> from type TIMESTAMP(3) NOT NULL to BIGINT NOT NULL
```
The job survives the snapshot phase (which can take hours) and always fails at
the point where the connector switches from snapshot to streaming.
## Root cause
There are two code paths that derive a Paimon `DataType` for a PostgreSQL
column:
| Path | Input | Result |
|---|---|---|
| JDBC (startup / table creation via `retrieveSchema`) | `DatabaseMetaData` type `"timestamp"`, scale `3` | `TIMESTAMP(3)` ✅ |
| `PostgresRecordParser.extractFieldType` (CDC record parsing) | Debezium `int64` / `io.debezium.time.Timestamp` | `BIGINT` ❌ |
PostgreSQL `timestamp(n)` with `n <= 3` is encoded by Debezium using the
`io.debezium.time.Timestamp` logical type (epoch-millis, int64). The
`extractFieldType` `int64` case only handles `MicroTimestamp` and `MicroTime`,
so `io.debezium.time.Timestamp` silently falls through to `return
DataTypes.BIGINT()`.
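To make the encoding concrete, here is a minimal sketch of what an epoch-millisecond `int64` carries. It uses only `java.time`; the class name `EpochMillisSketch`, the helper `decodeEpochMillis`, and the sample value are hypothetical and not part of Paimon or Debezium. The value is interpreted in UTC because the logical type carries no time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical illustration, not Paimon code.
final class EpochMillisSketch {

    // Decode an epoch-millisecond int64 into a zone-less timestamp (interpreted in UTC).
    static LocalDateTime decodeEpochMillis(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        // Assumed sample value, as it might appear in a change event for a timestamp(3) column.
        long raw = 1_700_000_000_123L;
        System.out.println(decodeEpochMillis(raw)); // prints 2023-11-14T22:13:20.123
    }
}
```

The decoded value never has sub-millisecond digits, which is why treating the raw number as a plain `BIGINT` loses the column's meaning rather than its precision.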
The table is created with `TIMESTAMP(3)` from the JDBC path. When the first
WAL record arrives, `extractFieldType` returns `BIGINT`, schema evolution fires
an `UpdateColumnType(TIMESTAMP(3) → BIGINT)` change, `canConvert` returns
`EXCEPTION`, and the job crashes.
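The failure chain above can be pictured with a simplified, dependency-free model of the schema comparison. This is only a sketch: `SchemaDiffSketch` and `typeChanges` are hypothetical names, types are plain strings, nullability is ignored, and the real `parseSchemaChange` / `canConvert` logic in Paimon is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Paimon code.
final class SchemaDiffSketch {

    // List every column whose type derived from the incoming record differs
    // from the type the table was created with.
    static List<String> typeChanges(Map<String, String> tableTypes,
                                    Map<String, String> recordTypes) {
        List<String> changes = new ArrayList<>();
        for (Map.Entry<String, String> e : recordTypes.entrySet()) {
            String existing = tableTypes.get(e.getKey());
            if (existing != null && !existing.equals(e.getValue())) {
                changes.add("UpdateColumnType(" + e.getKey() + ": "
                        + existing + " -> " + e.getValue() + ")");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Table created from the JDBC path; first WAL record typed by the CDC path.
        Map<String, String> table = Map.of("created_at", "TIMESTAMP(3)");
        Map<String, String> walRecord = Map.of("created_at", "BIGINT");
        // prints [UpdateColumnType(created_at: TIMESTAMP(3) -> BIGINT)]
        System.out.println(typeChanges(table, walRecord));
    }
}
```

When both maps carry the same type for a column, the returned list is empty and no type change would be attempted.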
## Fix
Add the missing branch to the `int64` case in `extractFieldType`:
```java
} else if (Timestamp.SCHEMA_NAME.equals(field.name())) {
    return DataTypes.TIMESTAMP(3);
}
```
`io.debezium.time.Timestamp` represents epoch-milliseconds, so
`TIMESTAMP(3)` is the correct Paimon type — matching the JDBC path exactly.
With both paths in agreement, `parseSchemaChange` sees no field type difference
and no schema evolution fires.
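The effect of the new branch can be sketched without the Paimon or Debezium dependencies. The model below is an assumption-laden simplification: `Int64TypeSketch`, `typeForInt64`, and the `withFix` flag are hypothetical, types are returned as strings, and the existing `MicroTime` branch is left out because its result type is not stated here. The schema-name strings are the values of the Debezium `SCHEMA_NAME` constants.

```java
// Hypothetical illustration, not the actual PostgresRecordParser.
final class Int64TypeSketch {

    static final String TIMESTAMP = "io.debezium.time.Timestamp";
    static final String MICRO_TIMESTAMP = "io.debezium.time.MicroTimestamp";

    // Pick a type for an int64 field from its logical schema name.
    // logicalName is null for a plain int64; withFix toggles the new branch.
    static String typeForInt64(String logicalName, boolean withFix) {
        if (MICRO_TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(6)";
        } else if (withFix && TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(3)"; // epoch-millis, same type the JDBC path derives
        }
        return "BIGINT"; // fall-through for anything unrecognized
    }

    public static void main(String[] args) {
        System.out.println(typeForInt64(TIMESTAMP, false)); // prints BIGINT
        System.out.println(typeForInt64(TIMESTAMP, true));  // prints TIMESTAMP(3)
        System.out.println(typeForInt64(null, true));       // prints BIGINT
    }
}
```

Comparing against the constant first keeps the check null-safe for plain `int64` fields that carry no logical name.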
## Prior art
PR #6239 fixed a structurally identical bug for `DECIMAL` (`bytes /
org.apache.kafka.connect.data.Decimal` falling through to `BYTES`). This PR
applies the same pattern to `Timestamp`.
## Changes
- `PostgresRecordParser.java`: add `Timestamp.SCHEMA_NAME` branch to `int64`
case in `extractFieldType`
- `PostgresRecordParserTest.java`: three unit tests covering
`io.debezium.time.Timestamp` → `TIMESTAMP(3)`, `MicroTimestamp` →
`TIMESTAMP(6)` (regression), and plain `int64` → `BIGINT` (regression)