wombatu-kun opened a new issue, #16603:
URL: https://github.com/apache/iceberg/issues/16603

   ### Problem
   
   `MongoDataConverter` (used by the `MongoDebeziumTransform` SMT) reads BSON 
array elements of type `TIMESTAMP` and `DATE_TIME` with the wrong accessors when 
`array.encoding=array` (the default). The faulty code is in the private 
array-element `convertFieldValue`:
   
   ```java
   } else if (arrValue.getBsonType() == BsonType.DATE_TIME && valueType == BsonType.DATE_TIME) {
     // arrValue is a BsonDateTime, not BsonInt64
     Date temp = new Date(arrValue.asInt64().getValue());
     ...
   } else if (arrValue.getBsonType() == BsonType.TIMESTAMP && valueType == BsonType.TIMESTAMP) {
     // arrValue is a BsonTimestamp, not BsonInt32
     Date temp = new Date(1000L * arrValue.asInt32().getValue());
   ```
   
   `BsonValue.asInt32()`/`asInt64()` call `throwIfInvalidType(...)`, which 
throws `BsonInvalidOperationException` when the actual BSON type is 
`TIMESTAMP`/`DATE_TIME`. The scalar (non-array) paths use the correct accessors 
(`asTimestamp().getTime()` and `asDateTime().getValue()`); only the 
array-element path is wrong.
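
The failure mode can be illustrated without the BSON library. The sketch below is a minimal model, not the driver's code: `AccessorSketch` and its nested `Value`, `Int64`, `DateTime`, and `InvalidOperation` types are hypothetical stand-ins that only mimic the check-then-cast behavior described above for `BsonValue.asInt64()` and `asDateTime()`.

```java
import java.util.Date;

// Hypothetical stand-ins for the org.bson types. The names mirror the real
// ones, but this is a simplified model written for illustration only.
class AccessorSketch {
  enum Type { INT64, DATE_TIME }

  static class InvalidOperation extends RuntimeException {
    InvalidOperation(String message) {
      super(message);
    }
  }

  abstract static class Value {
    abstract Type type();

    // Typed accessor: verifies the runtime type before casting.
    Int64 asInt64() {
      throwIfInvalidType(Type.INT64);
      return (Int64) this;
    }

    DateTime asDateTime() {
      throwIfInvalidType(Type.DATE_TIME);
      return (DateTime) this;
    }

    private void throwIfInvalidType(Type expected) {
      if (type() != expected) {
        throw new InvalidOperation(
            "Value expected to be of type " + expected + " is of unexpected type " + type());
      }
    }
  }

  static final class Int64 extends Value {
    final long value;

    Int64(long value) {
      this.value = value;
    }

    @Override
    Type type() {
      return Type.INT64;
    }
  }

  static final class DateTime extends Value {
    final long millis;

    DateTime(long millis) {
      this.millis = millis;
    }

    @Override
    Type type() {
      return Type.DATE_TIME;
    }
  }

  // Mirrors the buggy array-element path: a date-time read through asInt64().
  static Date buggy(Value arrValue) {
    return new Date(arrValue.asInt64().value);
  }

  // Mirrors the scalar path: the accessor matches the actual type.
  static Date fixed(Value arrValue) {
    return new Date(arrValue.asDateTime().millis);
  }
}
```

In this model, calling `AccessorSketch.buggy(new AccessorSketch.DateTime(0L))` throws, while `AccessorSketch.fixed(...)` on the same value returns a `Date`.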
   
   ### Impact
   
   A MongoDB document containing an array of timestamps or date-times fails to 
convert with `BsonInvalidOperationException` (surfaced as a Connect 
`DataException`), so the record cannot be processed. `array.encoding=array` is 
the default mode in `MongoDebeziumTransform`, and arrays of dates are common in 
document data.
   
   ### Reproduction
   
   These tests fail on current code. Each builds a document programmatically 
with `BsonTimestamp` / `BsonDateTime` array elements and converts it with 
`ArrayEncoding.ARRAY`:
   
   ```
   shouldConvertArrayOfTimestamps
     -> BsonInvalidOperationException: Value expected to be of type INT32 is of unexpected type TIMESTAMP
        at MongoDataConverter.java:245 (asInt32)
   
   shouldConvertArrayOfDateTimes
     -> BsonInvalidOperationException: Value expected to be of type INT64 is of unexpected type DATE_TIME
        at MongoDataConverter.java:239 (asInt64)
   ```
   
   ### Fix
   
   Use the same accessors as the scalar paths:
   
   - `TIMESTAMP`: `new Date(1000L * arrValue.asTimestamp().getTime())`
   - `DATE_TIME`: `new Date(arrValue.asDateTime().getValue())`
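
The two conversions differ in units: `BsonTimestamp.getTime()` returns whole seconds since the epoch as an `int`, while `BsonDateTime.getValue()` returns milliseconds as a `long`. The sketch below shows only that arithmetic, using plain `java.util.Date`. The class `EpochUnitsSketch` and its method names are hypothetical and are not part of `MongoDataConverter`.

```java
import java.util.Date;

// Hypothetical helper illustrating the unit handling in the two fixed branches.
class EpochUnitsSketch {

  // TIMESTAMP branch: the input is seconds since the epoch, so scale to
  // milliseconds. The 1000L literal forces long arithmetic; multiplying by an
  // int 1000 would overflow for present-day epoch seconds.
  static Date fromTimestampSeconds(int seconds) {
    return new Date(1000L * seconds);
  }

  // DATE_TIME branch: the input is already milliseconds since the epoch.
  static Date fromDateTimeMillis(long millis) {
    return new Date(millis);
  }

  public static void main(String[] args) {
    int seconds = 1_700_000_000;
    // Both calls describe the same instant, expressed in different units.
    System.out.println(fromTimestampSeconds(seconds));
    System.out.println(fromDateTimeMillis(1_700_000_000_000L));
  }
}
```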
   
   Note: this file lives under `org.debezium.connector.mongodb.transforms` and 
was adapted from Debezium; the fix intentionally diverges from the (buggy) 
upstream snapshot.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

