[
https://issues.apache.org/jira/browse/FLINK-39759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mao Jiayi updated FLINK-39759:
------------------------------
Description:
When using the MongoDB YAML connector to read data containing
{{BsonDecimal128}} values, the job may fail if a decimal value has leading
zeros after the decimal point (e.g., {{0.0001234}}).
The failure happens in *CdcTypeConverter.toCdcType()*. Java's
{{BigDecimal}} does not count leading zeros in the fractional part toward
the {{precision}}, so the {{precision}} can be smaller than the {{scale}}.
For example, {{0.0001234}} yields {{precision = 4}} and {{scale = 7}}.
Flink CDC's *DECIMAL(precision, scale)* type requires {{precision >=
scale}}, so type inference throws an exception.
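The mismatch can be reproduced with plain {{BigDecimal}}, independent of Flink
CDC. The following is a minimal sketch; the class {{DecimalPrecisionDemo}} and
the helper {{precisionAndScale}} are hypothetical names used only for
illustration.

```java
import java.math.BigDecimal;

public class DecimalPrecisionDemo {

    // Returns {precision, scale} exactly as java.math.BigDecimal reports them.
    static int[] precisionAndScale(String literal) {
        BigDecimal value = new BigDecimal(literal);
        return new int[] {value.precision(), value.scale()};
    }

    public static void main(String[] args) {
        for (String literal : new String[] {"10.99", "0.0001234"}) {
            int[] ps = precisionAndScale(literal);
            // The last field shows whether DECIMAL(precision, scale) would be valid.
            System.out.println(literal
                    + " -> precision = " + ps[0]
                    + ", scale = " + ps[1]
                    + ", precision >= scale: " + (ps[0] >= ps[1]));
        }
    }
}
```

For {{10.99}} the check holds (4 >= 2); for {{0.0001234}} it does not (4 < 7).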
This issue is not exposed for typical {{Decimal128}} values like {{10.99}}
(where {{precision = 4}}, {{scale = 2}}) because the constraint
holds naturally. It only manifests when the number of fractional digits
exceeds the number of significant digits, that is, when leading zeros push the
{{scale}} beyond the {{precision}}.
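One possible way to satisfy the constraint is to widen the inferred precision
to at least the scale before building the DECIMAL type. The sketch below is an
assumption about a workaround, not necessarily the change made for this issue;
{{DecimalTypeSketch}} and {{inferPrecisionAndScale}} are hypothetical names, and
the result is returned as a plain int pair rather than a Flink CDC type.

```java
import java.math.BigDecimal;

public class DecimalTypeSketch {

    // Hypothetical helper: returns {precision, scale} with precision >= scale.
    // Negative scales and any upper bound on precision are out of scope here.
    static int[] inferPrecisionAndScale(BigDecimal value) {
        int scale = value.scale();
        // BigDecimal.precision() counts only digits of the unscaled value,
        // so widen it when leading fractional zeros make scale larger.
        int precision = Math.max(value.precision(), scale);
        return new int[] {precision, scale};
    }
}
```

Under this sketch {{0.0001234}} would map to precision 7 and scale 7, while
{{10.99}} keeps precision 4 and scale 2.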
was:
When using CDC pipelines with schema merging operations that involve *complex
types (ARRAY, MAP, ROW)* being coerced to *STRING type*, the job may fail
due to missing type conversion logic in *SchemaMergingUtils*.
The failure happens in *SchemaMergingUtils.coerceObject()*. When attempting
to coerce complex types to STRING during schema merging, the utility only
handles primitive types and binary data conversions. If the original field is a
MapData, ArrayData, or RecordData, the coercion logic falls through to the
default toString() conversion, which either produces incorrect string
representations or fails entirely because these complex types don't have proper
toString() implementations for string coercion.
This issue is usually not exposed in simple type scenarios because primitive
types (INTEGER, VARCHAR, BOOLEAN, TIMESTAMP, etc.) have direct coercion paths
to STRING through existing conversion rules. The schema merging logic works
correctly for these basic type conversions, as they can be directly transformed
without special handling.
However, when CDC pipelines involve schema evolution or field type changes
where complex types need to be merged into STRING columns, the missing
conversion logic becomes critical. For example, when merging a table with
*ARRAY<INT>* column into a target table with *STRING* column at the same
position, *SchemaMergingUtils* cannot properly serialize the array data
structure into its string representation, causing the merge operation to fail
or produce incorrect results.
This issue is not specific to any particular source connector. It can affect
any CDC pipeline with schema merging scenarios involving complex-to-STRING type
coercion, regardless of whether the source is MySQL, PostgreSQL, Kafka, or any
other distributed pipeline source with schema evolution capabilities.
Summary: Fix MongoDB YAML connector fails to infer type from some
BsonDecimal128 data (was: Fix unable to coerce complex types (ARRAY/MAP/ROW)
to STRING)
> Fix MongoDB YAML connector fails to infer type from some BsonDecimal128 data
> ----------------------------------------------------------------------------
>
> Key: FLINK-39759
> URL: https://issues.apache.org/jira/browse/FLINK-39759
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Reporter: Mao Jiayi
> Priority: Major
> Labels: pull-request-available
--
This message was sent by Atlassian Jira
(v8.20.10#820010)