[
https://issues.apache.org/jira/browse/FLINK-36580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-36580:
-----------------------------------
Labels: pull-request-available (was: )
> Using debezium.column.exclude.list causes job failure
> -----------------------------------------------------
>
> Key: FLINK-36580
> URL: https://issues.apache.org/jira/browse/FLINK-36580
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Affects Versions: cdc-3.2.0
> Reporter: MOBIN
> Priority: Minor
> Labels: pull-request-available
>
> When the user projects all columns with \* (the table has too many columns to
> list individually) while synchronizing data and wants to exclude some
> privacy-sensitive columns, the pipeline YAML is as follows:
> {code:yaml}
> source:
>   type: mysql
>   ....
>   debezium.column.exclude.list: test_db.test_table.name
>
> transform:
>   - source-table: test_db.test_table
>     projection: \*
>     description: project fields and filter
> {code}
> an error will be reported:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 33370909
>     at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.getLongMultiSegments(BinarySegmentUtils.java:731)
>     at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.getLong(BinarySegmentUtils.java:721)
>     at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.readTimestampData(BinarySegmentUtils.java:1017)
>     at org.apache.flink.cdc.common.data.binary.BinaryRecordData.getTimestamp(BinaryRecordData.java:173)
>     at org.apache.flink.cdc.common.data.RecordData.lambda$createFieldGetter$f6fd429a$1(RecordData.java:214)
>     at org.apache.flink.cdc.common.data.RecordData.lambda$createFieldGetter$79d0aaaa$1(RecordData.java:245)
>     at org.apache.flink.cdc.runtime.operators.transform.PreTransformProcessor.getValueFromBinaryRecordData(PreTransformProcessor.java:111)
>     at org.apache.flink.cdc.runtime.operators.transform.PreTransformProcessor.processFillDataField(PreTransformProcessor.java:90)
>     at org.apache.flink.cdc.runtime.operators.transform.PreTransformOperator.processProjection(PreTransformOperator.java:427)
>     at org.apache.flink.cdc.runtime.operators.transform.PreTransformOperator.processDataChangeEvent(PreTransformOperator.java:399)
>     at org.apache.flink.cdc.runtime.operators.transform.PreTransformOperator.processElement(PreTransformOperator.java:251)
>     at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:75)
>     at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
>     at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
>     at org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitRecord(SourceOperatorStreamTask.java:310)
>     at org.apache.flink.streaming.api.operators.source.SourceOutputWithWatermarks.collect(SourceOutputWithWatermarks.java:110)
> {code}
> Cause of error: Debezium drops the columns listed in
> debezium.column.exclude.list when it captures data, but Flink CDC does not
> take this into account when it derives the table schema. As a result, the
> column count in tableChangeInfo.getSourceSchema() no longer matches the
> field count of the BinaryRecordData, and the positional field getters read
> past the end of the record.
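> The shape of the failure can be illustrated with a minimal, self-contained
> sketch (plain Java, not the real Flink CDC internals; the column names and
> values are made up for illustration): the derived schema still carries the
> excluded column, so a positional read runs past the end of the shorter record.
> {code:java}
> // Minimal sketch of the mismatch: the schema Flink CDC derives still lists
> // the excluded column, but the record Debezium emits no longer carries it.
> public class ColumnCountMismatch {
>     public static void main(String[] args) {
>         // Schema as CDC sees it: 3 columns ("name" is still present because
>         // debezium.column.exclude.list was not applied to the schema).
>         String[] schemaColumns = {"id", "name", "created_at"};
>
>         // Record as Debezium emitted it: "name" was excluded, only 2 fields.
>         Object[] recordFields = {1L, java.time.Instant.now()};
>
>         // One positional read per schema column: index 2 overruns the
>         // 2-field record, the analogue of the ArrayIndexOutOfBoundsException
>         // thrown from BinarySegmentUtils in the trace above.
>         for (int pos = 0; pos < schemaColumns.length; pos++) {
>             System.out.println(schemaColumns[pos] + " -> " + recordFields[pos]);
>         }
>     }
> }
> {code}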
> Temporary workaround: re-add the excluded column to the projection as a NULL
> literal, i.e. projection: \*, CAST(NULL AS STRING) AS name (full transform
> block below).
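> In pipeline YAML, the workaround transform looks like this (matching the
> example above; CAST(NULL AS STRING) assumes the excluded name column is a
> string and should be adjusted to the column's actual type):
> {code:yaml}
> transform:
>   - source-table: test_db.test_table
>     # Re-add the column dropped by debezium.column.exclude.list as a NULL
>     # literal so the projected schema keeps a consistent column count.
>     projection: \*, CAST(NULL AS STRING) AS name
>     description: project fields and filter
> {code}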
--
This message was sent by Atlassian Jira
(v8.20.10#820010)