[ 
https://issues.apache.org/jira/browse/FLINK-38196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012311#comment-18012311
 ] 

Hongshun Wang commented on FLINK-38196:
---------------------------------------

[~tchivs] We haven't added support for decimal.handling.mode in CDC yet, so the 
type inference still doesn't work.

> IndexOutOfBoundsException when processing PostgreSQL tables with numeric(0) 
> fields in Flink CDC
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38196
>                 URL: https://issues.apache.org/jira/browse/FLINK-38196
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: cdc-3.4.0
>         Environment: - **Database**: PostgreSQL with tables containing 
> `numeric(0)` columns
> - **Connector**: PostgreSQL CDC Connector
> - **Configuration**: Any `debezium.decimal.handling.mode` setting
>            Reporter: tchivs
>            Priority: Major
>         Attachments: image-2025-08-06-16-14-20-192.png
>
>
> h3. Problem Statement
> Flink CDC fails with {{IndexOutOfBoundsException}} when processing PostgreSQL 
> tables containing {{numeric(0)}} fields, particularly when these fields have 
> NULL values. This causes complete pipeline failure during binary decimal data 
> deserialization.
> h3. Error Details
> {{Caused by: java.lang.IndexOutOfBoundsException: pos: -2130706416, length: 
> 48, index: -2130706432, offset: 0
> at org.apache.flink.core.memory.MemorySegment.get(MemorySegment.java:467)
> at 
> org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:131)
> at 
> org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.readDecimalData(BinarySegmentUtils.java:1003)
> at 
> org.apache.flink.cdc.common.data.binary.BinaryRecordData.getDecimal(BinaryRecordData.java:163)
> at 
> org.apache.flink.cdc.common.data.RecordData.lambda$createFieldGetter$7b8ca8ef$1(RecordData.java:195)}}
> h3. Steps to Reproduce
>  # Create a PostgreSQL table with {{numeric(0)}} columns:
> {{CREATE TABLE test_table (
> id SERIAL PRIMARY KEY,
> amount numeric, -- This causes the issue
> name VARCHAR(100)
> );}}
>  # Insert data including NULL values:
> {{INSERT INTO test_table (amount, name) VALUES (NULL, 'test');
> INSERT INTO test_table (amount, name) VALUES (123, 'test2');}}
>  # Configure Flink CDC pipeline:
> {{source:
>   type: postgres
>   hostname: localhost
>   port: 5432
>   username: postgres
>   password: password
>   database-name: testdb
>   schema-name: public
>   table-name: test_table
>   debezium.decimal.handling.mode: string}}
>  # Run the pipeline - it will crash with IndexOutOfBoundsException
> h3. Expected Result
>  * Pipeline should successfully process all rows including those with NULL 
> {{numeric(0)}} values
>  * No exceptions should be thrown during data processing
> h3. Actual Result
>  * Pipeline crashes with {{IndexOutOfBoundsException}}
>  * Processing stops completely, preventing any data from being synchronized
> h3. Root Cause Analysis
>  # {*}Type Mapping Issue{*}: PostgreSQL {{numeric(0)}} fields are incorrectly 
> mapped to {{DECIMAL}} with maximum precision in {{PostgresTypeUtils.java}}
>  # {*}Binary Serialization Problem{*}: High-precision DECIMAL types create 
> complex binary representations that are prone to corruption
>  # {*}Missing Validation{*}: {{BinarySegmentUtils.readDecimalData()}} lacks 
> bounds checking for invalid binary data, allowing negative memory offsets
> h3. Impact Assessment
>  * {*}Severity{*}: Major - Complete pipeline failure
>  * {*}Scope{*}: Affects any PostgreSQL database using {{numeric(0)}} columns
>  * {*}Workaround{*}: None available (users must modify table schema)
> h3. Proposed Solution
>  # Map PostgreSQL {{numeric(0)}} fields to {{BIGINT}} instead of {{DECIMAL}} 
> to avoid complex binary serialization
>  # Add defensive bounds checking in {{BinarySegmentUtils.readDecimalData()}}
>  # Handle edge cases gracefully by returning zero values instead of crashing
> h3. Additional Context
>  * This issue commonly occurs in PostgreSQL databases where {{numeric(0)}} is 
> used to store whole numbers without decimal places
>  * The problem is exacerbated when fields are nullable, as NULL values create 
> invalid binary offset calculations
>  * Different {{debezium.decimal.handling.mode}} settings (string, double, 
> precise) all exhibit the same issue
> h3. Test Case Requirements
>  * Unit tests for {{PostgresTypeUtils}} covering {{numeric(0)}} mapping
>  * Unit tests for {{BinarySegmentUtils}} defensive bounds checking
>  * Integration test with actual PostgreSQL table containing {{numeric(0)}} 
> fields
> h3. Documentation Impact
>  * Update connector documentation to clarify {{numeric(0)}} field handling
>  * Add troubleshooting section for IndexOutOfBoundsException issues
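
For reference, the defensive bounds check proposed in the quoted description 
could look roughly like the sketch below. It is only an illustration, not code 
from Flink CDC: the class DecimalBoundsCheck and its methods are hypothetical, 
and the layout (offset in the upper 32 bits of a long, size in the lower 32 
bits) is an assumption about how the variable-length part is addressed.

```java
/**
 * Hypothetical helper, not part of Flink CDC. It validates an (offset, size)
 * pair decoded from a packed long before any bytes are copied.
 * Assumed layout: offset in the upper 32 bits, size in the lower 32 bits.
 */
final class DecimalBoundsCheck {

    private DecimalBoundsCheck() {}

    /** Upper 32 bits of the packed value: start of the variable-length part. */
    static int offsetOf(long offsetAndSize) {
        return (int) (offsetAndSize >> 32);
    }

    /** Lower 32 bits of the packed value: length of the variable-length part. */
    static int sizeOf(long offsetAndSize) {
        return (int) offsetAndSize;
    }

    /** Returns true only if [offset, offset + size) lies inside the segment. */
    static boolean isInBounds(long offsetAndSize, int segmentSize) {
        int offset = offsetOf(offsetAndSize);
        int size = sizeOf(offsetAndSize);
        // Subtract instead of adding so that offset + size cannot overflow.
        return offset >= 0
                && size >= 0
                && offset <= segmentSize
                && size <= segmentSize - offset;
    }

    /** Fails with a descriptive message instead of reading out of bounds. */
    static void checkInBounds(long offsetAndSize, int segmentSize) {
        if (!isInBounds(offsetAndSize, segmentSize)) {
            throw new IllegalStateException(
                    "Invalid decimal offset/size: offset=" + offsetOf(offsetAndSize)
                            + ", size=" + sizeOf(offsetAndSize)
                            + ", segmentSize=" + segmentSize);
        }
    }
}
```

A check of this kind would reject the negative index shown in the stack trace 
before the copy is attempted. It only makes the failure easier to diagnose; it 
does not address whatever produced the invalid value in the first place.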



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
