lvyanquan opened a new pull request, #4391:
URL: https://github.com/apache/flink-cdc/pull/4391

     
     #### Summary
   
     This commit adds BLOB field support to the Flink CDC Paimon connector, 
enabling efficient storage and handling of large binary data during CDC 
synchronization operations.
   
     #### Key Changes
   
     1. New BlobWriteContext Component
     - Introduced BlobWriteContext class to handle BLOB fields during CDC write 
operations
     - Supports two blob storage modes:
       - Mode 1 (raw data): VARBINARY/BINARY fields → BlobData → written to 
.blob files
       - Mode 2 (descriptor): VARCHAR/STRING fields → BlobRef → only the descriptor 
(uri, offset, length) is stored inline
     - Integrates with Paimon's CoreOptions for blob configuration
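
The mode selection described above can be sketched roughly as follows. This is only an illustration of the mapping stated in this description, not the actual BlobWriteContext code: the class `BlobModeSketch`, the enum `BlobMode`, and the method `modeFor` are hypothetical names, and type names are assumed to be passed as bare root names without length parameters.

```java
import java.util.Locale;

/** Hypothetical sketch of the two blob storage modes; not the connector's real API. */
final class BlobModeSketch {

    /** RAW_DATA = bytes written to .blob files; DESCRIPTOR = (uri, offset, length) stored inline. */
    enum BlobMode { RAW_DATA, DESCRIPTOR }

    /**
     * Picks a mode from the source column's type root name.
     * Assumption: names arrive without parameters, e.g. "VARBINARY" rather than "VARBINARY(1024)".
     */
    static BlobMode modeFor(String sourceType) {
        switch (sourceType.toUpperCase(Locale.ROOT)) {
            case "BINARY":
            case "VARBINARY":
                // Mode 1: the field carries the raw bytes themselves.
                return BlobMode.RAW_DATA;
            case "VARCHAR":
            case "STRING":
                // Mode 2: the field carries a reference to data stored elsewhere.
                return BlobMode.DESCRIPTOR;
            default:
                throw new IllegalArgumentException(
                        "Type cannot back a BLOB field: " + sourceType);
        }
    }
}
```

For example, `BlobModeSketch.modeFor("VARBINARY")` yields `RAW_DATA`, while `BlobModeSketch.modeFor("STRING")` yields `DESCRIPTOR`.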
     
     2. Schema Evolution Support
     - Enhanced SchemaChangeProvider to automatically convert 
VARBINARY/BINARY/VARCHAR/STRING types to BLOB type based on blob-field 
configuration
     - Updated the updateColumnType method to handle BLOB type conversion during 
schema changes
     - Added validation to prevent altering primary key or partition key 
columns to BLOB type
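
A minimal sketch of the conversion and validation rules listed above, under stated assumptions. `BlobConversionSketch` and `resolveType` are hypothetical names and do not reflect the real SchemaChangeProvider signature. Leaving non-convertible types unchanged is also an assumption, since this description does not say how such columns are handled.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of blob-field type conversion; not the connector's real API. */
final class BlobConversionSketch {

    /** Source types the description lists as convertible to BLOB (root names only). */
    static final Set<String> CONVERTIBLE =
            Set.of("BINARY", "VARBINARY", "VARCHAR", "STRING");

    /** Returns the type the column should have in the target table. */
    static String resolveType(
            String column,
            String sourceType,
            Set<String> blobFields,
            List<String> primaryKeys,
            List<String> partitionKeys) {
        if (!blobFields.contains(column)) {
            // Columns not named in the blob configuration keep their type.
            return sourceType;
        }
        if (primaryKeys.contains(column) || partitionKeys.contains(column)) {
            // Mirrors the validation described above: key columns must not become BLOB.
            throw new IllegalArgumentException(
                    "Primary key or partition key column cannot be BLOB: " + column);
        }
        // Assumption: types outside the convertible set are left unchanged.
        return CONVERTIBLE.contains(sourceType) ? "BLOB" : sourceType;
    }
}
```

With `blobFields = {"content"}` and primary key `id`, a `VARBINARY` column `content` resolves to `BLOB`, while listing `id` itself as a blob field is rejected.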
     
     3. Writer Integration
     - Modified PaimonWriterHelper to support blob field handling
     - Updated PaimonRecordEventSerializer for BLOB data serialization
     - Enhanced TableSchemaInfo to track blob field metadata
     
     4. Comprehensive Testing
     - Added PaimonMetadataApplierTest (468 lines of tests)
     - Added PaimonWriterHelperTest for blob write scenarios
     - Added AppendOnlyTableITCase integration tests with test fixtures
   
     #### Configuration Example
   
     ##### Enable blob fields via table options
     blob-field = content, image_data
     blob-descriptor-field = external_file_path
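
One way the comma-separated option values above might be read into a set of column names is sketched below. `BlobOptionSketch` and `parseFieldList` are hypothetical helpers written for illustration; the connector may well parse these options differently, for example through Paimon's CoreOptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper for reading comma-separated field lists from table options. */
final class BlobOptionSketch {

    /** Splits the option value on commas, trims whitespace, and drops empty entries. */
    static Set<String> parseFieldList(Map<String, String> options, String key) {
        String raw = options.getOrDefault(key, "");
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                // LinkedHashSet keeps the order in which the fields were configured.
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }
}
```

Given the options shown above, `parseFieldList(options, "blob-field")` would contain `content` and `image_data`, and a missing key yields an empty set.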
   
     #### JIRA Reference
   
     https://issues.apache.org/jira/browse/FLINK-39567


