norrishuang opened a new pull request, #4314:
URL: https://github.com/apache/flink-cdc/pull/4314

   ## Summary
   
   Add AWS Glue Data Catalog support to the Iceberg pipeline sink connector. Previously, the Iceberg pipeline supported only the `hadoop` and `hive` catalog types. This PR lets users select AWS Glue Data Catalog as the Iceberg catalog by setting `catalog.properties.type: glue`.
   
   ## Changes
   
   ### New Configuration Options
   
   | Option | Type | Description |
   |--------|------|-------------|
   | `catalog.properties.type` | String | Now supports `glue` in addition to `hadoop` and `hive` |
   | `catalog.properties.catalog-impl` | String | Custom catalog implementation class (e.g. `org.apache.iceberg.aws.glue.GlueCatalog`) |
   | `catalog.properties.io-impl` | String | Custom FileIO implementation (e.g. `org.apache.iceberg.aws.s3.S3FileIO`) |
   | `catalog.properties.glue.id` | String | Glue Catalog ID (AWS account ID) for cross-account access |
   | `catalog.properties.glue.skip-archive` | Boolean | Skip archiving older table versions in Glue (default: true) |
   | `catalog.properties.glue.skip-name-validation` | Boolean | Skip name validation for Glue catalog (default: false) |
   | `catalog.properties.client.region` | String | AWS region for the Glue catalog client |
   
   ### Files Modified
   
   - `IcebergDataSinkOptions.java`: Added Glue-related config options, updated `TYPE` and `WAREHOUSE` descriptions
   - `IcebergDataSinkFactory.java`: Registered new optional config options
   - `IcebergDataSinkFactoryTest.java`: Added 2 test cases for Glue catalog creation (via `type=glue` and `catalog-impl`)
   - `pom.xml`: Dependency and shade plugin adjustments
   
   ## Usage Example
   
   ```yaml
   sink:
     type: iceberg
     catalog.properties.type: glue
     catalog.properties.warehouse: s3://my-bucket/warehouse/
     catalog.properties.io-impl: org.apache.iceberg.aws.s3.S3FileIO
     catalog.properties.client.region: us-east-1
   ```
   
   ## How It Works

   Iceberg's `CatalogUtil.buildIcebergCatalog()` natively supports `type=glue` and automatically loads `org.apache.iceberg.aws.glue.GlueCatalog`. This PR exposes the necessary configuration options through the Flink CDC pipeline config layer and ensures the Glue-related catalog properties are correctly passed through via the `catalog.properties.*` prefix.
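
The pass-through described above can be illustrated with a minimal, standalone sketch. It is not the code from this PR: the class name `CatalogPropertiesSketch` and the method `extractCatalogOptions` are hypothetical, and the real factory may collect options differently. The sketch only shows the general idea of stripping the `catalog.properties.` prefix so the remaining keys (`type`, `warehouse`, `io-impl`, `client.region`) form the options map that would be handed to Iceberg.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the PR.
public class CatalogPropertiesSketch {
    static final String PREFIX = "catalog.properties.";

    /** Collects every "catalog.properties.*" entry and strips the prefix. */
    public static Map<String, String> extractCatalogOptions(Map<String, String> sinkConfig) {
        // TreeMap keeps keys sorted so the printed result is stable.
        Map<String, String> catalogOptions = new TreeMap<>();
        for (Map.Entry<String, String> entry : sinkConfig.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PREFIX)) {
                catalogOptions.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return catalogOptions;
    }

    public static void main(String[] args) {
        // Mirrors the YAML usage example from this PR description.
        Map<String, String> sinkConfig = new HashMap<>();
        sinkConfig.put("type", "iceberg");
        sinkConfig.put("catalog.properties.type", "glue");
        sinkConfig.put("catalog.properties.warehouse", "s3://my-bucket/warehouse/");
        sinkConfig.put("catalog.properties.io-impl", "org.apache.iceberg.aws.s3.S3FileIO");
        sinkConfig.put("catalog.properties.client.region", "us-east-1");

        // The sink-level "type: iceberg" entry is dropped; only catalog options remain.
        // In a real job, a map like this would be passed to Iceberg's
        // CatalogUtil.buildIcebergCatalog(name, options, conf).
        System.out.println(extractCatalogOptions(sinkConfig));
    }
}
```

Note that the sink-level `type: iceberg` key and the catalog-level `catalog.properties.type: glue` key do not collide, because only the prefixed entries reach the catalog options map.
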

   ## Testing

   - Unit tests pass (6/6 in `IcebergDataSinkFactoryTest`)
   - Verified end-to-end on Amazon EMR (Flink 1.20, Iceberg 1.10.0-amzn) with MySQL CDC → Iceberg (Glue Catalog + S3)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
