ericzeng created FLINK-39342:
--------------------------------
Summary: [Iceberg] Support hadoop.conf.* prefix to pass Hadoop
configuration properties
Key: FLINK-39342
URL: https://issues.apache.org/jira/browse/FLINK-39342
Project: Flink
Issue Type: New Feature
Components: Flink CDC
Affects Versions: cdc-3.5.0
Reporter: ericzeng
The Iceberg pipeline connector currently relies on classpath-based Hadoop
configuration files (core-site.xml, hdfs-site.xml) to configure Hadoop
settings. There is no way to pass Hadoop configuration properties (e.g., S3
credentials, HDFS endpoint, Kerberos settings) directly through the
connector's job configuration.
*Motivation*
In many deployment environments (containerized, cloud-native, or multi-tenant
clusters), users cannot easily place Hadoop XML config files on the classpath.
They need a way to set Hadoop properties programmatically via the
pipeline job configuration — similar to how catalog.properties.* is already
supported.
*Description*
Add support for a new hadoop.conf.* prefix in the Iceberg pipeline sink
connector. Any property with this prefix will be stripped of the prefix and
applied to the underlying Hadoop Configuration object before it is passed to
CatalogUtil.buildIcebergCatalog(), IcebergWriter, IcebergCommitter, and
CompactionOperator.
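The prefix-stripping step described above can be sketched as follows. This is a minimal illustration, not the connector's actual code: the class and method names here are hypothetical, and it collects the stripped options into a plain Map, whereas the real implementation would apply each entry to an org.apache.hadoop.conf.Configuration via conf.set(key, value) before handing it to the catalog and operators.

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopConfPrefixDemo {

    // Hypothetical constant for the proposed option prefix.
    private static final String HADOOP_CONF_PREFIX = "hadoop.conf.";

    // Select options carrying the prefix, strip it, and return the
    // remaining key/value pairs destined for the Hadoop Configuration.
    static Map<String, String> extractHadoopOptions(Map<String, String> sinkOptions) {
        Map<String, String> hadoopOptions = new HashMap<>();
        for (Map.Entry<String, String> entry : sinkOptions.entrySet()) {
            if (entry.getKey().startsWith(HADOOP_CONF_PREFIX)) {
                String strippedKey = entry.getKey().substring(HADOOP_CONF_PREFIX.length());
                hadoopOptions.put(strippedKey, entry.getValue());
            }
        }
        return hadoopOptions;
    }

    public static void main(String[] args) {
        Map<String, String> sinkOptions = new HashMap<>();
        sinkOptions.put("catalog.properties.type", "hadoop");
        sinkOptions.put("hadoop.conf.fs.s3a.endpoint", "s3.us-east-1.amazonaws.com");
        // Only the prefixed entry is kept, with the prefix removed:
        // fs.s3a.endpoint=s3.us-east-1.amazonaws.com
        System.out.println(extractHadoopOptions(sinkOptions));
    }
}
```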
*Example usage:*
sink:
  type: iceberg
  catalog.properties.type: hadoop
  catalog.properties.warehouse: s3a://my-bucket/warehouse
  hadoop.conf.fs.s3a.access.key: xxxxxx
  hadoop.conf.fs.s3a.secret.key: xxx
  hadoop.conf.fs.s3a.endpoint: s3.us-east-1.amazonaws.com
--
This message was sent by Atlassian Jira
(v8.20.10#820010)