ericzeng created FLINK-39342:
--------------------------------

             Summary:  [Iceberg] Support hadoop.conf.* prefix to pass Hadoop 
configuration properties
                 Key: FLINK-39342
                 URL: https://issues.apache.org/jira/browse/FLINK-39342
             Project: Flink
          Issue Type: New Feature
          Components: Flink CDC
    Affects Versions: cdc-3.5.0
            Reporter: ericzeng


The Iceberg pipeline connector currently relies on classpath-based Hadoop 
configuration files (core-site.xml, hdfs-site.xml) to configure Hadoop 
settings. There is no way to pass Hadoop configuration properties (e.g., S3 
credentials, HDFS endpoint, Kerberos settings) directly through the 
connector's job configuration.

*Motivation*

In many deployment environments (containerized, cloud-native, or multi-tenant 
clusters), users cannot easily place Hadoop XML config files on the classpath. 
They need a way to set Hadoop properties programmatically via the pipeline job 
configuration, similar to how catalog.properties.* is already supported.

*Description*

Add support for a new hadoop.conf.* prefix in the Iceberg pipeline sink 
connector. Any property with this prefix will be stripped of the prefix and 
applied to the underlying Hadoop Configuration object before it is passed to 
CatalogUtil.buildIcebergCatalog(), IcebergWriter, IcebergCommitter, and 
CompactionOperator. For example, hadoop.conf.fs.s3a.endpoint would be applied 
as the Hadoop property fs.s3a.endpoint.
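
The prefix handling described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class HadoopConfPrefixSketch and the method extractHadoopConf are hypothetical names, not existing Flink CDC code, and the sketch returns a plain map so it runs without Hadoop on the classpath. In the connector, each resulting entry would presumably be applied with org.apache.hadoop.conf.Configuration#set(String, String).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Flink CDC code base. */
public final class HadoopConfPrefixSketch {

    // Prefix proposed in this issue.
    static final String HADOOP_CONF_PREFIX = "hadoop.conf.";

    /** Returns the options that carry the prefix, with the prefix removed. */
    static Map<String, String> extractHadoopConf(Map<String, String> sinkOptions) {
        Map<String, String> hadoopProps = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : sinkOptions.entrySet()) {
            String key = entry.getKey();
            // Skip unrelated options and a bare "hadoop.conf." with no property name.
            if (key.startsWith(HADOOP_CONF_PREFIX)
                    && key.length() > HADOOP_CONF_PREFIX.length()) {
                hadoopProps.put(
                        key.substring(HADOOP_CONF_PREFIX.length()), entry.getValue());
            }
        }
        return hadoopProps;
    }

    public static void main(String[] args) {
        Map<String, String> sinkOptions = new LinkedHashMap<>();
        sinkOptions.put("type", "iceberg");
        sinkOptions.put("catalog.properties.warehouse", "s3a://my-bucket/warehouse");
        sinkOptions.put("hadoop.conf.fs.s3a.endpoint", "s3.us-east-1.amazonaws.com");

        // Assumption: the connector would call Configuration#set(key, value)
        // for each entry here instead of printing it.
        extractHadoopConf(sinkOptions)
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Options without the prefix (type, catalog.properties.*) are left untouched, so the existing catalog property handling would not be affected by this step.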

 

*Example usage:*

  sink:
    type: iceberg
    catalog.properties.type: hadoop
    catalog.properties.warehouse: s3a://my-bucket/warehouse
    hadoop.conf.fs.s3a.access.key: xxxxxx
    hadoop.conf.fs.s3a.secret.key: xxx
    hadoop.conf.fs.s3a.endpoint: s3.us-east-1.amazonaws.com



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
