tchivs opened a new issue, #6654:
URL: https://github.com/apache/paimon/issues/6654

   ## Description
   
   `CatalogContext` currently has a hard dependency on Hadoop's `Configuration` class, causing `NoClassDefFoundError` in environments where Hadoop is not needed or not available.
   
   ## Use Cases Affected
   
   ### 1. Trino-Paimon Connector
   The Trino connector for Paimon doesn't require Hadoop dependencies in many scenarios, but it currently fails because of the hard dependency in `CatalogContext`.
   
   ### 2. Windows Development Environment
   When using Paimon on Windows (e.g., Flink CDC with Paimon sink to MinIO S3), the application fails with:
   
   ```
   Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
       at org.apache.paimon.catalog.CatalogContext.<init>(CatalogContext.java:53)
       at org.apache.paimon.catalog.CatalogContext.create(CatalogContext.java:73)
       at org.apache.paimon.flink.FlinkCatalogFactory.createPaimonCatalog(FlinkCatalogFactory.java:81)
       at org.apache.flink.cdc.connectors.paimon.sink.v2.bucket.BucketAssignOperator.open(BucketAssignOperator.java:103)
       ...
   Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
   ```
   
   ### 3. Lightweight Deployment Scenarios
   - Local FileIO usage shouldn't require Hadoop
   - Cloud-native deployments using native S3/OSS clients
   - Embedded/lightweight environments
   
   ## Problem
   
   The current `CatalogContext` implementation:
   ```java
   public class CatalogContext {
       private final Configuration hadoopConf; // Always requires Hadoop
       
       private CatalogContext(..., Configuration hadoopConf) {
           this.hadoopConf = hadoopConf; // Hard dependency
       }
   }
   ```
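
To illustrate what the hard dependency means at runtime: any code path that touches `org.apache.hadoop.conf.Configuration` needs that class on the classpath. The following is a minimal sketch of a classpath probe that a caller could run before touching any Hadoop type. `ClasspathProbe` and `isPresent` are hypothetical names used for illustration only and are not part of Paimon.

```java
/** Hypothetical helper, not part of Paimon: reports whether a class can be located. */
final class ClasspathProbe {
    static final String HADOOP_CONF = "org.apache.hadoop.conf.Configuration";

    /** Returns true if the named class can be found; the class is not initialized. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hadoop Configuration on classpath: " + isPresent(HADOOP_CONF));
    }
}
```

A probe like this only detects the missing class. It does not remove the dependency, because `CatalogContext` still declares a `Configuration` field.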
   
   ## Proposed Solution
   
   Introduce interface segregation:
   1. `ICatalogContext` - Core interface without Hadoop dependency
   2. `HadoopAware` - Optional interface for Hadoop functionality
   3. `CatalogHadoopContext` - Implementation with Hadoop support
   
   This allows components to work in Hadoop-free environments while maintaining backward compatibility.
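
The three types listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the method names (`options`, `hadoopConf`), the Hadoop-free class `PlainCatalogContext`, and the `describe` helper are hypothetical. `StubHadoopConf` is a stand-in for `org.apache.hadoop.conf.Configuration` so that the sketch compiles without Hadoop; a real `HadoopAware` would presumably return the Hadoop type itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Core interface: no Hadoop type appears in any signature. */
interface ICatalogContext {
    Map<String, String> options();
}

/** Stand-in for org.apache.hadoop.conf.Configuration (assumption, for a Hadoop-free sketch). */
final class StubHadoopConf {
    final Map<String, String> entries = new HashMap<>();
}

/** Optional capability, implemented only by contexts that carry Hadoop settings. */
interface HadoopAware {
    StubHadoopConf hadoopConf();
}

/** Hadoop-free context (hypothetical name). */
final class PlainCatalogContext implements ICatalogContext {
    private final Map<String, String> options;

    PlainCatalogContext(Map<String, String> options) {
        this.options = Collections.unmodifiableMap(new HashMap<>(options));
    }

    @Override
    public Map<String, String> options() {
        return options;
    }
}

/** Context with Hadoop support: implements both interfaces. */
final class CatalogHadoopContext implements ICatalogContext, HadoopAware {
    private final Map<String, String> options;
    private final StubHadoopConf hadoopConf;

    CatalogHadoopContext(Map<String, String> options, StubHadoopConf hadoopConf) {
        this.options = Collections.unmodifiableMap(new HashMap<>(options));
        this.hadoopConf = hadoopConf;
    }

    @Override
    public Map<String, String> options() {
        return options;
    }

    @Override
    public StubHadoopConf hadoopConf() {
        return hadoopConf;
    }
}

final class ContextDemo {
    /** Callers that need Hadoop check for the capability instead of assuming it. */
    static String describe(ICatalogContext ctx) {
        if (ctx instanceof HadoopAware) {
            StubHadoopConf conf = ((HadoopAware) ctx).hadoopConf();
            return "hadoop-aware, hadoopEntries=" + conf.entries.size();
        }
        return "hadoop-free, options=" + ctx.options().size();
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("warehouse", "s3://bucket/warehouse");
        System.out.println(describe(new PlainCatalogContext(opts)));
        System.out.println(describe(new CatalogHadoopContext(opts, new StubHadoopConf())));
    }
}
```

With this shape, code that only needs catalog options depends on `ICatalogContext` alone, and only code that checks for `HadoopAware` ever touches a Hadoop type.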
   
   ## Impact
   
   - Enables Trino-Paimon connector usage without Hadoop
   - Fixes Windows development environment issues
   - Reduces dependency footprint for cloud-native deployments
   - Improves overall architecture (interface segregation principle)

