wsjz commented on code in PR #21238:
URL: https://github.com/apache/doris/pull/21238#discussion_r1246169572
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/dlf/DLFCatalog.java:
##########
@@ -38,4 +47,26 @@ protected TableOperations newTableOps(TableIdentifier tableIdentifier) {
         String tableName = tableIdentifier.name();
         return new DLFTableOperations(this.conf, this.clients, this.fileIO, this.uid, dbName, tableName);
     }
+
+    protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
+        // read from converted properties or default by old s3 aws properties
+        String endpoint = properties.getOrDefault(Constants.ENDPOINT_KEY, properties.get(S3Properties.Env.ENDPOINT));
+        CloudCredential credential = new CloudCredential();
+        credential.setAccessKey(properties.getOrDefault(OssProperties.ACCESS_KEY,
+                properties.get(S3Properties.Env.ACCESS_KEY)));
+        credential.setSecretKey(properties.getOrDefault(OssProperties.SECRET_KEY,
+                properties.get(S3Properties.Env.SECRET_KEY)));
+        if (properties.containsKey(OssProperties.SESSION_TOKEN)
+                || properties.containsKey(S3Properties.Env.TOKEN)) {
+            credential.setSessionToken(properties.getOrDefault(OssProperties.SESSION_TOKEN,
+                    properties.get(S3Properties.Env.TOKEN)));
+        }
+        String region = properties.getOrDefault(OssProperties.REGION, properties.get(S3Properties.Env.REGION));
+        // s3 file io just supports s3-like endpoint
+        String s3Endpoint = endpoint.replace(region, "s3." + region);
+        URI endpointUri = URI.create(s3Endpoint);
+        FileIO io = new S3FileIO(() -> S3Util.buildS3Client(endpointUri, region, credential));
Review Comment:
I find that S3FileIO is faster than HadoopFileIO here.
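
For context (not part of the PR), here is a minimal sketch of what the endpoint rewrite above does, with hypothetical OSS values standing in for the ones read from the catalog properties:

```java
import java.net.URI;

public class EndpointRewriteSketch {
    public static void main(String[] args) {
        // Hypothetical values for illustration; the real ones come from the
        // catalog properties (OssProperties / S3Properties) read above.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        String region = "oss-cn-hangzhou";

        // Same rewrite as in DLFCatalog: prefix the region segment with "s3."
        // so the URI points at the S3-compatible OSS gateway that S3FileIO expects.
        String s3Endpoint = endpoint.replace(region, "s3." + region);
        URI endpointUri = URI.create(s3Endpoint);

        System.out.println(endpointUri); // https://s3.oss-cn-hangzhou.aliyuncs.com
    }
}
```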
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/HiveCompatibleCatalog.java:
##########
@@ -57,7 +57,7 @@ public void initialize(String name, FileIO fileIO,
     protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
         String fileIOImpl = properties.get(CatalogProperties.FILE_IO_IMPL);
         if (fileIOImpl == null) {
-            FileIO io = new S3FileIO();
+            FileIO io = new HadoopFileIO(hadoopConf);
Review Comment:
S3FileIO needs some custom configuration, so the superclass uses HadoopFileIO by default. We can add better implementations in derived classes, just like the DLF catalog does.
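
A minimal sketch of that layering (the class names, property keys, and the direct AWS SDK v2 client construction below are simplifications of mine, standing in for the PR's S3Util.buildS3Client and the Doris property classes):

```java
import java.net.URI;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.CatalogProperties;
import org.apache.iceberg.CatalogUtil;
import org.apache.iceberg.aws.s3.S3FileIO;
import org.apache.iceberg.hadoop.HadoopFileIO;
import org.apache.iceberg.io.FileIO;

import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

// Simplified sketch of the layering described above; the real classes carry more state.
abstract class BaseCatalogSketch {
    // Superclass default: HadoopFileIO works wherever a Hadoop FileSystem does,
    // so it is the safe fallback when no FILE_IO_IMPL is configured.
    protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
        String fileIOImpl = properties.get(CatalogProperties.FILE_IO_IMPL);
        if (fileIOImpl == null) {
            return new HadoopFileIO(hadoopConf);
        }
        return CatalogUtil.loadFileIO(fileIOImpl, properties, hadoopConf);
    }
}

class DlfCatalogSketch extends BaseCatalogSketch {
    @Override
    protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
        // The derived class knows how to reach the S3-compatible endpoint and
        // which credentials to use, so it can opt into the faster S3FileIO.
        // Hypothetical property keys, standing in for OssProperties / S3Properties.
        URI endpointUri = URI.create(properties.get("oss.endpoint"));
        String region = properties.get("oss.region");
        AwsBasicCredentials creds = AwsBasicCredentials.create(
                properties.get("oss.access_key"), properties.get("oss.secret_key"));
        return new S3FileIO(() -> S3Client.builder()
                .endpointOverride(endpointUri)
                .region(Region.of(region))
                .credentialsProvider(StaticCredentialsProvider.create(creds))
                .build());
    }
}
```

Keeping the fallback in the superclass means every catalog still works out of the box, while catalogs that know their storage layout can opt into the faster path.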