alamb commented on code in PR #5732:
URL: https://github.com/apache/arrow-datafusion/pull/5732#discussion_r1151134088


##########
datafusion/core/Cargo.toml:
##########
@@ -85,7 +81,7 @@ lazy_static = { version = "^1.4.0" }
 log = "^0.4"
 num-traits = { version = "0.2", optional = true }
 num_cpus = "1.13.0"
-object_store = "0.5.4"
+object_store = { version = "0.5.4", features = ["aws"] }

Review Comment:
   > I think the question here is whether we enable access to cloud storage by 
default for datafusion. If yes, it is the same whether to put the options in 
datafusion or in object_store. If not, I can make this as an optional feature 
for datafusion/core, and turn it on by default in the datafusion-cli.
   
   I think the desire has been to keep ObjectStore specific dependencies out of 
DataFusion core  (though it would be good to document that somewhere). 
DataFusion is already a large codebase to begin with and given it has hooks to 
extend the functionality I think we should be leveraging the extension points. 
   
   
   
   



##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -138,6 +142,73 @@ impl TableProviderFactory for ListingTableFactory {
             .with_file_sort_order(file_sort_order);
 
         let table_path = ListingTableUrl::parse(&cmd.location)?;
+
+        // try obtaining all relevant information of object store from 
cmd.options
+        match table_path.scheme() {
+            "s3" => {
+                let url: &Url = table_path.as_ref();
+                let bucket_name = url
+                    .host_str()
+                    .ok_or(DataFusionError::External("invaild bucket 
name".into()))?;
+                let mut builder =
+                    AmazonS3Builder::from_env().with_bucket_name(bucket_name);
+
+                if let (Some(access_key_id), Some(secret_access_key)) = (
+                    cmd.options.get("access_key_id"),
+                    cmd.options.get("secret_access_key"),
+                ) {
+                    builder = builder
+                        .with_access_key_id(access_key_id)
+                        .with_secret_access_key(secret_access_key);
+                }
+
+                if let Some(session_token) = cmd.options.get("session_token") {
+                    builder = builder.with_token(session_token);
+                }
+
+                if let Some(region) = cmd.options.get("region") {
+                    builder = builder.with_region(region);
+                }
+
+                let store = Arc::new(builder.build()?);
+
+                state
+                    .runtime_env()
+                    .register_object_store(table_path.as_ref(), store);

Review Comment:
   I wonder if you can move this code for registering the object store 
dynamically out of core and into datafusion-cli ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to