alamb commented on code in PR #5732:
URL: https://github.com/apache/arrow-datafusion/pull/5732#discussion_r1153261483


##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -138,6 +142,73 @@ impl TableProviderFactory for ListingTableFactory {
             .with_file_sort_order(file_sort_order);
 
         let table_path = ListingTableUrl::parse(&cmd.location)?;
+
+        // try obtaining all relevant information of object store from 
cmd.options
+        match table_path.scheme() {
+            "s3" => {
+                let url: &Url = table_path.as_ref();
+                let bucket_name = url
+                    .host_str()
+                    .ok_or(DataFusionError::External("invaild bucket 
name".into()))?;
+                let mut builder =
+                    AmazonS3Builder::from_env().with_bucket_name(bucket_name);
+
+                if let (Some(access_key_id), Some(secret_access_key)) = (
+                    cmd.options.get("access_key_id"),
+                    cmd.options.get("secret_access_key"),
+                ) {
+                    builder = builder
+                        .with_access_key_id(access_key_id)
+                        .with_secret_access_key(secret_access_key);
+                }
+
+                if let Some(session_token) = cmd.options.get("session_token") {
+                    builder = builder.with_token(session_token);
+                }
+
+                if let Some(region) = cmd.options.get("region") {
+                    builder = builder.with_region(region);
+                }
+
+                let store = Arc::new(builder.build()?);
+
+                state
+                    .runtime_env()
+                    .register_object_store(table_path.as_ref(), store);

Review Comment:
   I think 
https://github.com/r4ntix/arrow-datafusion/commit/f9880465b3f2c00164422177d78622fcf7d334d0
 looks good too me -- if you want to get rid of the overhead of calling `sql` 
twice you could perhaps allow the dispatch code on SessionContext diectly.
   
   Like maybe add a 
   ```rust
   impl SessionContext {
       pub async fn sql(&self, sql: &str) -> Result<DataFrame> {
           // create a query planner
           let plan = self.state().create_logical_plan(sql).await?;
           self.execute_logical_plan(plan)?
   }
   
     // Execute a LogicalPlan (including command such as CreateExternalTable, 
etc)
     fn execute_logical_plan(&self, plan: LogicalPlan) -> Result<DataFrame> {
           match plan {
            ...
      }
   ```
   
   
https://docs.rs/datafusion/latest/src/datafusion/execution/context.rs.html#318-320



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to