yahoNanJing commented on code in PR #2906:
URL: https://github.com/apache/arrow-datafusion/pull/2906#discussion_r922758281


##########
datafusion/core/src/datasource/object_store.rs:
##########
@@ -132,19 +149,51 @@ impl ObjectStoreRegistry {
     ///
     /// - URL with scheme `file:///` or no schema will return the default LocalFS store
     /// - URL with scheme `s3://bucket/` will return the S3 store if it's registered
+    /// - URL with scheme `hdfs://hostname:port` will return the hdfs store if it's registered
     ///
     pub fn get_by_url(&self, url: impl AsRef<Url>) -> Result<Arc<dyn ObjectStore>> {
         let url = url.as_ref();
-        let s = &url[url::Position::BeforeScheme..url::Position::AfterHost];
-        let stores = self.object_stores.read();
-        let store = stores.get(s).ok_or_else(|| {
-            DataFusionError::Internal(format!(
-                "No suitable object store found for {}",
-                url
-            ))
-        })?;
-
-        Ok(store.clone())
+        // First check whether can get object store from registry
+        let store = {
+            let stores = self.object_stores.read();
+            let s = &url[url::Position::BeforeScheme..url::Position::BeforeHost];

Review Comment:
   In general, we should not store users' credential info in memory. Handling it securely involves many concerns:
   - The expiration mechanism
   - The encoding algorithm
   - The encryption algorithm
   - ...
   
   However, this PR is not the place to deal with that. If anyone is interested, we should open a separate issue. For HDFS, we rely on Kerberos to handle security. I know little about S3, but my intuition is that it's not safe to put the credential info in each request, even if it's base64 encoded, since base64 is an encoding rather than encryption.
   
   For this PR, it's OK to simply use *url::Position::BeforeScheme..url::Position::BeforePath* to extract the key for a specific object store. I'll submit a follow-up commit to fix it.
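
To illustrate what that key would look like, here is a rough, std-only sketch. It does not use the `url` crate; `store_key` is a hypothetical helper that only approximates slicing a parsed URL from `BeforeScheme` to `BeforePath`, and it skips the normalization a real URL parser performs.

```rust
/// Hypothetical helper (not part of DataFusion or the `url` crate):
/// returns the `scheme://authority` prefix of `url`, i.e. everything
/// before the path. Returns `None` when the input has no `://`.
fn store_key(url: &str) -> Option<&str> {
    // Locate the end of the scheme; the authority starts right after "://".
    let authority_start = url.find("://")? + 3;
    // The authority ends at the first '/', '?' or '#', or at the end of the input.
    let authority_len = url[authority_start..]
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(url.len() - authority_start);
    // Keep scheme, userinfo (if any), host and port; drop the path and the rest.
    Some(&url[..authority_start + authority_len])
}
```

Under these assumptions, `hdfs://localhost:8020/data/a.parquet` maps to the key `hdfs://localhost:8020`, so two HDFS clusters on the same host but different ports get distinct entries. Note that any `user:password@` part stays in the key, which is the credential concern raised above and is left for a separate issue.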



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
