This is an automated email from the ASF dual-hosted git repository.

liurenjie1024 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg-rust.git


The following commit(s) were added to refs/heads/main by this push:
     new f7bffb1  Support identifier warehouses (#308)
f7bffb1 is described below

commit f7bffb12dbf1bd25125aee58fd1518c26112956f
Author: Fokko Driesprong <[email protected]>
AuthorDate: Fri Apr 5 08:12:26 2024 +0200

    Support identifier warehouses (#308)
    
    * Support identifier warehouses
    
    This is a bit confusing if you come from a Hive background
    where the warehouse is always a path to hdfs/s3/etc.
    
    With the REST catalog, the warehouse can also be a logical
    identifier:
    
    https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L72-L78
    
    This means that we have to make sure that we only parse paths
    that are an actual path, and not an identifier.
    
    I'm open to suggestions. The check is now very simple, but can
    be extended for example using a regex. But I'm not sure what
    the implications are of importing additional packages (in Python
    you want to keep it as lightweight as possible).
    
    * Use `if Url::parse().is_ok()`
---
 crates/catalog/rest/src/catalog.rs | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/crates/catalog/rest/src/catalog.rs b/crates/catalog/rest/src/catalog.rs
index e7c773c..efea9cb 100644
--- a/crates/catalog/rest/src/catalog.rs
+++ b/crates/catalog/rest/src/catalog.rs
@@ -23,7 +23,7 @@ use std::str::FromStr;
 use async_trait::async_trait;
 use itertools::Itertools;
 use reqwest::header::{self, HeaderMap, HeaderName, HeaderValue};
-use reqwest::{Client, Request, Response, StatusCode};
+use reqwest::{Client, Request, Response, StatusCode, Url};
 use serde::de::DeserializeOwned;
 use typed_builder::TypedBuilder;
 use urlencoding::encode;
@@ -650,7 +650,15 @@ impl RestCatalog {
             props.extend(config);
         }
 
-        let file_io = match self.config.warehouse.as_deref().or(metadata_location) {
+        // If the warehouse is a logical identifier instead of a URL we don't want
+        // to raise an exception
+        let warehouse_path = match self.config.warehouse.as_deref() {
+            Some(url) if Url::parse(url).is_ok() => Some(url),
+            Some(_) => None,
+            None => None,
+        };
+
+        let file_io = match warehouse_path.or(metadata_location) {
             Some(url) => FileIO::from_path(url)?.with_props(props).build()?,
             None => {
                 return Err(Error::new(
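
The fallback introduced by this change can be sketched in isolation. The following is a minimal, standard-library-only illustration, not the code from the commit: `looks_like_url` and `resolve_location` are hypothetical names, and `looks_like_url` is a simplified stand-in for `Url::parse(..).is_ok()` that only accepts strings of the form `<scheme>://...`. The real parser applies different rules, so treat this purely as a way to see how an identifier warehouse falls through to the metadata location.

```rust
/// Hypothetical, simplified stand-in for `Url::parse(..).is_ok()`.
/// Assumption: a usable warehouse location has the form `<scheme>://...`,
/// where the scheme starts with an ASCII letter and continues with
/// letters, digits, `+`, `-` or `.`.
fn looks_like_url(s: &str) -> bool {
    match s.split_once("://") {
        Some((scheme, _rest)) => {
            let mut chars = scheme.chars();
            // First character must be a letter; an empty scheme is rejected.
            matches!(chars.next(), Some(c) if c.is_ascii_alphabetic())
                && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}

/// Hypothetical helper mirroring the selection logic in the diff:
/// use the warehouse only when it looks like a URL, otherwise fall
/// back to the table's metadata location.
fn resolve_location<'a>(
    warehouse: Option<&'a str>,
    metadata_location: Option<&'a str>,
) -> Option<&'a str> {
    warehouse
        .filter(|w| looks_like_url(w))
        .or(metadata_location)
}
```

With this sketch, a warehouse such as `s3://bucket/warehouse` is used directly, while a logical identifier such as `my_warehouse` is skipped in favor of the metadata location. When neither is usable the helper returns `None`, which corresponds to the error branch in the diff.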
