This is an automated email from the ASF dual-hosted git repository.
liurenjie1024 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg-rust.git
The following commit(s) were added to refs/heads/main by this push:
new f7bffb1 Support identifier warehouses (#308)
f7bffb1 is described below
commit f7bffb12dbf1bd25125aee58fd1518c26112956f
Author: Fokko Driesprong <[email protected]>
AuthorDate: Fri Apr 5 08:12:26 2024 +0200
Support identifier warehouses (#308)
* Support identifier warehouses
This is a bit confusing if you come from a Hive background
where the warehouse is always a path to hdfs/s3/etc.
With the REST catalog, the warehouse can also be a logical
identifier:
https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L72-L78
This means that we have to make sure that we only parse paths
that are an actual path, and not an identifier.
I'm open to suggestions. The check is now very simple, but can
be extended for example using a regex. But I'm not sure what
the implications are of importing additional packages (in Python
you want to keep it as lightweight as possible).
* Use `if Url::parse().is_ok()`
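
For illustration only, the fallback described above can be sketched with the standard library alone. This is not the code from the commit: `looks_like_url` is a hypothetical, much simpler stand-in for `Url::parse(..).is_ok()` that only accepts `<scheme>://...`, and `pick_location` is a hypothetical helper name. The sketch assumes that an identifier-style warehouse should be ignored in favour of the metadata location.

```rust
/// Hypothetical stand-in for `Url::parse(s).is_ok()`: accepts only strings of
/// the form `<scheme>://...`. A real URL parser is considerably more thorough.
fn looks_like_url(s: &str) -> bool {
    match s.split_once("://") {
        Some((scheme, _)) => {
            let mut chars = scheme.chars();
            // A scheme starts with a letter, followed by letters, digits, '+', '-' or '.'.
            matches!(chars.next(), Some(c) if c.is_ascii_alphabetic())
                && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}

/// Hypothetical helper: use the warehouse only if it looks like a URL,
/// otherwise fall back to the metadata location (if there is one).
fn pick_location<'a>(
    warehouse: Option<&'a str>,
    metadata_location: Option<&'a str>,
) -> Option<&'a str> {
    warehouse.filter(|w| looks_like_url(w)).or(metadata_location)
}

fn main() {
    let metadata = Some("s3://bucket/table/metadata.json");
    // A path-like warehouse is used directly.
    println!("{:?}", pick_location(Some("s3://bucket/warehouse"), metadata));
    // A logical identifier is skipped; the metadata location is used instead.
    println!("{:?}", pick_location(Some("my_warehouse"), metadata));
}
```

`Option::filter` followed by `or` expresses the same three-arm `match` in one line; which form reads better is a matter of taste.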
---
crates/catalog/rest/src/catalog.rs | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/crates/catalog/rest/src/catalog.rs b/crates/catalog/rest/src/catalog.rs
index e7c773c..efea9cb 100644
--- a/crates/catalog/rest/src/catalog.rs
+++ b/crates/catalog/rest/src/catalog.rs
@@ -23,7 +23,7 @@ use std::str::FromStr;
use async_trait::async_trait;
use itertools::Itertools;
use reqwest::header::{self, HeaderMap, HeaderName, HeaderValue};
-use reqwest::{Client, Request, Response, StatusCode};
+use reqwest::{Client, Request, Response, StatusCode, Url};
use serde::de::DeserializeOwned;
use typed_builder::TypedBuilder;
use urlencoding::encode;
@@ -650,7 +650,15 @@ impl RestCatalog {
props.extend(config);
}
- let file_io = match self.config.warehouse.as_deref().or(metadata_location) {
+ // If the warehouse is a logical identifier instead of a URL we don't want
+ // to raise an exception
+ let warehouse_path = match self.config.warehouse.as_deref() {
+ Some(url) if Url::parse(url).is_ok() => Some(url),
+ Some(_) => None,
+ None => None,
+ };
+
+ let file_io = match warehouse_path.or(metadata_location) {
Some(url) => FileIO::from_path(url)?.with_props(props).build()?,
None => {
return Err(Error::new(