luoyuxia commented on code in PR #192:
URL: https://github.com/apache/paimon-rust/pull/192#discussion_r3032729061
##########
crates/paimon/src/io/file_io.rs:
##########
@@ -65,11 +65,13 @@ impl FileIO {
})?;
Ok(FileIOBuilder::new(url.scheme()))
}
+}
+// Implement FileIOProvider for FileIO
+use crate::io::FileIOProvider;
- /// Create a new input file to read data.
- ///
- /// Reference:
<https://github.com/apache/paimon/blob/release-0.8.2/paimon-common/src/main/java/org/apache/paimon/fs/FileIO.java#L76>
- pub fn new_input(&self, path: &str) -> crate::Result<InputFile> {
+#[async_trait::async_trait]
+impl FileIOProvider for FileIO {
Review Comment:
I'm wondering whether we can refactor as this:
1: Make FileIO a trait:
```
pub trait FileIO: Send + Sync + Debug {
async fn new_input(&self, path: &str) -> Result<InputFile>;
async fn new_output(&self, path: &str) -> Result<OutputFile>;
async fn get_status(&self, path: &str) -> Result<FileStatus>;
async fn list_status(&self, path: &str) -> Result<Vec<FileStatus>>;
async fn exists(&self, path: &str) -> Result<bool>;
async fn delete_file(&self, path: &str) -> Result<()>;
async fn delete_dir(&self, path: &str) -> Result<()>;
async fn mkdirs(&self, path: &str) -> Result<()>;
async fn rename(&self, src: &str, dst: &str) -> Result<()>;
}
```
2: Have a default/static file io
```
#[derive(Clone, Debug)]
pub struct DefaultFileIO {
storage: Arc<Storage>,
}
impl DefaultFileIO {
pub(crate) fn new(storage: Storage) -> Self {
Self {
storage: Arc::new(storage),
}
}
}
#[async_trait::async_trait]
impl FileIO for DefaultFileIO {
async fn new_input(&self, path: &str) -> Result<InputFile> { ... }
async fn new_output(&self, path: &str) -> Result<OutputFile> { ... }
async fn get_status(&self, path: &str) -> Result<FileStatus> { ... }
async fn list_status(&self, path: &str) -> Result<Vec<FileStatus>> {
... }
async fn exists(&self, path: &str) -> Result<bool> { ... }
async fn delete_file(&self, path: &str) -> Result<()> { ... }
async fn delete_dir(&self, path: &str) -> Result<()> { ... }
async fn mkdirs(&self, path: &str) -> Result<()> { ... }
async fn rename(&self, src: &str, dst: &str) -> Result<()> { ... }
}
```
3: Have a RESTTokenFileIO file io which will load file io dynamically
according to token refresh
```
#[derive(Debug)]
pub struct RESTTokenFileIO {
identifier: Identifier,
path: String,
catalog_options: Options,
api: OnceCell<RESTApi>,
token: RwLock<Option<RESTToken>>,
refresh_lock: Mutex<()>,
}
imple RESTTokenFileIO {
// in here, we can load file io dynamically
}
impl FileIO for RESTTokenFileIO {
// todo
}
```
4: optional:
pub type FileIORef = Arc<dyn FileIO>;
in rust, we tend to type `Arc<dyn xxx>`as `xxxRef`. It's optinal, but may
make code simple.
Since it's a little big of change, maybe we need to reach consensus with
@JingsongLi before move on.
##########
crates/paimon/src/io/file_io.rs:
##########
@@ -113,16 +109,9 @@ impl FileIO {
})
}
- /// List the statuses of the files/directories in the given path if the
path is a directory.
- ///
- /// References:
<https://github.com/apache/paimon/blob/release-0.8.2/paimon-common/src/main/java/org/apache/paimon/fs/FileIO.java#L105>
- ///
- /// FIXME: how to handle large dir? Better to return a stream instead?
- pub async fn list_status(&self, path: &str) -> Result<Vec<FileStatus>> {
+ async fn list_status(&self, path: &str) -> Result<Vec<FileStatus>> {
let (op, relative_path) = self.storage.create(path)?;
let base_path = &path[..path.len() - relative_path.len()];
- // Opendal list() expects directory path to end with `/`.
Review Comment:
why remve the comments? I think the comments are usefull.
The same for others.
##########
crates/paimon/src/io/file_io.rs:
##########
@@ -65,11 +65,13 @@ impl FileIO {
})?;
Ok(FileIOBuilder::new(url.scheme()))
}
+}
+// Implement FileIOProvider for FileIO
+use crate::io::FileIOProvider;
- /// Create a new input file to read data.
- ///
- /// Reference:
<https://github.com/apache/paimon/blob/release-0.8.2/paimon-common/src/main/java/org/apache/paimon/fs/FileIO.java#L76>
- pub fn new_input(&self, path: &str) -> crate::Result<InputFile> {
+#[async_trait::async_trait]
+impl FileIOProvider for FileIO {
Review Comment:
It's strange to me that FileIO implement FileIO**Provider**..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]