Xuanwo commented on code in PR #5835:
URL: https://github.com/apache/arrow-rs/pull/5835#discussion_r1633269842


##########
object_store/src/buffered.rs:
##########
@@ -400,6 +400,125 @@ impl AsyncWrite for BufWriter {
     }
 }
 
+/// A buffered multipart uploader.
+///
+/// This uploader adaptively uses [`ObjectStore::put`] or
+/// [`ObjectStore::put_multipart`] depending on the amount of data that has
+/// been written.
+///
+/// Up to `capacity` bytes will be buffered in memory, and flushed on shutdown
+/// using [`ObjectStore::put`]. If `capacity` is exceeded, data will instead be
+/// streamed using [`ObjectStore::put_multipart`]
+///
+/// # TODO
+///
+/// Add attributes and tags support.
+pub struct BufUploader {
+    store: Arc<dyn ObjectStore>,
+    path: Path,
+
+    chunk_size: usize,
+    max_concurrency: usize,
+
+    buffer: PutPayloadMut,
+    write_multipart: Option<WriteMultipart>,
+}
+
+impl std::fmt::Debug for BufUploader {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("BufUploader")
+            .field("chunk_size", &self.chunk_size)
+            .finish()
+    }
+}
+
+impl BufUploader {
+    /// Create a new [`BufUploader`] from the provided [`ObjectStore`] and [`Path`]
+    pub fn new(store: Arc<dyn ObjectStore>, path: Path) -> Self {
+        Self::with_chunk_size(store, path, 5 * 1024 * 1024)
+    }
+
+    /// Create a new [`BufUploader`] from the provided [`ObjectStore`], [`Path`] and `capacity`

Review Comment:
   I'm trying to keep the API consistent with the existing `WriteMultipart`:
   
   
https://github.com/apache/arrow-rs/blob/087f34b70e97ee85e1a54b3c45c5ed814f500b0a/object_store/src/upload.rs#L121-L129
   
   And yes, the chunk size can't be changed after a `BufUploader` has been created. I will add a comment here.
   
   ---
   
   Or maybe we could make `chunk_size` a required parameter of `BufUploader::new`?
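
   To make the two options concrete, here is a minimal, self-contained sketch of the constructor shapes being discussed. It is not the code in this PR: `Uploader` is a hypothetical stand-in with a plain `String` path instead of the real `ObjectStore` and `Path` types, and the accessors are only there for illustration. The 5 MiB default is the value used by `BufUploader::new` in the diff above.

```rust
// Hypothetical stand-in for illustration only; this is NOT the real
// `BufUploader` and does not use any `object_store` types.

/// Default chunk size: 5 MiB, the value passed by `new` in the diff above.
const DEFAULT_CHUNK_SIZE: usize = 5 * 1024 * 1024;

#[derive(Debug)]
pub struct Uploader {
    path: String,
    // Private and without a setter: fixed once the uploader is created.
    chunk_size: usize,
}

impl Uploader {
    /// Current shape: `new` picks a default and delegates to `with_chunk_size`.
    pub fn new(path: impl Into<String>) -> Self {
        Self::with_chunk_size(path, DEFAULT_CHUNK_SIZE)
    }

    /// Explicit chunk size, chosen at construction time only.
    pub fn with_chunk_size(path: impl Into<String>, chunk_size: usize) -> Self {
        Self {
            path: path.into(),
            chunk_size,
        }
    }

    /// Read-only accessor (an assumption of this sketch).
    pub fn chunk_size(&self) -> usize {
        self.chunk_size
    }

    /// Read-only accessor (an assumption of this sketch).
    pub fn path(&self) -> &str {
        &self.path
    }
}
```

   With this shape, `Uploader::new("a/b")` uses the 5 MiB default and `Uploader::with_chunk_size("a/b", 1024)` overrides it. The alternative would drop the default and leave a single constructor that always takes `chunk_size`.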


