alamb commented on code in PR #8017:
URL: https://github.com/apache/arrow-datafusion/pull/8017#discussion_r1395710136
##########
datafusion/core/src/datasource/file_format/write/mod.rs:
##########
@@ -19,128 +19,32 @@
//! write support for the various file formats
use std::io::Error;
-use std::mem;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll};
use crate::datasource::file_format::file_compression_type::FileCompressionType;
-use crate::datasource::physical_plan::FileMeta;
use crate::error::Result;
use arrow_array::RecordBatch;
-use datafusion_common::{exec_err, DataFusionError};
+use datafusion_common::DataFusionError;
use async_trait::async_trait;
use bytes::Bytes;
use futures::future::BoxFuture;
-use futures::ready;
-use futures::FutureExt;
use object_store::path::Path;
-use object_store::{MultipartId, ObjectMeta, ObjectStore};
+use object_store::{MultipartId, ObjectStore};
use tokio::io::AsyncWrite;
pub(crate) mod demux;
pub(crate) mod orchestration;
-/// `AsyncPutWriter` is an object that facilitates asynchronous writing to
object stores.
-/// It is specifically designed for the `object_store` crate's `put` method
and sends
-/// whole bytes at once when the buffer is flushed.
-pub struct AsyncPutWriter {
Review Comment:
Indeed I double checked and it appears to be dead code (unused and
untested):
https://github.com/search?q=repo%3Aapache%2Farrow-datafusion%20AsyncPutWriter&type=code
<img width="1570" alt="Screenshot 2023-11-16 at 8 40 03 AM"
src="https://github.com/apache/arrow-datafusion/assets/490673/1f7e0dd1-3d19-4665-a051-abb2dd6af297">
##########
datafusion/core/src/datasource/listing/table.rs:
##########
@@ -214,33 +214,6 @@ impl ListingTableConfig {
}
}
-#[derive(Debug, Clone)]
-///controls how new data should be inserted to a ListingTable
-pub enum ListingTableInsertMode {
- ///Data should be appended to an existing file
- AppendToFile,
- ///Data is appended as new files in existing TablePaths
- AppendNewFiles,
- ///Throw an error if insert into is attempted on this table
- Error,
Review Comment:
It would be good to hear about a usecase of appending to a set of existing
files -- it is not a usecase I am familiar with but also my background is in
analytic database where it is more common to make new files (and then
"compact/rewrite" them as subsequent step) rather than modifying existing files
##########
datafusion/sqllogictest/test_files/insert_to_external.slt:
##########
@@ -267,8 +267,6 @@ INSERT INTO single_file_test values (4, 5), (6, 7);
query II
select * from single_file_test;
----
-1 2
Review Comment:
I think it should be an error. It would also be really good to call out how
to get "append" semantics (which I believe requires using `CREATE UNBOUNDED
TABLE ...`)?
##########
datafusion/core/src/datasource/file_format/write/mod.rs:
##########
@@ -163,45 +67,28 @@ impl MultiPart {
}
}
-pub(crate) enum AbortMode {
- Put,
- Append,
- MultiPart(MultiPart),
-}
-
/// A wrapper struct with abort method and writer
pub(crate) struct AbortableWrite<W: AsyncWrite + Unpin + Send> {
writer: W,
- mode: AbortMode,
+ multipart: MultiPart,
}
impl<W: AsyncWrite + Unpin + Send> AbortableWrite<W> {
/// Create a new `AbortableWrite` instance with the given writer, and
write mode.
Review Comment:
the doc comments may also need updating
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]