devinjdangelo commented on code in PR #8667:
URL: https://github.com/apache/arrow-datafusion/pull/8667#discussion_r1437753721


##########
datafusion/common/src/file_options/mod.rs:
##########
@@ -154,6 +154,10 @@ pub enum FileTypeWriterOptions {
     JSON(JsonWriterOptions),
     Avro(AvroWriterOptions),
     Arrow(ArrowWriterOptions),
+    /// For extension [FileType]s, FileTypeWriterOptions simply stores
+    /// any passed StatementOptions to be handled later by any custom
+    /// physical plans (Such as a FileFormat::create_writer_physical_plan)
+    Extension(Option<StatementOptions>),

Review Comment:
   FileTypeWriterOptions does not handle any parsing or validation for 
externally defined FileTypes. It simply acts as a container for the options 
passed in.



##########
datafusion/common/src/file_options/file_type.rs:
##########
@@ -41,7 +43,8 @@ pub trait GetExt {
 }
 
 /// Readable file type
-#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+#[allow(clippy::derived_hash_with_manual_eq)]

Review Comment:
   I think the manual and auto traits should be compatible, so silencing this 
is OK.



##########
datafusion/proto/src/physical_plan/to_proto.rs:
##########
@@ -891,6 +891,9 @@ impl TryFrom<&FileTypeWriterOptions> for 
protobuf::FileTypeWriterOptions {
             FileTypeWriterOptions::Arrow(ArrowWriterOptions {}) => {
                 return not_impl_err!("Arrow file sink protobuf serialization")
             }
+            FileTypeWriterOptions::Extension(_) => {

Review Comment:
   We should be able to support serializing this since it is ultimately just a 
Vec<(String, String)>



##########
datafusion/core/src/physical_planner.rs:
##########
@@ -605,6 +605,7 @@ impl DefaultPhysicalPlanner {
                         FileType::JSON => Arc::new(JsonFormat::default()),
                         FileType::AVRO => Arc::new(AvroFormat {} ),
                         FileType::ARROW => Arc::new(ArrowFormat {}),
+                        FileType::Extension(_ext) => return 
not_impl_err!("Extension FileTypes not supported in Copy To statements.")

Review Comment:
   The built in Copy plans can't support external types (since we can't 
initialize them in DataFusion), but it should be not too difficult to build a 
custom COPY PhysicalPlan that uses a custom FileFormat, but otherwise is the 
same and relies on the build in Copy logical plan. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to