alamb commented on code in PR #13866:
URL: https://github.com/apache/datafusion/pull/13866#discussion_r1899464378


##########
datafusion/core/src/datasource/file_format/parquet.rs:
##########
@@ -750,6 +749,28 @@ impl ParquetSink {
         }
     }
 
+    /// Create writer properties based upon configuration settings,
+    /// including partitioning and the inclusion of arrow schema metadata.
+    fn create_writer_props(&self) -> Result<WriterProperties> {
+        let schema = if self.parquet_options.global.allow_single_file_parallelism {
+            // If parallelizing writes, we may be also be doing hive style partitioning
+            // into multiple files which impacts the schema per file.
+            // Refer to `self.get_writer_schema()`
+            &self.get_writer_schema()
+        } else {
+            self.config.output_schema()
+        };
+
+        // TODO: avoid this clone in follow up PR, where the writer properties & schema

Review Comment:
   Is this a PR you plan to do? Should I file a ticket to track it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

