comphead commented on code in PR #6722:
URL: https://github.com/apache/arrow-datafusion/pull/6722#discussion_r1238707176


##########
datafusion/core/src/physical_plan/insert.rs:
##########
@@ -219,3 +237,25 @@ fn make_count_schema() -> SchemaRef {
         false,
     )]))
 }
+
+fn check_batch(batch: RecordBatch, schema: &SchemaRef) -> Result<RecordBatch> {

Review Comment:
   I'm not really sure about this method: checking every batch like this could be
   a performance hit when we scan the data.
   I'd rather compare the metadata of the incoming schema against the expected
   schema, the way Spark does.



##########
datafusion/common/src/dfschema.rs:
##########
@@ -729,6 +729,34 @@ impl From<Field> for DFField {
     }
 }
 
+/// DataFusion-specific extensions to [`Schema`].
+pub trait SchemaExt {
+    /// This is a specialized version of Eq that ignores differences
+    /// in nullability and metadata.
+    ///
+    /// It works the same as [`DFSchema::equivalent_names_and_types`].
+    fn equivalent_names_and_types(&self, other: &Self) -> bool;
+}
+
+impl SchemaExt for Schema {
+    fn equivalent_names_and_types(&self, other: &Self) -> bool {
+        if self.fields().len() != other.fields().len() {
+            return false;
+        }
+
+        self.fields()
+            .iter()
+            .zip(other.fields().iter())
+            .all(|(f1, f2)| {
+                f1.name() == f2.name()
+                    && DFSchema::datatype_is_semantically_equal(

Review Comment:
   Since we hold a `DFSchema` reference anyway, we could probably reuse
   `DFSchema::equivalent_names_and_types` instead of introducing a new method. 🤔
   Moreover, we already have Schema <-> DFSchema converters in place.
   
   Otherwise we would have to test this new method thoroughly.
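   
   A minimal sketch of that reuse, assuming the existing `TryFrom<Schema> for
   DFSchema` converter (the helper name `schemas_equivalent` is illustrative):

   ```rust
   // Hypothetical sketch: go through DFSchema and reuse its existing comparison
   // instead of adding a new SchemaExt trait on the arrow Schema.
   use arrow::datatypes::Schema;
   use datafusion_common::{DFSchema, Result};

   fn schemas_equivalent(left: &Schema, right: &Schema) -> Result<bool> {
       // TryFrom<Schema> builds a DFSchema with unqualified field names, which is
       // enough for a name/type comparison that ignores nullability and metadata.
       let left = DFSchema::try_from(left.clone())?;
       let right = DFSchema::try_from(right.clone())?;
       Ok(left.equivalent_names_and_types(&right))
   }
   ```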



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
