jonahgao commented on code in PR #6722:
URL: https://github.com/apache/arrow-datafusion/pull/6722#discussion_r1238312184


##########
datafusion/core/src/datasource/mod.rs:
##########
@@ -199,3 +200,24 @@ fn get_col_stats(
         })
         .collect()
 }
+
+fn schema_eq_ignore_nullable(

Review Comment:
   @comphead I have tried two possible ways of using it and both have failed.
   1. Check `table_schema.contains(input_schema)`.
   The failing query is:
     ```sql
     DataFusion CLI v26.0.0
     ❯ create table t(a int not null);
     0 rows in set. Query took 0.027 seconds.
     
     ❯ insert into t values(1);
     Error during planning: Inserting query must have the same schema with the table.
     ```
   
   2. Check `input_schema.contains(table_schema)`.
   The failing query is:
   ```sql
   DataFusion CLI v26.0.0
   ❯ create table t2(a int null);
   0 rows in set. Query took 0.029 seconds.
   ❯ create table t3(a int not null);
   0 rows in set. Query took 0.002 seconds.
   
   ❯ insert into t2 select * from t3;
   Error during planning: Inserting query must have the same schema with the table.
   ```
   
   For an insertion operation, I don't think either schema needs to be a superset of the other.
   Instead, we only need to make sure that the sets of data described by the two schemas intersect, i.e. the column names and types match even if nullability differs.
   
   I've implemented the `equivalent_names_and_types` method for `Schema` in `datafusion_common/dfschema.rs`. @alamb @comphead
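
To illustrate why neither direction of `contains` works while a names-and-types comparison does, here is a minimal sketch. It uses simplified stand-in types, not the real Arrow or DataFusion ones. `Field`, `field`, `contains`, and `same_names_and_types` are hypothetical names for this example, and the superset rule in `contains` (a nullable field may contain a non-nullable one, but not the reverse) is an assumption about how the check behaves. The sketch also assumes the `values(1)` input in the first query is reported as nullable.

```rust
// Hypothetical, simplified stand-in for a schema field; not the arrow-rs type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
}

// Small constructor helper for the example.
fn field(name: &str, data_type: &str, nullable: bool) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
        nullable,
    }
}

// Assumed superset rule: `a` contains `b` when names and types match and,
// for every column, `a` is nullable or `b` is not nullable.
fn contains(a: &[Field], b: &[Field]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| {
            x.name == y.name && x.data_type == y.data_type && (x.nullable || !y.nullable)
        })
}

// Compares column names and types only; nullability is ignored.
fn same_names_and_types(a: &[Field], b: &[Field]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(x, y)| x.name == y.name && x.data_type == y.data_type)
}

fn main() {
    let not_null = vec![field("a", "Int32", false)];
    let nullable = vec![field("a", "Int32", true)];

    // Case 1: table is NOT NULL, input is (assumed) nullable.
    // `table_schema.contains(input_schema)` rejects the insert.
    assert!(!contains(&not_null, &nullable));

    // Case 2: table is nullable, input is NOT NULL.
    // `input_schema.contains(table_schema)` rejects the insert.
    assert!(!contains(&not_null, &nullable));
    // The opposite direction would accept case 2 but still reject case 1.
    assert!(contains(&nullable, &not_null));

    // Ignoring nullability accepts both cases.
    assert!(same_names_and_types(&not_null, &nullable));
    assert!(same_names_and_types(&nullable, &not_null));

    println!("both insert cases pass the names-and-types check");
}
```

Under these assumptions, each `contains` direction rejects exactly one of the two queries, so only a check that ignores nullability accepts both.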



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
