[GitHub] [arrow-datafusion] byteink commented on issue #6492: Mismatch in MemTable (Select Into with aggregate window functions having no alias)

via GitHub Thu, 01 Jun 2023 20:13:41 -0700


byteink commented on issue #6492:
URL: 
https://github.com/apache/arrow-datafusion/issues/6492#issuecomment-1573076672


   Another similar case of failure:
   ```shell
   DataFusion CLI v25.0.0
   ❯ create table t (a int not null);
   0 rows in set. Query took 0.004 seconds.
   
   ❯ insert into t values(1);
   Error during planning: Inserting query must have the same schema with the table.
   ```
   The two mismatched schemas, which differ only in the nullability of `a`, are:
   **input schema**: Schema { fields: [Field { name: "a", data_type: Int32, nullable: **true**, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }
   **table schema**: Schema { fields: [Field { name: "a", data_type: Int32, nullable: **false**, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }

   Could we ignore the schema attached to the incoming batches and look only at the actual data?
   We could then use `RecordBatch::try_new` to check whether the data in each RecordBatch matches the schema of the target table:
   ```rust
   impl MemTable {
       /// Create a new in-memory table from the provided schema and record batches
       pub fn try_new(schema: SchemaRef, partitions: Vec<Vec<RecordBatch>>) -> Result<Self> {
           let mut batches = Vec::with_capacity(partitions.len());
           for partition in partitions {
               let new_partition = partition
                   .iter()
                   .map(|batch| {
                       // Rebuild each batch against the table schema so that
                       // validation runs on the data, not the batch's own schema.
                       RecordBatch::try_new(schema.clone(), batch.columns().to_vec())
                           .map_err(DataFusionError::ArrowError)
                   })
                   .collect::<Result<Vec<_>>>()?;
               batches.push(Arc::new(RwLock::new(new_partition)));
           }
           Ok(Self { schema, batches })
       }
   }
   ```
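
   The idea behind the proposal can be illustrated with a minimal sketch that uses only the standard library. `Field`, `Column`, and `rebind` are hypothetical stand-ins, not Arrow or DataFusion APIs. The sketch assumes that validation should reject a non-nullable column only when it actually contains nulls; the real `RecordBatch::try_new` also checks column data types and row counts, which this model leaves out.

   ```rust
   /// Hypothetical stand-in for an Arrow field: a name plus a nullability flag.
   #[derive(Debug, Clone)]
   struct Field {
       name: String,
       nullable: bool,
   }

   /// Hypothetical stand-in for an Int32 column; `None` models a null value.
   type Column = Vec<Option<i32>>;

   /// Attach `columns` to the `target` schema, ignoring any schema they carried
   /// before. Only the data is checked against the target's nullability.
   fn rebind(target: &[Field], columns: Vec<Column>) -> Result<Vec<Column>, String> {
       if target.len() != columns.len() {
           return Err(format!(
               "expected {} columns, got {}",
               target.len(),
               columns.len()
           ));
       }
       for (field, column) in target.iter().zip(&columns) {
           let nulls = column.iter().filter(|v| v.is_none()).count();
           if !field.nullable && nulls > 0 {
               return Err(format!(
                   "column '{}' is non-nullable but contains {} null value(s)",
                   field.name, nulls
               ));
           }
       }
       Ok(columns)
   }

   fn main() {
       // Table schema from `create table t (a int not null)`.
       let table = vec![Field { name: "a".to_string(), nullable: false }];

       // Data from `insert into t values(1)`: no nulls, so it is accepted even
       // though the planner reported the input column as nullable.
       println!("{:?}", rebind(&table, vec![vec![Some(1)]]));

       // A column that really contains a null is still rejected.
       println!("{:?}", rebind(&table, vec![vec![Some(1), None]]));
   }
   ```

   Under this model the insert from the failing case would succeed, while data that genuinely violates the `not null` constraint would still produce an error.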
   


