ttencate opened a new issue, #13287:
URL: https://github.com/apache/datafusion/issues/13287

   ### Describe the bug
   
   Follow-up to #13092, which was fixed by #13117 thanks to @Omega359.
   
   However, this fix will not catch mistakes like reordered columns. For 
example, if table A has columns `a`, `b` and table B has columns `b`, `a`, then 
DataFusion will happily compute the union, with the wrong values in the wrong 
columns.
   
   So why not just compare the entire schema? Or at least the column names and 
types (i.e. ignoring metadata)? The docs explicitly say that the schemas must 
be equal.
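
A minimal sketch of the check suggested above: compare column names and types position by position, and ignore metadata. It uses only the standard library. `Field`, `utf8`, and `check_union_compatible` are hypothetical names made up for illustration; they are not DataFusion or Arrow APIs, and this is not how #13117 is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a schema field; not an Arrow or DataFusion type.
#[allow(dead_code)] // `metadata` is carried along but deliberately never compared
#[derive(Debug, Clone)]
struct Field {
    name: String,
    /// Type name such as "Utf8"; a real schema would use a proper type enum.
    data_type: String,
    metadata: HashMap<String, String>,
}

/// Build a string-typed field with empty metadata (illustrative helper).
fn utf8(name: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: "Utf8".to_string(),
        metadata: HashMap::new(),
    }
}

/// Check that two schemas agree on column count, and on the name and type
/// of the column at each position. Metadata is not inspected.
fn check_union_compatible(left: &[Field], right: &[Field]) -> Result<(), String> {
    if left.len() != right.len() {
        return Err(format!(
            "column count differs: {} vs {}",
            left.len(),
            right.len()
        ));
    }
    for (i, (l, r)) in left.iter().zip(right).enumerate() {
        if l.name != r.name {
            return Err(format!("column {i}: name `{}` vs `{}`", l.name, r.name));
        }
        if l.data_type != r.data_type {
            return Err(format!(
                "column {i} (`{}`): type {} vs {}",
                l.name, l.data_type, r.data_type
            ));
        }
    }
    Ok(())
}
```

With this check, the schemas `[a, b]` and `[b, a]` from the example are rejected at position 0 rather than unioned silently, while two schemas that differ only in metadata are still accepted.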
   
   ### To Reproduce
   
   ```rust
   #[tokio::test]
   async fn test_union() {
       use datafusion::prelude::SessionContext;
       use datafusion::assert_batches_sorted_eq;
       use datafusion::common::arrow::array::{ArrayRef, StringArray};
       use datafusion::common::arrow::record_batch::RecordBatch;
       use std::sync::Arc;
   
       let ctx = SessionContext::new();
       let a = ctx
           .read_batch(
               RecordBatch::try_from_iter([
                   ("a", Arc::new(StringArray::from(vec!["a"])) as ArrayRef),
                   ("b", Arc::new(StringArray::from(vec!["b"])) as ArrayRef),
               ])
               .unwrap(),
           )
           .unwrap();
       let b = ctx
           .read_batch(
               RecordBatch::try_from_iter([
                   ("b", Arc::new(StringArray::from(vec!["b"])) as ArrayRef),
                   ("a", Arc::new(StringArray::from(vec!["a"])) as ArrayRef),
               ])
               .unwrap(),
           )
           .unwrap();
   
       let union = a.union(b).unwrap();
       assert_batches_sorted_eq!(
           [
               "+---+---+",
               "| a | b |",
               "+---+---+",
               "| a | b |",
               "| a | b |",
               "+---+---+",
           ],
           &union.collect().await.unwrap()
       );
   }
   ```
   
   ### Expected behavior
   
   The test passes.
   
   ### Additional context
   
   Actual behavior:
   
   ```
   assertion `left == right` failed: 
   
   expected:
   
   [
       "+---+---+",
       "| a | b |",
       "+---+---+",
       "| a | b |",
       "| a | b |",
       "+---+---+",
   ]
   actual:
   
   [
       "+---+---+",
       "| a | b |",
       "+---+---+",
       "| a | b |",
       "| b | a |",
       "+---+---+",
   ]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

