tobixdev opened a new issue, #18870:
URL: https://github.com/apache/datafusion/issues/18870

   ### Describe the bug
   
   Executing the query plan below causes the following panic:
   
   ```
   thread 'test::test_nested_join_fixed_size_binary' (33301) panicked at /home/tschwarzinger/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-buffer-57.0.0/src/buffer/immutable.rs:300:9:
   the offset of the new Buffer cannot exceed the existing length: slice offset=0 length=8 selflen=0
   ```
   
   Interestingly, the crash does not occur if we remove the `None` values from the reproducer, i.e., using
   
   ```
   ctx.register_table("t1", fsb_table("left", vec![Some(b"0001")]))?;
   ctx.register_table("t2", fsb_table("right", vec![Some(b"0001")]))?;
   ```
   
   does not crash.
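   
   To help read the panic message, here is a minimal std-only sketch of the kind of bounds check that could produce it. `ByteBuffer` and `checked_slice` are hypothetical names used for illustration, not arrow-buffer's actual code. Reading `length=8` as 2 rows × 4 bytes is an assumption based on the reproducer; `selflen=0` suggests the buffer being sliced is empty.
   
   ```rust
   /// Hypothetical stand-in for a byte buffer; NOT arrow-buffer's real type.
   struct ByteBuffer {
       data: Vec<u8>,
   }
   
   impl ByteBuffer {
       /// Returns `length` bytes starting at `offset`, or an error whose text
       /// mirrors the wording of the panic message in this report.
       fn checked_slice(&self, offset: usize, length: usize) -> Result<&[u8], String> {
           let end = offset.saturating_add(length);
           if end > self.data.len() {
               return Err(format!(
                   "the offset of the new Buffer cannot exceed the existing length: \
                    slice offset={offset} length={length} selflen={}",
                   self.data.len()
               ));
           }
           Ok(&self.data[offset..end])
       }
   }
   
   fn main() {
       // Assumption: 2 rows of FixedSizeBinary(4) need 2 * 4 = 8 value bytes.
       let byte_width: usize = 4;
       let rows: usize = 2;
   
       // An empty values buffer cannot satisfy an 8-byte slice request.
       let empty = ByteBuffer { data: Vec::new() };
       match empty.checked_slice(0, rows * byte_width) {
           Ok(bytes) => println!("sliced {} bytes", bytes.len()),
           Err(msg) => println!("error: {msg}"),
       }
   
       // A buffer that actually holds 8 bytes satisfies the same request.
       let full = ByteBuffer { data: vec![0u8; rows * byte_width] };
       assert!(full.checked_slice(0, rows * byte_width).is_ok());
   }
   ```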
   
   ### To Reproduce
   
   ```rust
   use anyhow::Result;
   use datafusion::arrow::array::{ArrayRef, FixedSizeBinaryArray, RecordBatch};
   use datafusion::arrow::datatypes::{DataType, Field, Schema};
   use datafusion::datasource::MemTable;
   use datafusion::execution::context::SessionContext;
   use datafusion::prelude::*;
   use std::sync::Arc;
   
   /// Build a FixedSizeBinary(4) array from byte slices.
   fn fsb(values: &[Option<&[u8; 4]>]) -> ArrayRef {
       let arr = FixedSizeBinaryArray::from(
           values
               .iter()
               .map(|o| o.map(|x| x.as_slice()))
               .collect::<Vec<_>>(),
       );
       Arc::new(arr)
   }
   
   /// Create a MemTable with a single FixedSizeBinary(4) column
   fn fsb_table(col_name: &str, data: Vec<Option<&[u8; 4]>>) -> Arc<MemTable> {
       let schema = Arc::new(Schema::new(vec![Field::new(
           col_name,
           DataType::FixedSizeBinary(4),
           true,
       )]));
   
       let batch = RecordBatch::try_new(schema.clone(), vec![fsb(&data)]).unwrap();
   
       Arc::new(MemTable::try_new(schema, vec![vec![batch]]).unwrap())
   }
   
   #[tokio::test]
   async fn test_nested_join_fixed_size_binary() -> Result<()> {
       let ctx = SessionContext::new();
   
       ctx.register_table("t1", fsb_table("left", vec![Some(b"0001"), None]))?;
       ctx.register_table("t2", fsb_table("right", vec![Some(b"0001"), None]))?;
   
       let df = ctx.table("t1").await?.join(
           ctx.table("t2").await?,
           JoinType::Left,
           &[],
           &[],
           Some(lit(true)),
       )?;
   
       assert_eq!(
           df.to_string().await.unwrap(),
           "Join: left join (t1#0 = t2#0)"
       );
   
       Ok(())
   }
   ```
   
   ### Expected behavior
   
   The query should complete without panicking and produce a result.
   
   ### Additional context
   
   The Query Plan:
   
   ```
   +---------------+-----------------------------------------------------+
   | plan_type     | plan                                                |
   +---------------+-----------------------------------------------------+
   | logical_plan  | Left Join:                                          |
   |               |   TableScan: t1 projection=[left]                   |
   |               |   TableScan: t2 projection=[right]                  |
   | physical_plan | NestedLoopJoinExec: join_type=Left                  |
   |               |   DataSourceExec: partitions=1, partition_sizes=[1] |
   |               |   DataSourceExec: partitions=1, partition_sizes=[1] |
   |               |                                                     |
   +---------------+-----------------------------------------------------+
   ```

