adriangb commented on issue #17510: URL: https://github.com/apache/datafusion/issues/17510#issuecomment-3281271880
I don't think this has anything to do with dynamic filters. I reduced the MRE to:

```rust
#[tokio::test]
async fn test_regression() {
    use arrow::{
        array::{RecordBatch, UInt32Array},
        datatypes::{DataType, Field, Schema},
    };
    use datafusion::datasource::memory::MemTable;
    use datafusion::prelude::*;
    use std::sync::Arc;

    // Create simple test data
    let schema = Arc::new(Schema::new(vec![Field::new("id", DataType::UInt32, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(UInt32Array::from((0..20).collect::<Vec<u32>>()))],
    )
    .unwrap();

    let cfg = SessionConfig::new()
        .set_bool("datafusion.optimizer.enable_dynamic_filter_pushdown", false);
    let ctx = SessionContext::new_with_config(cfg);
    let provider = MemTable::try_new(schema.clone(), vec![vec![batch]]).unwrap();
    ctx.register_table("test", Arc::new(provider)).unwrap();

    // Create a simple plan with a sort and a limit.
    let df = ctx
        .table("test")
        .await
        .unwrap()
        .sort(vec![col("id").sort(true, true)])
        .unwrap()
        .limit(0, Some(10))
        .unwrap();
    let logical_plan = df.into_optimized_plan().unwrap();
    let plan_1 = ctx.state().create_physical_plan(&logical_plan).await.unwrap();
    let plan_2 = plan_1.clone();

    let task_ctx = ctx.task_ctx();
    let stream_1 = plan_1.execute(0, task_ctx.clone()).unwrap();
    let batches_1 = datafusion::physical_plan::common::collect(stream_1).await.unwrap();
    let count_1: usize = batches_1.iter().map(|b| b.num_rows()).sum();

    // let plan_2 = plan_2.reset_state().unwrap();
    let stream_2 = plan_2.execute(0, task_ctx).unwrap();
    let batches_2 = datafusion::physical_plan::common::collect(stream_2).await.unwrap();
    let count_2: usize = batches_2.iter().map(|b| b.num_rows()).sum();

    // Result mismatch on DF 49
    println!("Expected: 10 rows (limit in query)");
    assert_eq!(count_2, count_1, "Row count mismatch: expected {count_1}, got {count_2}");
}
```

(I tested by adding this to the bottom of `datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs`)

If you comment back in the line `// let plan_2 = plan_2.reset_state().unwrap();` then it passes.

As you can see I set `datafusion.optimizer.enable_dynamic_filter_pushdown = false` which should disable dynamic filters, i.e. this is unrelated to that feature.

Could it be that you are just observing the state of the `SortExec` being re-used across runs? Where did you call `reset_state()`?
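To illustrate the kind of state re-use I am asking about, here is a toy sketch that uses only the standard library. `ToyTopK`, its `threshold` field, and `demo` are hypothetical names made up for this example; this is not how `SortExec` is implemented, and the `reset_state` here only mimics the shape of the real method. The point is just that cloning an `Arc` gives a second handle to the same node, so anything the first run records is still there for the second run.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a stateful plan node (NOT a DataFusion type).
/// It returns the `limit` smallest values and remembers the largest value
/// it kept, then uses that as a cut-off on any later call.
struct ToyTopK {
    limit: usize,
    threshold: Mutex<Option<u32>>,
}

impl ToyTopK {
    fn new(limit: usize) -> Self {
        Self { limit, threshold: Mutex::new(None) }
    }

    fn execute(&self, input: &[u32]) -> Vec<u32> {
        let mut guard = self.threshold.lock().unwrap();
        // Copy out whatever cut-off a previous run left behind.
        let seen: Option<u32> = *guard;
        let mut rows: Vec<u32> = input
            .iter()
            .copied()
            .filter(|v| seen.map_or(true, |t| *v < t))
            .collect();
        rows.sort_unstable();
        rows.truncate(self.limit);
        // Record the largest kept value; this is the state that leaks
        // into the next call on the same node.
        if let Some(max) = rows.last() {
            *guard = Some(*max);
        }
        rows
    }

    /// Toy analogue of resetting a node: same configuration, fresh state.
    fn reset_state(&self) -> Arc<Self> {
        Arc::new(Self::new(self.limit))
    }
}

/// Runs the same node twice through two `Arc` handles, then once after a reset.
fn demo() -> (usize, usize, usize) {
    let input: Vec<u32> = (0..20).collect();
    let plan_1 = Arc::new(ToyTopK::new(10));
    // `Arc::clone` copies the pointer, not the node or its state.
    let plan_2 = Arc::clone(&plan_1);

    let count_1 = plan_1.execute(&input).len();
    let count_2 = plan_2.execute(&input).len();
    let count_3 = plan_2.reset_state().execute(&input).len();
    (count_1, count_2, count_3)
}
```

In this toy, the second run returns fewer rows than the first because it starts from the cut-off the first run stored, and the run after `reset_state()` matches the first again. Whether the real mismatch comes from something similar is exactly what I am asking.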