orlandohohmeier opened a new issue, #8900:
URL: https://github.com/apache/arrow-datafusion/issues/8900
### Describe the bug
When running a query with a deeply nested filter expression, the query fails
with a stack overflow. The bug initially manifested as an `EXC_BAD_ACCESS` error
on macOS in our application. The problem is that the filter expression is
recursively normalized using `transform_up`, which can overflow the stack.
This probably also happens in other scenarios where one would end up with a
deeply nested tree.
Tested/Reproduced with:
datafusion version = "34.0.0"
macOS = 14.2.1
### To Reproduce
Minimal Reproducible Example:
```rust
use datafusion::arrow::array::Int64Array;
use datafusion::arrow::datatypes::DataType;
use datafusion::arrow::datatypes::Field;
use datafusion::arrow::datatypes::Schema;
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::error::Result;
use datafusion::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();

    // Create a DataFusion DataFrame with a single Int64 column `a`:
    let batch = RecordBatch::try_new(
        Arc::new(Schema::new(vec![Field::new("a", DataType::Int64, false)])),
        vec![Arc::new(Int64Array::from(vec![
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
        ]))],
    )?;
    let df = ctx.read_batch(batch)?;

    // Chain 1000 additional `OR` terms; each call wraps the previous
    // expression, so the resulting tree is deeply nested.
    let mut expr = col("a").eq(lit(1));
    for _ in 0..1000 {
        expr = expr.or(col("a").eq(lit(1)));
    }

    let df = df.filter(expr)?;
    df.show().await
}
```
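To illustrate why the loop above produces such a deep tree, here is a minimal sketch using a toy expression type. `ToyExpr`, `left_deep_chain`, and `depth` are hypothetical names introduced only for this illustration; they are not DataFusion APIs. Each `or` wraps the previous expression, so the nesting depth grows linearly with the number of terms, and a recursive traversal would need one stack frame per level. The depth is measured with an explicit heap-allocated stack so the measurement itself does not recurse.

```rust
// Toy stand-in for an expression tree. This is NOT DataFusion's `Expr` type.
#[derive(Debug)]
enum ToyExpr {
    Leaf,
    Or(Box<ToyExpr>, Box<ToyExpr>),
}

/// Build `leaf OR leaf OR ...` the same way the reproduction loop does:
/// every new `Or` wraps the previous expression, giving a left-deep chain.
fn left_deep_chain(extra_terms: usize) -> ToyExpr {
    let mut expr = ToyExpr::Leaf;
    for _ in 0..extra_terms {
        expr = ToyExpr::Or(Box::new(expr), Box::new(ToyExpr::Leaf));
    }
    expr
}

/// Compute the nesting depth iteratively, using a `Vec` as the work stack
/// instead of recursive function calls.
fn depth(root: &ToyExpr) -> usize {
    let mut max_depth = 0;
    let mut stack: Vec<(&ToyExpr, usize)> = vec![(root, 1)];
    while let Some((node, d)) = stack.pop() {
        max_depth = max_depth.max(d);
        if let ToyExpr::Or(left, right) = node {
            stack.push((&**left, d + 1));
            stack.push((&**right, d + 1));
        }
    }
    max_depth
}

fn main() {
    // Same shape as the reproduction: one initial term plus 1000 `OR` terms.
    let expr = left_deep_chain(1000);
    println!("nesting depth: {}", depth(&expr));
}
```

With 1000 chained terms the toy tree is 1001 levels deep. How many levels a recursive traversal can handle before overflowing depends on the per-frame stack usage and the thread's stack size, so the exact threshold will vary between platforms and builds.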
### Expected behavior
The query should complete without errors, despite the _complexity_ of the
filter expression.
### Additional context
_No response_