haohuaijin opened a new issue, #20742:
URL: https://github.com/apache/datafusion/issues/20742

   ### Describe the bug
   
   When two FilterExec nodes are stacked and the inner filter proves zero 
selectivity (no rows can match), the outer filter panics during interval 
analysis.
   
   Root cause: When a FilterExec determines that no rows can pass its predicate 
(e.g., a > 200 when a's max is 100), collect_new_statistics produced column 
statistics with untyped ScalarValue::Null for min/max/sum values. The Null 
variant has data type Null.
   
   If an outer FilterExec sits on top and tries to analyze its own predicate 
(e.g., a = 50), it attempts to intersect intervals from the inner filter's 
statistics (Null type) with the literal in its predicate (Int32 type). 
Interval::intersect requires both sides to have the same data type, so it 
panics with:
   
   "Only intervals with the same data type are intersectable, lhs:Null, 
rhs:Int32"
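
The type mismatch described above can be reproduced in miniature. The following is a minimal, self-contained sketch that uses hypothetical toy types (`DataType`, `Scalar`, `Interval`, `analyze_outer_filter`). These are stand-ins written for illustration, not DataFusion's actual API, and the toy `intersect` returns an `Err` where the real code path panics. It also assumes that a typed null bound (`Int32(None)`) means "unbounded on that side", which is one plausible way the statistics could stay intersectable.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Null,
    Int32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Scalar {
    Null,               // untyped null: its data type is Null
    Int32(Option<i32>), // typed value; None is a typed null (assumed: unbounded)
}

impl Scalar {
    fn data_type(&self) -> DataType {
        match self {
            Scalar::Null => DataType::Null,
            Scalar::Int32(_) => DataType::Int32,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Interval {
    lower: Scalar,
    upper: Scalar,
}

impl Interval {
    /// Err on a data type mismatch, Ok(None) for an empty intersection,
    /// Ok(Some(..)) otherwise.
    fn intersect(&self, other: &Interval) -> Result<Option<Interval>, String> {
        let (lhs, rhs) = (self.lower.data_type(), other.lower.data_type());
        if lhs != rhs {
            return Err(format!(
                "Only intervals with the same data type are intersectable, lhs:{lhs:?}, rhs:{rhs:?}"
            ));
        }
        let (Scalar::Int32(l1), Scalar::Int32(u1), Scalar::Int32(l2), Scalar::Int32(u2)) =
            (self.lower, self.upper, other.lower, other.upper)
        else {
            // Both sides are untyped nulls: nothing to narrow.
            return Ok(Some(*self));
        };
        // A None bound is treated as unbounded, so the other side's bound wins.
        let lower = match (l1, l2) {
            (Some(a), Some(b)) => Some(a.max(b)),
            (a, b) => a.or(b),
        };
        let upper = match (u1, u2) {
            (Some(a), Some(b)) => Some(a.min(b)),
            (a, b) => a.or(b),
        };
        if let (Some(lo), Some(hi)) = (lower, upper) {
            if lo > hi {
                return Ok(None);
            }
        }
        Ok(Some(Interval {
            lower: Scalar::Int32(lower),
            upper: Scalar::Int32(upper),
        }))
    }
}

/// Mimics the outer filter `a = 50` intersecting its literal with the
/// interval derived from the inner filter's column statistics.
fn analyze_outer_filter(inner_stats: Interval) -> Result<Option<Interval>, String> {
    let literal = Interval {
        lower: Scalar::Int32(Some(50)),
        upper: Scalar::Int32(Some(50)),
    };
    inner_stats.intersect(&literal)
}
```

In this sketch, passing an interval built from `Scalar::Null` bounds to `analyze_outer_filter` yields the same message as the report, while an interval built from `Scalar::Int32(None)` bounds intersects cleanly with the `Int32` literal.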
   
   ### To Reproduce
   
   ```rust
       #[tokio::test]
       async fn test_nested_filter_with_zero_selectivity_inner() -> Result<()> {
           // Inner table: a: [1, 100], b: [1, 3]
           let schema = Schema::new(vec![
               Field::new("a", DataType::Int32, false),
               Field::new("b", DataType::Int32, false),
           ]);
           let input = Arc::new(StatisticsExec::new(
               Statistics {
                   num_rows: Precision::Inexact(1000),
                   total_byte_size: Precision::Inexact(4000),
                   column_statistics: vec![
                       ColumnStatistics {
                           min_value: Precision::Inexact(ScalarValue::Int32(Some(1))),
                           max_value: Precision::Inexact(ScalarValue::Int32(Some(100))),
                           ..Default::default()
                       },
                       ColumnStatistics {
                           min_value: Precision::Inexact(ScalarValue::Int32(Some(1))),
                           max_value: Precision::Inexact(ScalarValue::Int32(Some(3))),
                           ..Default::default()
                       },
                   ],
               },
               schema,
           ));
   
           // Inner filter: a > 200 (impossible given a max=100 → zero selectivity)
           let inner_predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
               Arc::new(Column::new("a", 0)),
               Operator::Gt,
               Arc::new(Literal::new(ScalarValue::Int32(Some(200)))),
           ));
           let inner_filter: Arc<dyn ExecutionPlan> =
               Arc::new(FilterExec::try_new(inner_predicate, input)?);
   
           // Outer filter: a = 50, analyzed against the inner filter's statistics
           let outer_predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
               Arc::new(Column::new("a", 0)),
               Operator::Eq,
               Arc::new(Literal::new(ScalarValue::Int32(Some(50)))),
           ));
           let outer_filter: Arc<dyn ExecutionPlan> =
               Arc::new(FilterExec::try_new(outer_predicate, inner_filter)?);
   
           let statistics = outer_filter.partition_statistics(None)?;
           assert_eq!(statistics.num_rows, Precision::Inexact(0));
   
           Ok(())
       }
   
   ```
   
   ### Expected behavior
   
   The outer filter should compute its statistics without panicking and report 
`num_rows` as `Precision::Inexact(0)`.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

