Fang-Yu Rao created IMPALA-13467:
------------------------------------

             Summary: test_min_max_filters() failed due to NullPointerException
                 Key: IMPALA-13467
                 URL: https://issues.apache.org/jira/browse/IMPALA-13467
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 4.5.0
            Reporter: Fang-Yu Rao
            Assignee: Peter Rozsa


We found that the following query in 
[min_max_filters.test|https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test]
 can fail with a NullPointerException.
{code:java}
---- QUERY
SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS;
select * from functional_parquet.iceberg_partitioned i1,
              functional_parquet.iceberg_partitioned i2
where i1.action = i2.action and
      i1.id = i2.id and
      i2.event_time = '2020-01-01 10:00:00';
---- RUNTIME_PROFILE
row_regex:.* RF00.\[min_max\] -> i1\.action.*
{code}
The stack trace is shown below.
{code:java}
I1018 18:26:21.967474 15092 Frontend.java:2190] 2449ca58b6c7b2c3:20e13eca00000000] Analyzing query: select * from functional_parquet.iceberg_partitioned i1,
              functional_parquet.iceberg_partitioned i2
where i1.action = i2.action and
      i1.id = i2.id and
      i2.event_time = '2020-01-01 10:00:00' db: functional_kudu
I1018 18:26:21.967491 15092 Frontend.java:2202] 2449ca58b6c7b2c3:20e13eca00000000] The original executor group sets from executor membership snapshot: [TExecutorGroupSet(curr_num_ex
I1018 18:26:21.967509 15092 RequestPoolService.java:200] 2449ca58b6c7b2c3:20e13eca00000000] Default pool only, scheduler allocation is not specified.
I1018 18:26:21.967532 15092 Frontend.java:2222] 2449ca58b6c7b2c3:20e13eca00000000] A total of 2 executor group sets to be considered for auto-scaling: [TExecutorGroupSet(curr_num_ex
I1018 18:26:21.967546 15092 Frontend.java:2263] 2449ca58b6c7b2c3:20e13eca00000000] Consider executor group set: TExecutorGroupSet(curr_num_executors:3, expected_num_executors:20, ex
I1018 18:26:21.968324 15092 AnalysisContext.java:521] 2449ca58b6c7b2c3:20e13eca00000000] Analysis took 0 ms
I1018 18:26:21.968353 15092 BaseAuthorizationChecker.java:114] 2449ca58b6c7b2c3:20e13eca00000000] Authorization check took 0 ms
I1018 18:26:21.968367 15092 Frontend.java:2599] 2449ca58b6c7b2c3:20e13eca00000000] Analysis and authorization finished.
I1018 18:26:21.968899 15092 IcebergScanPlanner.java:846] 2449ca58b6c7b2c3:20e13eca00000000] Push down the predicate: ref(name="event_time") == 1577901600000000 to iceberg
I1018 18:26:21.969009 15092 SnapshotScan.java:124] 2449ca58b6c7b2c3:20e13eca00000000] Scanning table hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned snapshot
I1018 18:26:21.969400 15092 LoggingMetricsReporter.java:38] 2449ca58b6c7b2c3:20e13eca00000000] Received metrics report: ScanReport{tableName=hdfs://localhost:20500/test-warehouse/ic
I1018 18:26:21.969846 15092 jni-util.cc:321] 2449ca58b6c7b2c3:20e13eca00000000] java.lang.NullPointerException
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903)
        at org.apache.impala.planner.HdfsScanNode.initOverlapPredicate(HdfsScanNode.java:845)
        at org.apache.impala.planner.RuntimeFilterGenerator.assignRuntimeFilters(RuntimeFilterGenerator.java:1257)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1159)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1162)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1157)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1162)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateFilters(RuntimeFilterGenerator.java:1091)
        at org.apache.impala.planner.RuntimeFilterGenerator.generateRuntimeFilters(RuntimeFilterGenerator.java:918)
        at org.apache.impala.planner.Planner.createPlanFragments(Planner.java:160)
        at org.apache.impala.planner.Planner.createPlans(Planner.java:310)
        at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1969)
        at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:2968)
        at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2730)
        at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2269)
        at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2030)
        at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:175)
{code}
 

IMPALA-12861 recently changed 
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java],
 so the failure may be related to that change.

 

The NullPointerException was thrown in initOverlapPredicate() of 
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java]
 due to 'statsTuple_' being null.
{code:java}
  public void initOverlapPredicate(Analyzer analyzer) {
    if (!allParquet_) return;
    Preconditions.checkNotNull(statsTuple_);
  ..
  }
{code}
'statsTuple_' is set in computeStatsTupleAndConjuncts() of 
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java],
 which in turn is called from init() of 
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java]
 only if hasParquet(fileFormats_) or hasOrc(fileFormats_) evaluates to true. 
I wonder whether, for some reason, 'statsTuple_' is not populated in this case.
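
To make the suspected mismatch concrete, here is a minimal standalone sketch. It is not Impala code: the class ScanNodeSketch, its fields, and the helper overlapInitThrows() are hypothetical, and java.util.Objects.requireNonNull() stands in for Guava's Preconditions.checkNotNull(). It only illustrates that a guard on one flag (allParquet) does not protect a field that is populated under a different condition (Parquet or ORC present in the file formats). Whether such a state is actually reachable in HdfsScanNode has not been confirmed.

```java
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical model of the two guards discussed above; not the real HdfsScanNode.
public class ScanNodeSketch {
  enum FileFormat { PARQUET, ORC, TEXT }

  private final Set<FileFormat> fileFormats;  // stand-in for fileFormats_
  private final boolean allParquet;           // stand-in for allParquet_
  private Object statsTuple;                  // stand-in for statsTuple_

  ScanNodeSketch(Set<FileFormat> fileFormats, boolean allParquet) {
    this.fileFormats = fileFormats;
    this.allParquet = allParquet;
  }

  // Mirrors the described init(): the stats tuple is only computed
  // when the file formats contain Parquet or ORC.
  void init() {
    if (fileFormats.contains(FileFormat.PARQUET)
        || fileFormats.contains(FileFormat.ORC)) {
      statsTuple = new Object();
    }
  }

  // Mirrors the described initOverlapPredicate(): guarded by a different flag.
  void initOverlapPredicate() {
    if (!allParquet) return;
    Objects.requireNonNull(statsTuple, "statsTuple");
  }

  // Returns true if initOverlapPredicate() throws a NullPointerException.
  static boolean overlapInitThrows(Set<FileFormat> formats, boolean allParquet) {
    ScanNodeSketch node = new ScanNodeSketch(formats, allParquet);
    node.init();
    try {
      node.initOverlapPredicate();
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Consistent state: Parquet files present and allParquet set.
    System.out.println(overlapInitThrows(EnumSet.of(FileFormat.PARQUET), true));
    // Assumed inconsistent state: allParquet set, but no formats recorded.
    System.out.println(overlapInitThrows(EnumSet.noneOf(FileFormat.class), true));
  }
}
```

In this sketch the first call does not throw and the second one does, which matches the shape of the stack trace above. The empty file-format set is only an assumed trigger chosen for illustration.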

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
