Fang-Yu Rao created IMPALA-13467:
------------------------------------
Summary: test_min_max_filters() failed due to NullPointerException
Key: IMPALA-13467
URL: https://issues.apache.org/jira/browse/IMPALA-13467
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 4.5.0
Reporter: Fang-Yu Rao
Assignee: Peter Rozsa
We found that the following query in
[min_max_filters.test|https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test]
can fail with a NullPointerException during planning.
{code:java}
---- QUERY
SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS;
select * from functional_parquet.iceberg_partitioned i1,
functional_parquet.iceberg_partitioned i2
where i1.action = i2.action and
i1.id = i2.id and
i2.event_time = '2020-01-01 10:00:00';
---- RUNTIME_PROFILE
row_regex:.* RF00.\[min_max\] -> i1\.action.*
{code}
The stack trace is below.
{code:java}
I1018 18:26:21.967474 15092 Frontend.java:2190]
2449ca58b6c7b2c3:20e13eca00000000] Analyzing query: select * from
functional_parquet.iceberg_partitioned i1,
functional_parquet.iceberg_partitioned i2
where i1.action = i2.action and
i1.id = i2.id and
i2.event_time = '2020-01-01 10:00:00' db: functional_kudu
I1018 18:26:21.967491 15092 Frontend.java:2202]
2449ca58b6c7b2c3:20e13eca00000000] The original executor group sets from
executor membership snapshot: [TExecutorGroupSet(curr_num_ex
I1018 18:26:21.967509 15092 RequestPoolService.java:200]
2449ca58b6c7b2c3:20e13eca00000000] Default pool only, scheduler allocation is
not specified.
I1018 18:26:21.967532 15092 Frontend.java:2222]
2449ca58b6c7b2c3:20e13eca00000000] A total of 2 executor group sets to be
considered for auto-scaling: [TExecutorGroupSet(curr_num_ex
I1018 18:26:21.967546 15092 Frontend.java:2263]
2449ca58b6c7b2c3:20e13eca00000000] Consider executor group set:
TExecutorGroupSet(curr_num_executors:3, expected_num_executors:20, ex
I1018 18:26:21.968324 15092 AnalysisContext.java:521]
2449ca58b6c7b2c3:20e13eca00000000] Analysis took 0 ms
I1018 18:26:21.968353 15092 BaseAuthorizationChecker.java:114]
2449ca58b6c7b2c3:20e13eca00000000] Authorization check took 0 ms
I1018 18:26:21.968367 15092 Frontend.java:2599]
2449ca58b6c7b2c3:20e13eca00000000] Analysis and authorization finished.
I1018 18:26:21.968899 15092 IcebergScanPlanner.java:846]
2449ca58b6c7b2c3:20e13eca00000000] Push down the predicate:
ref(name="event_time") == 1577901600000000 to iceberg
I1018 18:26:21.969009 15092 SnapshotScan.java:124]
2449ca58b6c7b2c3:20e13eca00000000] Scanning table
hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned snapshot
I1018 18:26:21.969400 15092 LoggingMetricsReporter.java:38]
2449ca58b6c7b2c3:20e13eca00000000] Received metrics report:
ScanReport{tableName=hdfs://localhost:20500/test-warehouse/ic
I1018 18:26:21.969846 15092 jni-util.cc:321] 2449ca58b6c7b2c3:20e13eca00000000]
java.lang.NullPointerException
at
com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903)
at
org.apache.impala.planner.HdfsScanNode.initOverlapPredicate(HdfsScanNode.java:845)
at
org.apache.impala.planner.RuntimeFilterGenerator.assignRuntimeFilters(RuntimeFilterGenerator.java:1257)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1159)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1162)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1157)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateFiltersRecursive(RuntimeFilterGenerator.java:1162)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateFilters(RuntimeFilterGenerator.java:1091)
at
org.apache.impala.planner.RuntimeFilterGenerator.generateRuntimeFilters(RuntimeFilterGenerator.java:918)
at
org.apache.impala.planner.Planner.createPlanFragments(Planner.java:160)
at org.apache.impala.planner.Planner.createPlans(Planner.java:310)
at
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1969)
at
org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:2968)
at
org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2730)
at
org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2269)
at
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2030)
at
org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:175)
{code}
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java]
was recently changed in IMPALA-12861, so this failure may be related.
The NullPointerException was thrown in initOverlapPredicate() of
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java]
due to 'statsTuple_' being null.
{code:java}
public void initOverlapPredicate(Analyzer analyzer) {
  if (!allParquet_) return;
  Preconditions.checkNotNull(statsTuple_);
  // ..
}
{code}
'statsTuple_' is set in computeStatsTupleAndConjuncts() of
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java],
which is called from init() of
[HdfsScanNode.java|https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java]
only if hasParquet(fileFormats_) or hasOrc(fileFormats_) evaluates to true.
Since the check in initOverlapPredicate() is reached only when 'allParquet_' is
true, I am wondering whether 'statsTuple_' can be left unpopulated for some
reason even though 'allParquet_' is true.
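
To make the suspected mismatch concrete, here is a minimal, self-contained sketch. It is not Impala's code: ScanNodeModel, FileFormat, and overlapInitThrows() are hypothetical stand-ins, and the case where 'allParquet' is true while the format set contains no Parquet entry is an assumption used only to illustrate how the two guards could disagree. It does not claim to be the actual root cause.

```java
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;

public class StatsTupleSketch {
  // Hypothetical stand-in for the file formats tracked by the scan node.
  enum FileFormat { PARQUET, ORC, TEXT }

  // Simplified model of the two guards described above (not Impala's class).
  static class ScanNodeModel {
    private final Set<FileFormat> fileFormats;
    private final boolean allParquet;
    private Object statsTuple;  // stands in for statsTuple_

    ScanNodeModel(Set<FileFormat> fileFormats, boolean allParquet) {
      this.fileFormats = fileFormats;
      this.allParquet = allParquet;
    }

    // Mirrors init(): the stats tuple is created only for Parquet or ORC.
    void init() {
      if (fileFormats.contains(FileFormat.PARQUET)
          || fileFormats.contains(FileFormat.ORC)) {
        statsTuple = new Object();
      }
    }

    // Mirrors initOverlapPredicate(): guarded by a different flag.
    void initOverlapPredicate() {
      if (!allParquet) return;
      Objects.requireNonNull(statsTuple, "statsTuple");
    }
  }

  // Returns true if the overlap-predicate step throws a NullPointerException.
  static boolean overlapInitThrows(Set<FileFormat> formats, boolean allParquet) {
    ScanNodeModel node = new ScanNodeModel(formats, allParquet);
    node.init();
    try {
      node.initOverlapPredicate();
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Guards agree: Parquet files present and allParquet is true.
    System.out.println(overlapInitThrows(EnumSet.of(FileFormat.PARQUET), true));
    // Guards disagree (assumed scenario): no formats recorded, allParquet true.
    System.out.println(overlapInitThrows(EnumSet.noneOf(FileFormat.class), true));
  }
}
```

In this model the first call completes normally and the second throws, because the two methods are guarded by different conditions. If something similar can happen in the real planner, either aligning the guards or returning early when the stats tuple is absent might avoid the failure, but that would need to be confirmed against the actual code.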