Hello Balazs Hevele, Zoltan Borok-Nagy, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/24464
to look at the new patch set (#2).
Change subject: IMPALA-15076: Fix JDBC scan duplication in UNION ALL and
ResultSet race
......................................................................
IMPALA-15076: Fix JDBC scan duplication in UNION ALL and ResultSet race
After the IMPALA-14523 patch (shared JDBC cursor fetching), UNION ALL
queries with JDBC tables produced incorrect results due to
duplicate table scans. The scheduler picked the executor for each JDBC
scan node based on its node_id, so the scan nodes of a single union could
land on different executors, causing UNPARTITIONED union fragments to
spawn instances on multiple executors. Each instance would execute all
scan nodes in the union, but only one had assigned ranges, leading to
3x result duplication in the affected queries.
The fix changes JDBC executor selection to hash fragment_idx instead of
node_id, ensuring all JDBC scans within a union fragment are co-located
on the same executor. This preserves the shared connection optimization
while preventing duplicate reads.
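
To illustrate the co-location argument above, here is a minimal sketch. It
is not the code from be/src/scheduling/scheduler.cc: the class name, the
method names, and the modulo-based hash are assumptions made only for this
example. It shows that keying the choice on each scan's node id can spread
the scans of one fragment over several executors, while keying on the
fragment index always yields a single executor.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; the real selection logic is C++ in the scheduler.
final class JdbcExecutorSelectionSketch {
    // Maps a key to an executor index; floorMod keeps the result non-negative.
    static int pickExecutor(int key, int numExecutors) {
        return Math.floorMod(Integer.hashCode(key), numExecutors);
    }

    // Executors used when each scan node is placed by its own node id.
    static Set<Integer> executorsByNodeId(int[] nodeIds, int numExecutors) {
        Set<Integer> used = new LinkedHashSet<>();
        for (int nodeId : nodeIds) {
            used.add(pickExecutor(nodeId, numExecutors));
        }
        return used;
    }

    // Executors used when every scan node in a fragment is placed by the fragment index.
    static Set<Integer> executorsByFragmentIdx(int fragmentIdx, int[] nodeIds, int numExecutors) {
        Set<Integer> used = new LinkedHashSet<>();
        for (int ignored : nodeIds) {
            used.add(pickExecutor(fragmentIdx, numExecutors));
        }
        return used;
    }
}
```

With three scan nodes in one fragment and three executors, the node-id
variant can use all three executors, whereas the fragment-index variant
uses exactly one regardless of how many scans the fragment contains.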
Additionally, fixed a race condition in JdbcRecordIterator where multiple
threads could call ResultSet.next() after end-of-stream. Per the JDBC spec,
the outcome of such a call is vendor-specific for TYPE_FORWARD_ONLY result
sets: the driver may return false or throw SQLException. Added an
endOfStream flag guarded by fetchLock to prevent post-EOS next() calls.
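
The guard described above could look roughly like the following sketch. It
is not the actual JdbcRecordIterator code: the class EosGuardedCursor and
its advance() method are hypothetical, and only the names fetchLock and
endOfStream are taken from the description. The idea is that once next()
has returned false, the flag short-circuits every later call under the
same lock, so the driver never sees a post-EOS next().

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper; the real change lives in JdbcRecordIterator.
final class EosGuardedCursor {
    private final ResultSet resultSet;
    private final ReentrantLock fetchLock = new ReentrantLock();
    private boolean endOfStream = false; // guarded by fetchLock

    EosGuardedCursor(ResultSet resultSet) {
        this.resultSet = resultSet;
    }

    // Returns true if the cursor moved to a new row, false at or after end-of-stream.
    boolean advance() throws SQLException {
        fetchLock.lock();
        try {
            if (endOfStream) {
                return false; // never touch the ResultSet again after EOS
            }
            if (!resultSet.next()) {
                endOfStream = true;
                return false;
            }
            return true;
        } finally {
            fetchLock.unlock();
        }
    }
}
```

Checking and setting the flag inside the same critical section as next()
matters here: a check outside the lock would still let a second thread
slip in between the first thread's final next() and the flag update.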
Testing:
- Core TPC-DS JDBC tests now pass
- All tests passed with the exhaustive exploration strategy on a release build
- Added JDBC UNION ALL planner tests
Change-Id: I60ed011faa2177af67ea681c2cd2967648e4a963
---
M be/src/exec/data-source-scan-node.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/jdbc-parallel.test
7 files changed, 212 insertions(+), 46 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/24464/2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60ed011faa2177af67ea681c2cd2967648e4a963
Gerrit-Change-Number: 24464
Gerrit-PatchSet: 2
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Balazs Hevele <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>