Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/24464 )
Change subject: IMPALA-15076: Fix JDBC scan duplication in UNION ALL and ResultSet race
......................................................................

IMPALA-15076: Fix JDBC scan duplication in UNION ALL and ResultSet race

After the IMPALA-14523 patch (shared JDBC cursor fetching), UNION ALL
queries with JDBC tables produced incorrect results due to duplicate
table scans. The scheduler assigned each JDBC scan node to a different
executor based on node_id, causing UNPARTITIONED union fragments to
spawn instances on multiple executors. Each instance would execute all
scan nodes in the union, but only one had assigned ranges, leading to
3x result duplication in queries.

The fix changes JDBC executor selection to hash fragment_idx instead of
node_id, ensuring all JDBC scans within a union fragment are co-located
on the same executor. This preserves the shared connection optimization
while preventing duplicate reads.

Additionally, fixed a race condition in JdbcRecordIterator where
multiple threads could call ResultSet.next() after end-of-stream. Per
the JDBC spec, this behavior is vendor-specific for TYPE_FORWARD_ONLY
result sets and may throw SQLException. Added an endOfStream flag
guarded by fetchLock to prevent post-EOS next() calls.
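
The end-of-stream guard described above can be illustrated with a minimal
Java sketch. This is not the actual JdbcRecordIterator code: the Cursor
interface, StrictCursor, GuardedFetcher, and countRows are hypothetical
names introduced only for illustration, and Cursor stands in for the single
ResultSet.next() call. Only the endOfStream flag and fetchLock names come
from the commit message. A real iterator would presumably also read the row
values while holding the lock.

```java
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicInteger;

final class EndOfStreamGuardSketch {
  /** Hypothetical stand-in for the one ResultSet method that matters here. */
  interface Cursor {
    boolean next() throws SQLException;
  }

  /** Mimics a strict driver: calling next() after end-of-stream throws. */
  static final class StrictCursor implements Cursor {
    private int remaining;
    private boolean exhausted;

    StrictCursor(int rows) { remaining = rows; }

    @Override
    public boolean next() throws SQLException {
      if (exhausted) throw new SQLException("next() called after end of stream");
      if (remaining > 0) {
        remaining--;
        return true;
      }
      exhausted = true;
      return false;
    }
  }

  /** Shares one cursor between threads and never calls next() past EOS. */
  static final class GuardedFetcher {
    private final Object fetchLock = new Object();
    private final Cursor cursor;
    private boolean endOfStream; // guarded by fetchLock

    GuardedFetcher(Cursor cursor) { this.cursor = cursor; }

    boolean fetchNext() throws SQLException {
      synchronized (fetchLock) {
        if (endOfStream) return false;   // short-circuit: do not touch the cursor
        if (cursor.next()) return true;
        endOfStream = true;              // remember EOS for every later caller
        return false;
      }
    }
  }

  /** Drains a cursor of `rows` rows from `threads` threads; returns rows seen. */
  static int countRows(int rows, int threads) {
    GuardedFetcher fetcher = new GuardedFetcher(new StrictCursor(rows));
    AtomicInteger fetched = new AtomicInteger();
    AtomicInteger failures = new AtomicInteger();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          while (fetcher.fetchNext()) fetched.incrementAndGet();
        } catch (SQLException e) {
          failures.incrementAndGet();
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    if (failures.get() != 0) {
      throw new IllegalStateException("a post-EOS next() call reached the cursor");
    }
    return fetched.get();
  }

  public static void main(String[] args) {
    System.out.println(countRows(1000, 4));
  }
}
```

Without the endOfStream check, every worker that loses the race for the
last row would call next() on an exhausted cursor, and StrictCursor would
throw, which is the vendor-specific failure mode the commit message
describes.
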
Testing:
- Core TPC-DS JDBC tests now pass
- All tests passed with exhaustive exploration strategy using release build
- Added JDBC UNION ALL planner tests

Change-Id: I60ed011faa2177af67ea681c2cd2967648e4a963
Reviewed-on: http://gerrit.cloudera.org:8080/24464
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/data-source-scan-node.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/jdbc-parallel.test
7 files changed, 212 insertions(+), 46 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/24464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I60ed011faa2177af67ea681c2cd2967648e4a963
Gerrit-Change-Number: 24464
Gerrit-PatchSet: 4
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Balazs Hevele <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
