Abhishek Ravi created DRILL-6949:
------------------------------------

             Summary: Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled
                 Key: DRILL-6949
                 URL: https://issues.apache.org/jira/browse/DRILL-6949
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.15.0
            Reporter: Abhishek Ravi

The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates)* on TPC-H SF100 data.
{code:sql}
set `exec.hashjoin.enable.runtime_filter` = true;
set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
set `planner.enable_broadcast_join` = false;

select
  count(*)
from
  lineitem l1
where
  l1.l_discount IN (
    select distinct(cast(l2.l_discount as double))
    from lineitem l2);

reset `exec.hashjoin.enable.runtime_filter`;
reset `exec.hashjoin.runtime_filter.max.waiting.time`;
reset `planner.enable_broadcast_join`;
{code}

The subquery contains the *distinct* keyword, so there should not be duplicate values. I suspect that the failure is caused by semijoin, because the query succeeds when semijoin is disabled explicitly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)