Hello, this JIRA (SPARK-16951) was already closed with the resolution "Won't Fix" on 23/Feb/17.
However, in our TPC-H tests we hit a performance problem with Q16, which uses a NOT IN subquery that is translated into a broadcast nested loop join. This one query accounts for almost half of the total run time of all 22 queries. For example, on the 512 GB data set the total execution time is 1400 seconds, of which Q16 takes 630 seconds (about 45%). TPC-H is a common Spark SQL performance benchmark, so many users are likely to run into this issue. Would it be possible to reopen this JIRA and fix it?

Thanks,
Linna