Thanks for the pointer, Xiao. I found that the leftanti join type is no longer in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala
FYI

On Fri, Dec 4, 2015 at 12:04 PM, Xiao Li <gatorsm...@gmail.com> wrote:
> https://github.com/apache/spark/pull/9055
>
> This JIRA explains how to convert IN to Joins.
>
> Thanks,
>
> Xiao Li
>
> 2015-12-04 11:27 GMT-08:00 Michael Armbrust <mich...@databricks.com>:
>> The best way to run this today is probably to manually convert the
>> query into a join. I.e. create a dataframe that has all the numbers
>> in it, and join/outer join it with the other table. This way you
>> avoid parsing a gigantic string.
>>
>> On Fri, Dec 4, 2015 at 10:36 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> Have you seen this JIRA?
>>>
>>> [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of
>>> children
>>>
>>> From the numbers Michael published, 1 million numbers would still
>>> need 250 seconds to parse.
>>>
>>> On Fri, Dec 4, 2015 at 10:14 AM, Madabhattula Rajesh Kumar
>>> <mrajaf...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> What are the best practices for using an "IN" clause in Spark SQL?
>>>>
>>>> Use case: read rows from a table that match a list of numbers --
>>>> for example, 1 million of them.
>>>>
>>>> Regards,
>>>> Rajesh
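For the archives, Michael's suggestion above (build a DataFrame from the list of numbers and join it against the table, rather than parsing a million-element IN string) might look roughly like this. This is only a sketch using the current SparkSession API rather than the SQLContext API of this thread's era; the table name `events`, the column name `id`, and the `spark` session are all illustrative assumptions, not from the thread:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: `spark`, `events`, and `id` are hypothetical names.
val spark = SparkSession.builder().appName("in-to-join").getOrCreate()
import spark.implicits._

// The large list of numbers from the use case (here: 1 million ids).
val ids: Seq[Long] = (1L to 1000000L)

// Instead of SELECT * FROM events WHERE id IN (1, 2, ..., 1000000),
// turn the list into a one-column DataFrame and join on it.
val idsDF = ids.toDF("id")
val events = spark.table("events")

// An inner join keeps exactly the rows whose id is in the list,
// matching IN semantics; a "left_anti" join type would instead give
// NOT IN semantics (which is what the joinTypes.scala remark is about).
val matched = events.join(idsDF, Seq("id"))
matched.show()
```

For very large lists, broadcasting `idsDF` (or letting Spark pick a broadcast join automatically when it is small enough) avoids a full shuffle of the big table.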