My feeling is that either a temp table or putting the 100k values into a separate parquet file makes more sense than putting 100k values in an IN list. Although the Drill planner will convert such a long IN list into a JOIN (which is the same as the temp table / parquet table solutions), there is a big difference in what the query plan looks like. An IN list with 100k values becomes part of the plan, so all of those values have to be serialized and de-serialized before the plan can be executed. I guess that would create a huge serialized plan, which is not the best solution one may use.
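To make the difference concrete, here is a rough sketch in Java (since the question mentions building queries from Java code). It only builds the two query strings; it does not talk to Drill. The class name, the table paths (`dfs.`/data/events``, `dfs.`/data/in_values``) and the column name `id` are hypothetical placeholders, not anything from the original setup.

```java
// Sketch only: compares the query text of an inline IN list with an IN subquery
// that reads the values from a separate (hypothetical) parquet table.
final class InListRewrite {

    // Builds the query with every value inlined into the IN list.
    // The values here are just 0..n-1, as stand-ins for real keys.
    static String inListQuery(String mainTable, String keyColumn, int n) {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT t.* FROM ").append(mainTable)
          .append(" t WHERE t.").append(keyColumn).append(" IN (");
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(i);
        }
        return sb.append(")").toString();
    }

    // Builds the query that reads the values from another table instead.
    // The query text stays the same size no matter how many values there are.
    static String subqueryQuery(String mainTable, String keyColumn,
                                String valuesTable, String valuesColumn) {
        return "SELECT t.* FROM " + mainTable + " t "
             + "WHERE t." + keyColumn + " IN (SELECT v." + valuesColumn
             + " FROM " + valuesTable + " v)";
    }

    public static void main(String[] args) {
        // Hypothetical paths and column names, for illustration only.
        String inline = inListQuery("dfs.`/data/events`", "id", 100_000);
        String viaTable = subqueryQuery("dfs.`/data/events`", "id",
                                        "dfs.`/data/in_values`", "id");
        System.out.println("inline IN list query length: " + inline.length());
        System.out.println("subquery query length:       " + viaTable.length());
    }
}
```

The point of the second form is that the 100k values live in a table that Drill scans at execution time, rather than being carried around inside the query text and the plan.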
Also, putting 100k values in an IN list may not be very typical. RDBMSs probably impose limits on the number of values in an IN list. For instance, Oracle sets the limit to 1000 [1].

1. http://docs.oracle.com/database/122/SQLRF/Expression-Lists.htm#SQLRF52099

On Mon, May 15, 2017 at 7:11 PM, <jasbir.s...@accenture.com> wrote:
> Hi,
>
> I am stuck on a problem where an instance of Apache Drill stops working. My
> topic of discussion will be -
>
> For a scenario, I have 25 parquet files with around 400K-500K records and
> around 10 columns. My select query is such that for one column the IN clause
> has around 100K values. When I run these queries in parallel, the instance of
> Apache Drill hangs and then gets shut down. Therefore, how should the select
> queries be designed so that Apache Drill supports them?
> One of the solutions that we are trying is -
> a - Create a temp table of 100K values and then use this as an inner query. But
> as I know we can't make a temp table at run time from Java code. It needs some
> data source, either parquet or some other source, to create the temp table.
> b - Create a separate parquet file of all 100K values and use an inner query
> instead of all the values directly in the main query.
>
> Is there any better way to go around this problem or can we just solve this
> problem with simple configuration changes?
>
> Regards,
> Jasbir Singh
>
>
> -----Original Message-----
> From: Jinfeng Ni [mailto:j...@apache.org]
> Sent: Tuesday, May 16, 2017 2:29 AM
> To: dev <dev@drill.apache.org>; user <u...@drill.apache.org>
> Subject: [DRILL HANGOUT] Topics for 5/16/2017
>
> Hi All,
>
> Our bi-weekly Drill hangout is tomorrow (5/16/2017, 10AM PDT). Please respond
> with suggestions of topics for discussion. We will also collect topics at the
> beginning of the hangout tomorrow.
>
> Thanks,
>
> Jinfeng