Julian, I think it's a good idea to always push the limit down. But regarding Junxian's case, there's a simpler fix. This is a bug in the cost computation that I have already reported: https://issues.apache.org/jira/browse/CALCITE-1842 The simple fix is to swap the two parameters. That will solve the problem and push the limit down.
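To illustrate the kind of bug in question, here is a minimal sketch. This is NOT the actual Calcite source; the class, method names, and numbers are hypothetical, though the argument order mirrors Calcite's RelOptCostFactory.makeCost(rowCount, cpu, io). It shows how swapping the first two arguments puts the large n*log(n) CPU figure into the row-count slot, inflating the apparent cost of a pushed-down Sort:

```java
// Minimal sketch, not Calcite's actual code: demonstrates the effect of
// passing (cpu, rowCount) to a makeCost(rows, cpu, io)-style factory.
public class CostSwapDemo {
  // Simplified cost value; in Calcite, the row count dominates comparisons.
  static final class Cost {
    final double rows, cpu, io;
    Cost(double rows, double cpu, double io) {
      this.rows = rows; this.cpu = cpu; this.io = io;
    }
  }

  // Buggy call site: rowCount and cpu swapped, so the large CPU figure
  // lands in the rows slot and inflates the Sort's apparent cost.
  static Cost buggyCost(double rowCount, double cpu) {
    return new Cost(cpu, rowCount, 0);
  }

  // Fixed call site: arguments in the order makeCost(rows, cpu, io) expects.
  static Cost fixedCost(double rowCount, double cpu) {
    return new Cost(rowCount, cpu, 0);
  }

  public static void main(String[] args) {
    double limit = 100;                        // rows kept by the LIMIT
    double cpu = limit * Math.log(limit) * 16; // ~ n log n * bytesPerRow
    System.out.printf("buggy rows slot: %.0f%n", buggyCost(limit, cpu).rows);
    System.out.printf("fixed rows slot: %.0f%n", fixedCost(limit, cpu).rows);
  }
}
```

With the swap, the "rows" component grows like n log n rather than n, so even a modest LIMIT can look more expensive than sorting in Calcite's own memory, which matches the behavior Junxian observed.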
-JD

> On Jun 30, 2017, at 6:29 PM, Julian Hyde <[email protected]> wrote:
>
> Is it true that we'd always want to push the limit down to Druid,
> regardless of whether the limit is large or small? If so, and if this
> is not happening, there is a bug in the cost model; please log it.
>
> On Fri, Jun 30, 2017 at 1:56 PM, Junxian Wu
> <[email protected]> wrote:
>> Hi Dev Community,
>> While using Calcite on Druid, I tried to run a simple query like
>> "select * from table_name limit 100". Although the SELECT query in
>> Druid is very inefficient, by limiting the output rows it can still
>> return the result quickly.
>> However, the cost computation can make the planner prefer to handle the
>> LIMIT RelNode (fetch() in the Sort node) in the Calcite JVM's memory.
>> In my case, when the LIMIT is larger than 7, the limit does not get
>> pushed down, and since the total amount of data is large, the JVM runs
>> out of memory. By changing the cost multiplier applied when the Sort
>> node is pushed down to a smaller number, a larger LIMIT can be pushed
>> down. This logic does not seem correct: when the LIMIT fetches more
>> rows, we should prefer to handle it on the database (Druid) side
>> instead of in memory. Should we redesign the cost computation so it
>> has the correct logic?
>> Thank you.
