OK, contributions welcome.
> On Jul 2, 2017, at 9:25 AM, JD Zheng <[email protected]> wrote:
>
> Julian, I think it’s a good idea to always push down the limit. But regarding
> Junxian’s case, there’s a simpler fix. This is a bug in the cost computation
> that I have already reported:
> https://issues.apache.org/jira/browse/CALCITE-1842
> The simple fix is to switch the two parameters. That will solve the problem
> and push the limit down.
>
> -JD
>
>> On Jun 30, 2017, at 6:29 PM, Julian Hyde <[email protected]> wrote:
>>
>> Is it true that we'd always want to push the limit down to Druid,
>> regardless of whether the limit is large or small? If so, and if this
>> is not happening, there is a bug in the cost model; please log it.
>>
>> On Fri, Jun 30, 2017 at 1:56 PM, Junxian Wu
>> <[email protected]> wrote:
>>> Hi Dev Community,
>>> While using Calcite on Druid, I tried to run a simple query like
>>> "select * from table_name limit 100". Although the SELECT query in Druid is
>>> quite inefficient, limiting the number of output rows still lets it return
>>> results quickly.
>>> However, the current cost computation can make the planner prefer to handle
>>> the LIMIT RelNode (fetch() in the Sort node) in the Calcite JVM's memory.
>>> In my case, when the LIMIT value is larger than 7, the limit does not get
>>> pushed down, and since the total amount of data is large, the JVM runs out
>>> of memory. By changing the cost multiplier applied when the Sort node is
>>> pushed down to a smaller number, a larger LIMIT can be pushed down. This
>>> logic does not seem correct: when the LIMIT fetches more rows, we should
>>> prefer to handle it on the database (Druid) side instead of in memory.
>>> Should we redesign the cost computation so it has the correct logic?
>>> Thank you.
>
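P.S. The class of bug discussed in this thread (two cost parameters passed in the wrong order) can be sketched with a toy calculation. This is a hypothetical illustration, not Calcite's actual cost code: the Cost class, its comparison rule, and every figure below are invented.

```java
// Hypothetical sketch (NOT actual Calcite code): a cost constructor whose two
// arguments are accidentally swapped, so the planner compares distorted costs
// and keeps a LIMIT in the JVM instead of pushing it down.
public class SwappedCostArgsDemo {

    /** Toy cost object; this toy planner compares rowCount first, then cpu. */
    static class Cost {
        final double rowCount;
        final double cpu;
        Cost(double rowCount, double cpu) {
            this.rowCount = rowCount;
            this.cpu = cpu;
        }
        boolean isLessThan(Cost other) {
            return rowCount != other.rowCount
                    ? rowCount < other.rowCount
                    : cpu < other.cpu;
        }
    }

    static final double TABLE_ROWS = 1_000_000d;
    static final double BYTES_PER_ROW = 10_000d;   // invented figure

    /** Cost of sorting/limiting the full table in the Calcite JVM. */
    static Cost inMemoryCost() {
        return new Cost(TABLE_ROWS,
                TABLE_ROWS * Math.log(TABLE_ROWS) * BYTES_PER_ROW);
    }

    /** Pushed-down LIMIT: only `fetch` rows come back; arguments in order. */
    static Cost pushedCost(double fetch) {
        return new Cost(fetch, fetch * Math.log(fetch) * BYTES_PER_ROW);
    }

    /** Same two values, accidentally passed in the wrong order. */
    static Cost pushedCostSwapped(double fetch) {
        return new Cost(fetch * Math.log(fetch) * BYTES_PER_ROW, fetch);
    }

    public static void main(String[] args) {
        // With arguments in order, pushing LIMIT 100 always looks cheaper.
        System.out.println(pushedCost(100).isLessThan(inMemoryCost()));        // true

        // With swapped arguments, the cpu estimate lands in the rowCount slot,
        // so past a small threshold the pushed plan looks worse than in-memory.
        System.out.println(pushedCostSwapped(7).isLessThan(inMemoryCost()));   // true
        System.out.println(pushedCostSwapped(100).isLessThan(inMemoryCost())); // false
    }
}
```

In the swapped variant the cpu estimate occupies the rowCount slot, so once the LIMIT grows past a small threshold the pushed-down plan looks more expensive than sorting the whole table in memory, mirroring the small cut-off behavior Junxian reports.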