OK, contributions welcome.
> On Jul 2, 2017, at 9:25 AM, JD Zheng <[email protected]> wrote:
>
> Julian, I think it’s a good idea to always push down the limit. But regarding
> Junxian’s case, there’s a simpler fix. This is a bug in the cost computation
> that I have already reported:
> https://issues.apache.org/jira/browse/CALCITE-1842
> The simple fix is to switch the two parameters. That will solve the problem
> and push the limit down.
>
> -JD
>
>> On Jun 30, 2017, at 6:29 PM, Julian Hyde <[email protected]> wrote:
>>
>> Is it true that we'd always want to push the limit down to Druid,
>> regardless of whether the limit is large or small? If so, and if this
>> is not happening, there is a bug in the cost model; please log it.
>>
>> On Fri, Jun 30, 2017 at 1:56 PM, Junxian Wu
>> <[email protected]> wrote:
>>> Hi Dev Community,
>>> While using Calcite on Druid, I tried to run a simple query like
>>> "select * from table_name limit 100". Although the SELECT query in Druid is
>>> quite inefficient, limiting the number of output rows still lets it return
>>> results quickly.
>>> However, the current cost computation can make the planner prefer to handle
>>> the LIMIT RelNode (fetch() in the Sort node) in the Calcite JVM's memory.
>>> In my case, when the LIMIT value is larger than 7, the limit does not get
>>> pushed down, and since the total amount of data is large, the JVM runs out
>>> of memory. By changing the cost multiplier applied when the Sort node is
>>> pushed down to a smaller number, a larger LIMIT can be pushed down. This
>>> logic does not seem correct: when the LIMIT fetches more rows, we should
>>> prefer to handle it on the database (Druid) side instead of in memory.
>>> Should we redesign the cost computation so it has the correct logic?
>>> Thank you.
>
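P.S. The class of bug discussed in this thread (two cost parameters passed in the wrong order) can be sketched with a toy calculation. This is a hypothetical illustration, not Calcite's actual cost code: the Cost class, its comparison rule, and every figure below are invented.

```java
// Hypothetical sketch (NOT actual Calcite code): a cost constructor whose two
// arguments are accidentally swapped, so the planner compares distorted costs
// and keeps a LIMIT in the JVM instead of pushing it down.
public class SwappedCostArgsDemo {

    /** Toy cost object; this toy planner compares rowCount first, then cpu. */
    static class Cost {
        final double rowCount;
        final double cpu;
        Cost(double rowCount, double cpu) {
            this.rowCount = rowCount;
            this.cpu = cpu;
        }
        boolean isLessThan(Cost other) {
            return rowCount != other.rowCount
                    ? rowCount < other.rowCount
                    : cpu < other.cpu;
        }
    }

    static final double TABLE_ROWS = 1_000_000d;
    static final double BYTES_PER_ROW = 10_000d;   // invented figure

    /** Cost of sorting/limiting the full table in the Calcite JVM. */
    static Cost inMemoryCost() {
        return new Cost(TABLE_ROWS,
                TABLE_ROWS * Math.log(TABLE_ROWS) * BYTES_PER_ROW);
    }

    /** Pushed-down LIMIT: only `fetch` rows come back; arguments in order. */
    static Cost pushedCost(double fetch) {
        return new Cost(fetch, fetch * Math.log(fetch) * BYTES_PER_ROW);
    }

    /** Same two values, accidentally passed in the wrong order. */
    static Cost pushedCostSwapped(double fetch) {
        return new Cost(fetch * Math.log(fetch) * BYTES_PER_ROW, fetch);
    }

    public static void main(String[] args) {
        // With arguments in order, pushing LIMIT 100 always looks cheaper.
        System.out.println(pushedCost(100).isLessThan(inMemoryCost()));        // true

        // With swapped arguments, the cpu estimate lands in the rowCount slot,
        // so past a small threshold the pushed plan looks worse than in-memory.
        System.out.println(pushedCostSwapped(7).isLessThan(inMemoryCost()));   // true
        System.out.println(pushedCostSwapped(100).isLessThan(inMemoryCost())); // false
    }
}
```

In the swapped variant the cpu estimate occupies the rowCount slot, so once the LIMIT grows past a small threshold the pushed-down plan looks more expensive than sorting the whole table in memory, mirroring the small cut-off behavior Junxian reports.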