Is it true that we'd always want to push the limit down to Druid,
regardless of whether the limit is large or small? If so, and if this
is not happening, there is a bug in the cost model; please log it.
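To make the intuition concrete, here is a toy sketch of why push-down should always win. The numbers and formulas are purely illustrative assumptions, not Calcite's actual cost model: if the limit stays in Calcite, Druid streams the whole table and the JVM applies the fetch; if the limit is pushed down, Druid returns at most `limit` rows.

```python
# Toy cost comparison (hypothetical numbers, NOT Calcite's real cost model).
# Assumes the dominant cost is the number of rows transferred out of Druid.

TOTAL_ROWS = 1_000_000  # assumed size of the Druid table

def cost_limit_in_calcite(limit):
    # LIMIT applied in the Calcite JVM: Druid must stream every row first.
    return TOTAL_ROWS

def cost_limit_pushed(limit, multiplier=1.0):
    # LIMIT pushed into the Druid query: only `limit` rows come back.
    # `multiplier` stands in for the cost factor the planner applies to a
    # Sort/fetch that has been pushed into the DruidQuery.
    return limit * multiplier

# For any limit smaller than the table, push-down should be the cheaper plan.
for limit in (7, 100, 10_000):
    assert cost_limit_pushed(limit) < cost_limit_in_calcite(limit)
```

If the planner ever chooses the in-memory plan here, the relative costs it computed must be inverted, which is exactly the kind of cost-model bug worth logging.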

On Fri, Jun 30, 2017 at 1:56 PM, Junxian Wu
<[email protected]> wrote:
> Hi Dev Community,
> While using Calcite on Druid, I tried to run a simple query like
> "select * from table_name limit 100". Although Druid's SELECT query is
> quite inefficient, limiting the number of output rows still lets it
> return results quickly.
> However, the current cost computation makes the planner prefer to handle
> the LIMIT RelNode (fetch() in the Sort node) in the Calcite JVM's memory.
> In my case, when the LIMIT is larger than 7, the limit is not pushed
> down, and since the total amount of data is large, the JVM runs out of
> memory. By changing the cost multiplier applied when the Sort node is
> pushed down to a smaller number, a larger LIMIT can be pushed down. This
> logic does not seem correct: when the LIMIT fetches more rows, we should
> prefer to handle it on the database (Druid) side rather than in memory.
> Should we redesign the cost computation so it has the correct logic?
> Thank you.