[ 
https://issues.apache.org/jira/browse/CALCITE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378927#comment-17378927
 ] 

ZiLin Chen commented on CALCITE-4682:
-------------------------------------

How about total cost? The concept of start up cost and total cost could be 
found in PostgreSQL.

> Cost of limit operator may breaks total cost assumption
> -------------------------------------------------------
>
>                 Key: CALCITE-4682
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4682
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: ZiLin Chen
>            Priority: Major
>
> Total cost assumption means each operator wound run to completion, so the 
> actual running cost of a plan will be equal to the cumulative cost. However, 
> cost of limit operator may breaks this assumption.
> Any way calcite can provide to solve the problem about cost of limit operator?
> Limit operator can affect how the join below to choose specific algorithm by 
> start up cost instead of total cost.
> However volcano planner can only deal with the total cost, which works well 
> in OLAP system, but OLTP system need some kind of cost model to solve this 
> problem.
>  
> Given a sql like: select * from A join B on A.id = B.id limit 1.
> Limit
>  - Join 
>   - TableScan A  (10,000,000 rows)
>   - TableScan B (10,000,000 rows)
>  
> Consider two join algorithms HashJoin and IndexNestedLoopJoin.
> It is more efficient to use IndexNestedLoopJoin instead of HashJoin, because 
> of we just need to fetch some rows from A then index look up B. when the 
> number of join output reach limit fetch size, it is over. However using hash 
> join need to build a hash table with 10,000,000 rows. 
>  
> Now coming to VolcanoPlanner, the best cost of this sql will be computed as 
> cost of limit operator plus cumulative cost of join. If we only consider 
> cumulative cost, then HashJoin wound likely be chosen for two large tables 
> join. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to