Re: Question on Optiq CBO

Julian Hyde Mon, 11 Aug 2014 13:04:18 -0700

Yes. Optiq separates metadata (e.g. cardinality, i.e. how many rows a RelNode 
is expected to return) from cost. The default cost model is based on 
cardinality, but you can define other cost models.


You can see this in action in EnumerableJoinRel.computeSelfCost [1]. Even 
though joins commute (i.e. X join Y returns the same results as Y join X), an 
EnumerableJoinRel will be cheaper if the input with fewer rows is on the left. 
We chose a cost model to reflect that.


      // Cheaper if the smaller number of rows is coming from the LHS.
      // Model this by adding L log L to the cost.
      final double rightRowCount = right.getRows();
      final double leftRowCount = left.getRows();
      if (Double.isInfinite(leftRowCount)) {
        rowCount = leftRowCount;
      } else {
        rowCount += Util.nLogN(leftRowCount);
      }
      if (Double.isInfinite(rightRowCount)) {
        rowCount = rightRowCount;
      } else {
        rowCount += rightRowCount;
      }
      return planner.getCostFactory().makeCost(rowCount, 0, 0);
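
To make the asymmetry concrete, here is a minimal standalone sketch of the same arithmetic, with no Optiq classes involved. The class name JoinCostSketch and the method selfCost are hypothetical, and the local nLogN helper is only an assumed stand-in for Util.nLogN (here n * ln(n), falling back to n for very small inputs).

```java
final class JoinCostSketch {
  // Assumed stand-in for Util.nLogN: n * ln(n), or n itself when n < e,
  // so that the result never drops below the input for tiny row counts.
  static double nLogN(double n) {
    return n < Math.E ? n : n * Math.log(n);
  }

  // Mirrors the excerpt above: start from the join's own row count,
  // add L log L for the left input and R for the right input.
  static double selfCost(double rowCount, double leftRowCount,
      double rightRowCount) {
    if (Double.isInfinite(leftRowCount)) {
      rowCount = leftRowCount;
    } else {
      rowCount += nLogN(leftRowCount);
    }
    if (Double.isInfinite(rightRowCount)) {
      rowCount = rightRowCount;
    } else {
      rowCount += rightRowCount;
    }
    return rowCount;
  }

  public static void main(String[] args) {
    // Same two inputs (10 rows and 1000 rows), same output estimate (500 rows);
    // only the left/right order differs.
    System.out.println("small on left: " + selfCost(500, 10, 1000));
    System.out.println("large on left: " + selfCost(500, 1000, 10));
  }
}
```

With these made-up numbers the small-on-left ordering costs roughly 1,523 and the large-on-left ordering roughly 7,418, so the planner would pick the former even though both return the same rows.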

The simplest way to change the cost model is to override (or edit) 
RelNode.computeSelfCost. Metadata and cost are also pluggable [2].
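
For the case in your dump, where two projections tie on row count, one possible approach is a cost that also charges for per-row expression work, so that the constant-folded projection comes out cheaper. The following is only a standalone sketch of that idea, not Optiq code: TieBreakSketch, Alternative, project and best are all hypothetical names, and the cpu formula is an assumption chosen for illustration.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class TieBreakSketch {
  // Hypothetical plan alternative: a label plus the row and cpu parts of a cost.
  static final class Alternative {
    final String label;
    final double rows;
    final double cpu;

    Alternative(String label, double rows, double cpu) {
      this.label = label;
      this.rows = rows;
      this.cpu = cpu;
    }
  }

  // Assumed self-cost for a projection: one unit of cpu per input row, plus
  // one more unit per row for each operator call the projection evaluates.
  static Alternative project(String label, double inputRows, int operatorCalls) {
    return new Alternative(label, inputRows, inputRows * (1 + operatorCalls));
  }

  // Compare on rows first; break ties on cpu.
  static final Comparator<Alternative> BY_COST =
      Comparator.comparingDouble((Alternative a) -> a.rows)
          .thenComparingDouble(a -> a.cpu);

  static Alternative best(List<Alternative> alternatives) {
    return Collections.min(alternatives, BY_COST);
  }

  public static void main(String[] args) {
    Alternative unfolded = project("EXPR$0=+(4, 2)", 100, 1); // one call to +
    Alternative folded = project("EXPR$0=6", 100, 0);         // literal only
    System.out.println(best(List.of(unfolded, folded)).label);
  }
}
```

Both alternatives have the same row count, so the cpu component decides, and the projection of the literal 6 wins. In a real planner the equivalent change would live in the projection's computeSelfCost and in how costs are compared.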

Julian

[1] https://github.com/apache/incubator-optiq/blob/5d209a509b6e2ca895626105950906cad154fac9/core/src/main/java/net/hydromatic/optiq/rules/java/JavaRules.java#L167

[2] https://issues.apache.org/jira/browse/OPTIQ-362


On Aug 11, 2014, at 2:45 AM, Ravi Nallappan <[email protected]> wrote:

> Hi,
> 
> 
> 
> Is there a way in Optiq to promote an expression (based on our custom rule)
> as the best expression even if the two have the same cost (in terms of
> cardinality)?
> 
> 
> 
> ------------
> 
> Sets:
> 
> Set#0, type: RecordType(JavaType(class java.lang.Integer) EMPNO,
> JavaType(class java.lang.String) NAME, JavaType(class java.lang.Integer)
> DEPTNO, JavaType(class java.lang.String) GENDER, JavaType(class
> java.lang.String) CITY, JavaType(class java.lang.Integer) EMPID,
> JavaType(class java.lang.Integer) AGE, JavaType(class java.lang.Boolean)
> SLACKER, JavaType(class java.lang.Boolean) MANAGER, JavaType(class
> java.lang.String) JOINEDAT)
> 
>                rel#4:Subset#0.ENUMERABLE.[], best=rel#0,
> importance=0.7290000000000001
> 
>                                rel#0:CsvTableScan.ENUMERABLE.[](table=[SALES,
> EMPS],fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), rowcount=100.0, cumulative
> cost={100.0 rows, 101.0 cpu, 0.0 io}
> 
> Set#1, type: RecordType()
> 
>                rel#6:Subset#1.NONE.[], best=null, importance=0.81
> 
> 
> rel#5:ProjectRel.NONE.[](child=rel#4:Subset#0.ENUMERABLE.[]),
> rowcount=100.0, cumulative cost={inf}
> 
>                rel#13:Subset#1.ENUMERABLE.[], best=rel#21, importance=0.9
> 
> 
> rel#14:AbstractConverter.ENUMERABLE.[](child=rel#6:Subset#1.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=1.7976931348623157E308, cumulative cost={inf}
> 
> 
> rel#17:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[]),
> rowcount=100.0, cumulative cost={200.0 rows, 101.0 cpu, 0.0 io}
> 
>                                rel#21:CsvTableScan.ENUMERABLE.[](table=[SALES,
> EMPS],fields=[]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu,
> 0.0 io}
> 
> Set#2, type: RecordType(INTEGER EXPR$0)
> 
>                rel#8:Subset#2.NONE.[], best=null, importance=0.9
> 
> 
> rel#7:ProjectRel.NONE.[](child=rel#6:Subset#1.NONE.[],EXPR$0=+(4,
> 2)), rowcount=1.7976931348623157E308, cumulative cost={inf}
> 
> 
> rel#18:ProjectRel.NONE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=+(4,
> 2)), rowcount=100.0, cumulative cost={inf}
> 
>                rel#11:Subset#2.ENUMERABLE.[], best=rel#19, importance=1.0
> 
> 
> rel#12:AbstractConverter.ENUMERABLE.[](child=rel#8:Subset#2.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=1.7976931348623157E308, cumulative cost={inf}
> 
> 
> rel#15:EnumerableProjectRel.ENUMERABLE.[](child=rel#13:Subset#1.ENUMERABLE.[],EXPR$0=+(4,
> 2)), importance=0.0, rowcount=100.0, cumulative cost={200.0 rows, 201.0
> cpu, 0.0 io}
> 
> 
> rel#16:EnumerableProjectRel.ENUMERABLE.[](child=rel#13:Subset#1.ENUMERABLE.[],EXPR$0=6),
> rowcount=100.0, cumulative cost={200.0 rows, 201.0 cpu, 0.0 io}
> 
> 
> rel#19:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=+(4,
> 2)), importance=0.0, rowcount=100.0, cumulative cost={200.0 rows, 201.0
> cpu, 0.0 io}
> 
> 
> rel#20:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=6),
> rowcount=100.0, cumulative cost={200.0 rows, 201.0 cpu, 0.0 io} <-- wanted
> this
> 
> -----
> 
> 
> 
> Feel free to ask me for any details that I have left out here.
> 
> 
> 
> PS: This is based on Optiq version 0.5.
> 
> 
> 
> Thanks & Regards,
> 
> Ravi Nallappan
