You should know that RelNode methods such as getRows and computeSelfCost are the “quick and dirty” option. If you need sophisticated statistics, write a metadata provider. See RelMetadataQuery, RowCount and NonCumulativeCost.
I intend to revisit and evolve these interfaces in the 1.5 time frame. See https://issues.apache.org/jira/browse/CALCITE-794 and https://issues.apache.org/jira/browse/CALCITE-604. CALCITE-794 will require some non-backwards-compatible interface changes, but it shouldn’t be too painful.

Julian

> On Jul 24, 2015, at 1:56 PM, Vladimir Sitnikov <[email protected]> wrote:
>
> Costing model indeed begs improvements.
> However, to improve it some real-life cases are required.
>
> It happens so that even current formulas work (Julian-only-knows-how)
> for current unittest/etc cases.
>
> This is not completely new topic, no-one just came up with test cases and
> fixes.
>
>> To set cost I overrode computeSelfCost method where cpu cost and io cost
>> was multiplied by length of row. But it didn't help.
>
> Your next battle would probably be "costing of computation pushdown".
> In other words, costing of computations in java vs costing of the same
> operations performed in your storage.
> I've noticed that Calcite prefers "to just full scan the data sources"
> and join/aggregate in java, however your mileage may vary.
>
> Vladimir
