Hive implementation might be a good reference.

Chin Wei

On 25 Jul 2015 06:30, "Julian Hyde" <[email protected]> wrote:
> You should know that RelNode methods such as getRows, computeSelfCost are
> the “quick and dirty” option. If you need sophisticated statistics, write a
> metadata provider. See RelMetadataQuery, RowCount, NonCumulativeCost.
>
> I intend to revisit and evolve these interfaces in the 1.5 time frame. See
> https://issues.apache.org/jira/browse/CALCITE-794 and
> https://issues.apache.org/jira/browse/CALCITE-604. 794 will require some
> non-backwards-compatible interface changes but shouldn’t be too painful.
>
> Julian
>
>
> > On Jul 24, 2015, at 1:56 PM, Vladimir Sitnikov <[email protected]> wrote:
> >
> > Costing model indeed begs improvements.
> > However, to improve it some real-life cases are required.
> >
> > It happens so that even current formulas work (Julian-only-knows-how)
> > for current unittest/etc cases.
> >
> > This is not a completely new topic, no-one just came up with test cases
> > and fixes.
> >
> >> To set cost I overrode computeSelfCost method where cpu cost and io cost
> > was multiplied by length of row. But it didn't help.
> >
> > Your next battle would probably be "costing of computation pushdown".
> > In other words, costing of computations in java vs costing of the same
> > operations performed in your storage.
> > I've noticed that Calcite prefers "to just full scan the data sources"
> > and join/aggregate in java, however your mileage may vary.
> >
> > Vladimir
> >
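
To make the "costing of computation pushdown" point concrete, here is a rough sketch of the trade-off in plain Java. It is a toy model, not Calcite code: the class PushdownCostSketch, its Cost type, the three helper methods and the unit weights in total() are all hypothetical and chosen only for illustration. It assumes CPU and I/O scale with row width for rows that are transferred, as in the computeSelfCost experiment described in the thread.

```java
// Toy cost model for illustration only; none of these names are Calcite API.
public class PushdownCostSketch {

    /** Cost with row, CPU and I/O components, added component-wise. */
    static final class Cost {
        final double rows;
        final double cpu;
        final double io;

        Cost(double rows, double cpu, double io) {
            this.rows = rows;
            this.cpu = cpu;
            this.io = io;
        }

        Cost plus(Cost other) {
            return new Cost(rows + other.rows, cpu + other.cpu, io + other.io);
        }

        /** Collapses the components into one number using assumed unit weights. */
        double total() {
            return rows + cpu + io;
        }
    }

    /** Full scan: every row is read and transferred, so CPU and I/O scale with row width. */
    static Cost fullScan(double rowCount, double rowWidthBytes) {
        double bytes = rowCount * rowWidthBytes;
        return new Cost(rowCount, bytes, bytes);
    }

    /** Filter evaluated in Java on top of a full scan of the source. */
    static Cost filterInJava(double rowCount, double rowWidthBytes, double selectivity) {
        // The filter touches every scanned row once and emits the matching ones.
        Cost filter = new Cost(rowCount * selectivity, rowCount, 0);
        return fullScan(rowCount, rowWidthBytes).plus(filter);
    }

    /** Filter pushed into the storage: only matching rows are transferred. */
    static Cost filterInStorage(double rowCount, double rowWidthBytes, double selectivity) {
        double matching = rowCount * selectivity;
        // Storage still inspects every row (CPU), but I/O covers matching rows only.
        return new Cost(matching, rowCount, matching * rowWidthBytes);
    }

    public static void main(String[] args) {
        Cost inJava = filterInJava(1000, 100, 0.1);
        Cost pushed = filterInStorage(1000, 100, 0.1);
        System.out.println("filter in Java:    " + inJava.total());
        System.out.println("filter in storage: " + pushed.total());
    }
}
```

Under these assumed weights the pushed-down filter comes out cheaper whenever the selectivity is below 1. Whether a real planner reaches the same conclusion depends on how it compares costs, which this sketch does not model.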
