The Hive implementation might be a good reference.

Chin Wei
On 25 Jul 2015 06:30, "Julian Hyde" <[email protected]> wrote:

> You should know that RelNode methods such as getRows and computeSelfCost are
> the “quick and dirty” option. If you need sophisticated statistics, write a
> metadata provider. See RelMetadataQuery, RowCount, NonCumulativeCost.
>
> I intend to revisit and evolve these interfaces in the 1.5 time frame. See
> https://issues.apache.org/jira/browse/CALCITE-794 and
> https://issues.apache.org/jira/browse/CALCITE-604. 794 will require some
> non-backwards-compatible interface changes, but they shouldn’t be too painful.
>
> Julian
>
>
> > On Jul 24, 2015, at 1:56 PM, Vladimir Sitnikov <
> [email protected]> wrote:
> >
> > The costing model does indeed need improvement.
> > However, improving it requires some real-life cases.
> >
> > It so happens that even the current formulas work (Julian-only-knows-how)
> > for the current unit tests and similar cases.
> >
> > This is not a completely new topic; it is just that no one has come up
> > with test cases and fixes.
> >
> >> To set cost I overrode computeSelfCost method where cpu cost and io cost
> >> was multiplied by length of row. But it didn't help.
> >
> > Your next battle will probably be "costing of computation pushdown":
> > in other words, costing computations in Java vs. costing the same
> > operations performed in your storage.
> > I've noticed that Calcite prefers to "just full scan the data sources"
> > and join/aggregate in Java; however, your mileage may vary.
> >
> > Vladimir
>
>
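
The two ideas discussed in this thread (scaling cpu and io cost by row
length, and comparing a full scan plus work in Java against pushing the
work into storage) can be illustrated without Calcite at all. The sketch
below is only that: CostSketch, Cost, scanCost, javaFilterCost, the io
weight of 4, and all row counts and widths are hypothetical names and
numbers chosen for illustration, not Calcite API. In Calcite the
equivalent logic would sit in computeSelfCost or in a metadata provider.

```java
// Hypothetical, library-free sketch; none of these names are Calcite API.
final class CostSketch {

    /** A cost triple in the spirit of rows / cpu / io. */
    static final class Cost {
        final double rows;
        final double cpu;
        final double io;

        Cost(double rows, double cpu, double io) {
            this.rows = rows;
            this.cpu = cpu;
            this.io = io;
        }

        /** Adds two costs component by component. */
        Cost plus(Cost other) {
            return new Cost(rows + other.rows, cpu + other.cpu, io + other.io);
        }

        /** Collapses the triple to one number; the io weight is an assumption. */
        double total() {
            return rows + cpu + 4.0 * io;
        }
    }

    /** Scan cost where cpu and io both scale with the row width in bytes. */
    static Cost scanCost(double rowCount, double rowWidthBytes) {
        double bytes = rowCount * rowWidthBytes;
        return new Cost(rowCount, bytes, bytes);
    }

    /** Cost of filtering in Java: one cpu unit per input row, no io. */
    static Cost javaFilterCost(double inputRows) {
        return new Cost(inputRows, inputRows, 0.0);
    }

    public static void main(String[] args) {
        // Plan A: scan 1,000,000 wide rows, then filter them in Java.
        Cost fullScan = scanCost(1_000_000, 200).plus(javaFilterCost(1_000_000));
        // Plan B: storage filters and projects, returning 10,000 narrow rows.
        Cost pushedDown = scanCost(10_000, 16);

        System.out.println("full scan + Java filter: " + fullScan.total());
        System.out.println("pushed down:             " + pushedDown.total());
    }
}
```

With numbers like these the pushed-down plan comes out cheaper, but only
because every component feeds into total(). If a planner compared plans
on a subset of the components, scaling the others would not change which
plan wins.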
