What you call “statistics” Calcite calls “metadata”. Calcite has a 
comprehensive system for adding a new kind of metadata (such as histograms) or 
a new provider for metadata (that would, say, compute a value of the 
Selectivity metadata for YourFilter and YourJoin).

The Table.getStatistic() method is a simple way to inject basic metadata 
(a row count, for example), but it does not (and is not intended to) scale to 
richer metadata.

Take a look at BuiltInMetadata, RelMetadataQuery, and one of the built-in 
providers, say RelMdSelectivity.

Note that it is OK to define your own metadata types outside of 
BuiltInMetadata. RelMetadataTest.ColType illustrates that this is possible.

Other groups (Hive, Drill) are probably interested in a “Histogram” metadata 
type, and it would be great if we could all use the same definition of 
Histogram, but I suspect it would take several months for that discussion to 
converge on anything concrete. If you’re in a hurry, better to forge ahead and 
share what you come up with.
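
If it helps as a starting point, here is a minimal sketch of an equi-width 
histogram that estimates the selectivity of a range predicate. It is plain Java 
and does not touch any Calcite API: the class name EquiWidthHistogram and its 
methods are hypothetical, and it assumes values are uniformly distributed 
within each bucket. Hooking something like this into the planner would still 
require your own metadata type or provider, as described above.

```java
/** Hypothetical equi-width histogram over a numeric column; not a Calcite class. */
final class EquiWidthHistogram {
  private final double min;
  private final double width;
  private final long[] counts;
  private long total;

  EquiWidthHistogram(double min, double max, int buckets) {
    if (!(max > min) || buckets <= 0) {
      throw new IllegalArgumentException("need max > min and buckets > 0");
    }
    this.min = min;
    this.width = (max - min) / buckets;
    this.counts = new long[buckets];
  }

  /** Records one column value; values outside [min, max] are rejected. */
  void add(double value) {
    double max = min + width * counts.length;
    if (value < min || value > max) {
      throw new IllegalArgumentException("value out of range: " + value);
    }
    // A value equal to max falls into the last bucket.
    int i = Math.min((int) ((value - min) / width), counts.length - 1);
    counts[i]++;
    total++;
  }

  /** Estimated fraction of rows with value < x, in [0, 1]. */
  double selectivityLessThan(double x) {
    if (total == 0 || x <= min) {
      return 0.0;
    }
    double pos = (x - min) / width;
    int full = (int) pos;              // buckets lying entirely below x
    if (full >= counts.length) {
      return 1.0;
    }
    long below = 0;
    for (int i = 0; i < full; i++) {
      below += counts[i];
    }
    // Assumption: rows are spread uniformly inside the partial bucket.
    double partial = (pos - full) * counts[full];
    return (below + partial) / total;
  }

  /** Estimated fraction of rows with lo <= value < hi. */
  double selectivityBetween(double lo, double hi) {
    return Math.max(0.0, selectivityLessThan(hi) - selectivityLessThan(lo));
  }
}
```

For example, after adding the integers 0 through 99 to a histogram built with 
new EquiWidthHistogram(0, 100, 4), selectivityLessThan(50) returns 0.5 and 
selectivityLessThan(12.5) returns 0.125. Equi-width buckets are the simplest 
choice; equi-depth buckets usually cope better with skewed data.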

Julian



> On Feb 24, 2016, at 6:02 AM, Victor Giannakouris - Salalidis 
> <[email protected]> wrote:
> 
> Hello,
> 
> I am using HepPlanner with custom table classes for the catalog (extending
> *AbstractTable*). In my implementation I override the getStatistic() method
> to return a Statistic whose getRowCount() method I also override.
> 
> I added some rules to the planner in order to optimize join ordering. At
> this step, it moves, for example, the smaller tables (such as those to which
> a filter is applied) to the left (*build side*).
> 
> My actual question is how (and where) I can add my own statistics
> (concretely, *histograms* for selectivity estimation) in order to produce
> estimates for filters or intermediate join results.
> -- 
> Victor Giannakouris - Salalidis
> 
> LinkedIn:
> http://gr.linkedin.com/pub/victor-giannakouris-salalidis/69/585/b23/
> Personal Page: http://gsvic.github.io
