As soon as you have something to share — even if it’s not ready to commit — I’d be happy to review.

> On Feb 28, 2016, at 11:05 PM, Victor Giannakouris - Salalidis <[email protected]> wrote:
>
> Hi Julian,
>
> Thank you for the quick response. These days I am working on the classes
> you mention at your reply. If I come up with some different implementation
> I will share it for sure.
>
> Best,
>
> Victor
>
> On Wed, Feb 24, 2016 at 11:24 PM, Julian Hyde <[email protected]> wrote:
>
>> What you call “statistics” Calcite calls “metadata”. Calcite has a
>> comprehensive system for adding a new kind of metadata (such as histograms)
>> or a new provider for metadata (that would, say, compute a value of the
>> Selectivity metadata for YourFilter and YourJoin).
>>
>> The Table.getStatistic() method is a very simple way to inject some very
>> simple metadata, but it does not (and is not intended to) scale to richer
>> metadata.
>>
>> Take a look at BuiltInMetadata, RelMetadataQuery, and one of the built-in
>> providers, say RelMdSelectivity.
>>
>> Note that it is OK to define your own metadata types outside of
>> BuiltInMetadata. RelMetadataTest.ColType illustrates that this is possible.
>>
>> Other groups (Hive, Drill) are probably interested in a “Histogram”
>> metadata type, and it would be great if we could all use the same
>> definition of Histogram, but I suspect it would take several months for
>> that discussion to converge on anything concrete. If you’re in a hurry,
>> better to forge ahead and share what you come up with.
>>
>> Julian
>>
>>> On Feb 24, 2016, at 6:02 AM, Victor Giannakouris - Salalidis <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I am using HepPlanner with custom table classes for the catalog (extending
>>> *AbstractTable*). In my implementation I override the getStatistic() method
>>> in which I return a Statistic definition in which I override the
>>> getRowCount() method.
>>>
>>> I added some rules to the planner in order to optimize join ordering. At
>>> this step, it moves for example the smaller tables (such as those in which
>>> a filter is applied) at the left (*build side*).
>>>
>>> My actual question is how (where) can I add my own statistics (concretely,
>>> *histograms* for selectivity estimation) in order to perform estimates for
>>> filters or join intermediate results.
>>> --
>>> Victor Giannakouris - Salalidis
>>>
>>> LinkedIn:
>>> http://gr.linkedin.com/pub/victor-giannakouris-salalidis/69/585/b23/
>>> Personal Page: http://gsvic.github.io
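
For anyone following the thread, here is a rough sketch of the kind of histogram-based selectivity estimate asked about above. It is plain Java and deliberately does not touch any Calcite API: the class name `EquiWidthHistogram` and the method `selectivityLessThan` are hypothetical, chosen only for illustration, and the equi-width bucketing with a uniform-within-bucket assumption is just one common choice, not something the thread settles on. A custom metadata provider could presumably consult a structure like this when estimating the selectivity of a filter such as `col < v`.

```java
/**
 * Hypothetical equi-width histogram over a numeric column.
 * Not part of Calcite; a sketch for illustration only.
 */
public final class EquiWidthHistogram {
    private final double min;     // smallest observed value
    private final double width;   // width of each bucket (0 if all values are equal)
    private final long[] counts;  // number of values per bucket
    private final long total;     // total number of values

    public EquiWidthHistogram(double[] values, int buckets) {
        if (values.length == 0 || buckets <= 0) {
            throw new IllegalArgumentException("need at least one value and one bucket");
        }
        double lo = values[0];
        double hi = values[0];
        for (double v : values) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        this.min = lo;
        this.width = (hi - lo) / buckets;
        this.counts = new long[buckets];
        for (double v : values) {
            counts[bucketOf(v)]++;
        }
        this.total = values.length;
    }

    /** Maps a value to its bucket; the maximum value is clamped into the last bucket. */
    private int bucketOf(double v) {
        if (width == 0.0) {
            return 0;
        }
        int b = (int) ((v - min) / width);
        return Math.min(b, counts.length - 1);
    }

    /**
     * Estimated fraction of rows with value < v, assuming values are
     * spread uniformly inside each bucket.
     */
    public double selectivityLessThan(double v) {
        if (v <= min) {
            return 0.0;
        }
        if (width == 0.0) {
            return 1.0;
        }
        double pos = (v - min) / width;  // position measured in buckets
        if (pos >= counts.length) {
            return 1.0;
        }
        int full = (int) pos;            // buckets lying entirely below v
        long below = 0;
        for (int i = 0; i < full; i++) {
            below += counts[i];
        }
        double partial = (pos - full) * counts[full];  // share of the bucket containing v
        return (below + partial) / total;
    }
}
```

Multiplying the returned fraction by a table's row count gives an estimated output cardinality for the filter, which is the number a join-ordering rule would want to compare.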
