Hi Alireza,

I suppose you mainly need row count estimates for base tables. One way to provide this information is to implement the Table interface [1] and supply an appropriate implementation of the getStatistic() method. Another way is to extend the TableScan [2] operator and override its computeSelfCost() method. If you really need to pass your own metadata provider, you may find some useful examples in RelMetadataTest [3].
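
As a rough illustration of the first option, a minimal sketch could look like the one below. The class name CountedTable, the single ID column, and the fixed row count are hypothetical and only there for illustration; the sketch assumes calcite-core (and Guava, which Calcite already depends on) is on the classpath, and it builds on AbstractTable rather than implementing Table directly.

```java
import com.google.common.collect.ImmutableList;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.schema.Statistic;
import org.apache.calcite.schema.Statistics;
import org.apache.calcite.schema.impl.AbstractTable;
import org.apache.calcite.sql.type.SqlTypeName;

/** Hypothetical table that reports a caller-supplied row count. */
final class CountedTable extends AbstractTable {
  private final double rowCount;

  CountedTable(double rowCount) {
    this.rowCount = rowCount;
  }

  /** Hypothetical schema: a single INTEGER column named ID. */
  @Override public RelDataType getRowType(RelDataTypeFactory typeFactory) {
    return typeFactory.builder()
        .add("ID", SqlTypeName.INTEGER)
        .build();
  }

  /** Returns a statistic carrying the row count, with no unique keys declared. */
  @Override public Statistic getStatistic() {
    return Statistics.of(rowCount, ImmutableList.of());
  }
}
```

In a real adapter the row count would presumably come from the underlying source's own metadata rather than from a constructor argument.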

Best,
Stamatis

[1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/schema/Table.java
[2] https://github.com/apache/calcite/blob/2765791e60c46e0d66a3c510a5c91d16fe757720/core/src/main/java/org/apache/calcite/rel/core/TableScan.java#L87
[3] https://github.com/apache/calcite/blob/2765791e60c46e0d66a3c510a5c91d16fe757720/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1006

On Mon, May 20, 2019 at 6:55 PM Alireza Samadian <[email protected]> wrote:

> Hi,
>
> I'm working on Beam SQL and we are using Apache Calcite for our query
> parsing and optimization. We are trying to use row count estimates for
> Volcano Optimizer. Currently, Calcite returns row count estimate of 100 for
> every table.
> It seems the cost estimate comes from bunch of handlers in
> org.apache.calcite.rel.metadata.RelMetadataQuery; however, apparently the
> handlers are automatically generated and I cannot figure out how I can pass
> my own handler for cost estimation. Also, I am not even sure if this is the
> standard way of passing my own cost estimates.
> I found this thread in StackOverflow:
>
> https://stackoverflow.com/questions/54726015/why-does-apache-calcite-estimates-100-rows-for-all-tables-a-query-contains/54739313#54739313
>
> One of the replies suggests that the only way to inject cardinality
> estimates for tables is via an ExternalCatalog. I cannot find any
> information about ExternalCatalog and I am not sure even what it is.
>
> I will appreciate if someone guides me or send me some links or examples.
>
> Best,
> Alireza Samadian
>
