I already wrote some ideas in https://issues.apache.org/jira/browse/CALCITE-1876. Let's discuss there.
On Tue, Jul 11, 2017 at 1:24 PM, Luis Fernando Kauer <[email protected]> wrote: > Hi, > > If I change CsvTranslatableTable so that it implements > ProjectableFilterableTable instead of TranslatableTable and implement the > scan method, Calcite's own rules apply and the plan gets right, scanning only > the used field in the aggregate function. > > However, now I realized that > > "select count(*) from EMPS" generates the plan: > EnumerableAggregate(group=[{}], EXPR$0=[COUNT()]) > CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) > > "select * from EMPS" generates the plan: > CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) > > Notice that the count(*) generates a plan that scans all fields, requiring to > convert them all without the need. > Even when using ProjectableFilterableTable plan scans all fields, but the > plan for "select count(name) from EMPS" scans just one field. > What could be the best approach to handle the count(*) without having to scan > all fields? > > Best regards, > > Luis Fernando > > > > > > Em Quinta-feira, 6 de Julho de 2017 18:05, Julian Hyde <[email protected]> > escreveu: > > > > Calcite should realize that Aggregate has an implied Project (because it only > uses a few columns) and push that projection into the CsvTableScan, but it > doesn’t. > > I think we need a new rule for Aggregate on a TableScan of a > ProjectableFilterableTable. Can you create a JIRA case please? > > I created a test case. It currently fails: > > diff --git a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java > b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java > index 00c59ee..2402872 100644 > --- a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java > +++ b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java > @@ -241,6 +241,13 @@ public Void apply(ResultSet resultSet) { > .ok(); > } > > + @Test public void testAggregateImpliesProject() throws SQLException { > + final String sql = "select max(name) from EMPS"; > + final String plan = "PLAN=EnumerableAggregate(group=[{}], > EXPR$0=[MAX($0)])\n" > + + " CsvTableScan(table=[[SALES, EMPS]], fields=[[1]])\n"; > + sql("smart", "explain plan for " + sql).returns(plan).ok(); > + } > + > @Test public void testFilterableSelect() throws SQLException { > sql("filterable-model", "select name from EMPS").ok(); > } > > > Julian > > >> On Jul 6, 2017, at 1:23 PM, Luis Fernando Kauer >> <[email protected]> wrote: >> >> Hi, >> I'm trying to understand the CSV Adapter and how the rules are fired.The >> CsvProjectTableScanRule gets fired when I use CsvTranslatableTable.But I'm >> not understanding why I'm getting a plan that scans all fields when I use an >> aggregate function.For example:explain plan for select name from >> emps;CsvTableScan(table=[[SALES, EMPS]], fields=[[1]]) >> >> explain plan for select max(name) from emps;EnumerableAggregate(group=[{}], >> EXPR$0=[MAX($1)])CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, >> 5, 6, 7, 8, 9]]) >> I noticed that the rule gets fired and at that point it shows just 1 field >> being used.But the last time CsvTableScan.deriveRowType() gets called it has >> all the fields set, and it's not the instance create by the rule, but the >> first instance created with all the fields. >> Can anybody explain me if this is a bug or if this is supposed to happen >> with aggregate functions ? >> Best regards, >> Luis Fernando Kauer
