Hi,
If I change CsvTranslatableTable so that it implements
ProjectableFilterableTable instead of TranslatableTable and implement the scan
method, Calcite's own rules apply and the plan comes out right, scanning only
the field used in the aggregate function.
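
To make the effect of a projected scan concrete, here is a minimal stand-alone sketch. It is not the CSV adapter's code and does not use Calcite's classes: the class name, the convert helper, and the conversion counter are all hypothetical, and the DataContext and filter arguments that Calcite's real scan method takes are left out. It only illustrates that when the scan receives the list of wanted field indexes, the other fields never have to be converted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-alone model of a projected scan; it does not use Calcite's classes. */
class ProjectedScanSketch {
  /** Counts field conversions so the saving is observable (illustrative only). */
  static int conversions = 0;

  /** Hypothetical converter: a real adapter would parse according to column type. */
  static Object convert(String raw) {
    conversions++;
    return raw.trim();
  }

  /**
   * Returns one Object[] per input line, holding only the requested fields.
   * In this sketch a null projects array is taken to mean "all fields".
   */
  static List<Object[]> scan(List<String> lines, int[] projects) {
    List<Object[]> rows = new ArrayList<>();
    for (String line : lines) {
      String[] raw = line.split(",", -1);
      int[] fields = projects;
      if (fields == null) {
        fields = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
          fields[i] = i;
        }
      }
      // Only the projected fields are converted; the rest stay as raw text.
      Object[] row = new Object[fields.length];
      for (int i = 0; i < fields.length; i++) {
        row[i] = convert(raw[fields[i]]);
      }
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("100,Fred,10", "110,Eric,20");
    List<Object[]> rows = scan(lines, new int[] {1});
    System.out.println(rows.get(0)[0] + " " + rows.get(1)[0]
        + " conversions=" + conversions);
  }
}
```

With fields=[[1]] each row costs one conversion; with all ten fields listed, as in the plans that follow, it would cost ten.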
However, I have now noticed that
"select count(*) from EMPS" generates the plan:
EnumerableAggregate(group=[{}], EXPR$0=[COUNT()])
  CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
"select * from EMPS" generates the plan:
CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
Notice that count(*) generates a plan that scans all fields, so every field
gets converted even though none of them is needed.
With ProjectableFilterableTable the count(*) plan still scans all fields,
whereas the plan for "select count(name) from EMPS" scans just one field.
What would be the best approach to handle count(*) without having to scan
all fields?
Best regards,
Luis Fernando
On Thursday, July 6, 2017 at 18:05, Julian Hyde <[email protected]>
wrote:
Calcite should realize that Aggregate has an implied Project (because it only
uses a few columns) and push that projection into the CsvTableScan, but it
doesn’t.
I think we need a new rule for Aggregate on a TableScan of a
ProjectableFilterableTable. Can you create a JIRA case please?
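
As a rough illustration of the "implied Project" idea, the sketch below works out which input columns an aggregate reads (its group keys plus the arguments of each aggregate call) and where each one lands in the narrowed scan. It is a stand-alone model rather than a Calcite rule: the class and method names are hypothetical, and plain int arrays stand in for Calcite's own representations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

/** Stand-alone model of an aggregate's implied projection; not Calcite code. */
class ImpliedProjectSketch {
  /**
   * Returns the sorted, de-duplicated input columns an aggregate reads:
   * its group keys plus every aggregate-call argument. COUNT(*) has no
   * arguments, so it contributes nothing.
   */
  static int[] usedFields(int[] groupKeys, List<int[]> aggCallArgs) {
    TreeSet<Integer> used = new TreeSet<>();
    for (int k : groupKeys) {
      used.add(k);
    }
    for (int[] args : aggCallArgs) {
      for (int a : args) {
        used.add(a);
      }
    }
    int[] result = new int[used.size()];
    int i = 0;
    for (int f : used) {
      result[i++] = f;
    }
    return result;
  }

  /** Position of an original column index within the narrowed scan. */
  static int remap(int[] usedFields, int originalIndex) {
    return Arrays.binarySearch(usedFields, originalIndex);
  }

  public static void main(String[] args) {
    int[] noGroupKeys = {};
    // select max(name) from EMPS: MAX($1) over the ten-column table.
    int[] forMax = usedFields(noGroupKeys, Collections.singletonList(new int[] {1}));
    System.out.println(Arrays.toString(forMax) + " MAX($" + remap(forMax, 1) + ")");
    // select count(*) from EMPS: COUNT() has no arguments.
    int[] forCount = usedFields(noGroupKeys, Collections.singletonList(new int[0]));
    System.out.println(Arrays.toString(forCount));
  }
}
```

For max(name) the used set is [1] and the argument moves from $1 to $0, the shape the test case in this message expects. For count(*) the set is empty; how a scan should treat an empty projection is a separate question that this sketch leaves open.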
I created a test case. It currently fails:
diff --git a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
index 00c59ee..2402872 100644
--- a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
+++ b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
@@ -241,6 +241,13 @@ public Void apply(ResultSet resultSet) {
         .ok();
   }

+  @Test public void testAggregateImpliesProject() throws SQLException {
+    final String sql = "select max(name) from EMPS";
+    final String plan = "PLAN=EnumerableAggregate(group=[{}], EXPR$0=[MAX($0)])\n"
+        + "  CsvTableScan(table=[[SALES, EMPS]], fields=[[1]])\n";
+    sql("smart", "explain plan for " + sql).returns(plan).ok();
+  }
+
   @Test public void testFilterableSelect() throws SQLException {
     sql("filterable-model", "select name from EMPS").ok();
   }
Julian
> On Jul 6, 2017, at 1:23 PM, Luis Fernando Kauer
> <[email protected]> wrote:
>
> Hi,
> I'm trying to understand the CSV Adapter and how the rules are fired.
> The CsvProjectTableScanRule gets fired when I use CsvTranslatableTable.
> But I don't understand why I'm getting a plan that scans all fields when I
> use an aggregate function. For example:
>
> explain plan for select name from emps;
> CsvTableScan(table=[[SALES, EMPS]], fields=[[1]])
>
> explain plan for select max(name) from emps;
> EnumerableAggregate(group=[{}], EXPR$0=[MAX($1)])
>   CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>
> I noticed that the rule gets fired, and at that point it shows just 1 field
> being used. But the last time CsvTableScan.deriveRowType() gets called it has
> all the fields set, and it's not the instance created by the rule, but the
> first instance, created with all the fields.
> Can anybody explain to me whether this is a bug or whether this is supposed
> to happen with aggregate functions?
> Best regards,
> Luis Fernando Kauer