[ 
https://issues.apache.org/jira/browse/CALCITE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062581#comment-18062581
 ] 

Julian Hyde commented on CALCITE-7420:
--------------------------------------

All 4 of these points represent decisions that I had to take; I have sympathy 
for [~zabetak]'s position, but came down on the other side.

Regarding 1, test names. They already exist, and they convey a lot of 
information and structure. It is a small ask for the person creating a new test 
case to generate a unique, descriptive name. 

Test names are a very useful way to tie the new organization of tests back to 
the old. (Jira case numbers are optional and not unique.) That relationship has 
already proven useful, because Claude has not always faithfully copied 
before/after plans.

Regarding 2. I think that query execution results make tests more concrete to a 
human (see [literate 
programming|https://en.wikipedia.org/wiki/Literate_programming]). Yes, this not 
a SQL execution test, and yes, the rule being tested is not necessarily even 
used in producing the plan. Including output encourages authors to choose a 
lucid, compact example to illustrate the rule... essentially creating a 
tutorial.

The {{!ok}} command has already discovered a couple of bugs (e.g. CALCITE-7427 
from 
{{{}testSortRemoveConstantKeyWhenOrderByIsNull{}}}). And it is highlighting 
differences between the mock schema and the real scott schema, which, in the 
interests of simplification, we should eliminate.

Regarding 3. I mostly agree, except that some tests have more than one rule, 
and some rules have very few tests (and are therefore tested in groups, see 
e.g. {{{}reduce-expressions.iq{}}}).

Regarding 4. As the mantra goes, "As simple as possible, but no simpler". 
Remember that this test was an API-based test, not a SQL test. API-based 
testing lets you twiddle many, many knobs. If we had not converted many of 
those knobs to flags for the {{!sub-plan}} command – twelve at last count – we 
would not have been able to migrate many of the Java test methods to SQL. After 
the PR lands, you're welcome to try to eliminate the flags, one at a time.

Lastly, please remember that this is *not* a SQL-based end-to-end test. That is 
not what we need, and it is not what it should become. This is, and should 
remain, a *unit test* for planner rules. Moving is to SQL/Quidem has some 
significant benefits, but it is remarkable that it can be converted to SQL at 
all.

> Convert RelOptRulesTest to Quidem scripts
> -----------------------------------------
>
>                 Key: CALCITE-7420
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7420
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>            Priority: Major
>              Labels: pull-request-available
>
> Many test cases in RelOptRulesTest are simple - involve just a SQL query, set 
> of rules, before and after plan. If these were converted to Quidem scripts 
> they would be easier to write (because you are writing just one file), more 
> maintainable (because Quidem scripts merge more easily than XML), and could 
> potentially be run from another language (which would enable ports of Calcite 
> to language such as Rust or Python).
> Here is how one test would look:
> {noformat}
> # Pull aggregate through union ====================
> # Tests that AggregateUnionAggregateRule can convert
> # a union of two ’select distinct’ queries to a ’select distinct’
> # of a union.
> !set rules "AGGREGATE_UNION_AGGREGATE"
> select deptno, job
> from (select deptno, job
>  from emp as e1
>  group by deptno, job
>  union all
>  select deptno, job
>  from emp as e2
>   group by deptno, job)
> group by deptno, job;
> +--------+-----------+
> | DEPTNO |    JOB    |
> +--------+-----------+
> | 20     | CLERK     |
> | 30     | SALESMAN  |
> | 20     | MANAGER   |
> | 30     | MANAGER   |
> | 10     | MANAGER   |
> | 20     | ANALYST   |
> | 10     | PRESIDENT |
> | 30     | CLERK     |
> | 10     | CLERK     |
> +--------+-----------+
> (9 rows)
> !ok
> LogicalAggregate(group=[{0, 1}])
>  LogicalUnion(all=[true])
>    LogicalAggregate(group=[{0, 1}])
>      LogicalProject(DEPTNO=[$7], JOB=[$2])
>        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>    LogicalAggregate(group=[{0, 1}])
>      LogicalProject(DEPTNO=[$7], JOB=[$2])
>        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> !plan-before
> LogicalAggregate(group=[{0, 1}])
>  LogicalUnion(all=[true])
>    LogicalProject(DEPTNO=[$7], JOB=[$2])
>      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>    LogicalProject(DEPTNO=[$7], JOB=[$2])
>      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> !plan-after
> {noformat}
>  
> It has the same information as the Java method
> {code:java}
>   @Test void testPullAggregateThroughUnion() {
>     final String sql = "select deptno, job from"
>         + " (select deptno, job from emp as e1"
>         + " group by deptno,job"
>         + "  union all"
>         + " select deptno, job from emp as e2"
>         + " group by deptno,job)"
>         + " group by deptno,job";
>     sql(sql)
>         .withRule(CoreRules.AGGREGATE_UNION_AGGREGATE)
>         .check();
>   }
> {code}
> and XML resources
> {code:xml}
>   <TestCase name="testPullAggregateThroughUnion">
>     <Resource name="sql">
>       <![CDATA[select deptno, job from (select deptno, job from emp as e1 
> group by deptno,job  union all select deptno, job from emp as e2 group by 
> deptno,job) group by deptno,job]]>
>     </Resource>
>     <Resource name="planBefore">
>       <![CDATA[
> LogicalAggregate(group=[{0, 1}])
>   LogicalUnion(all=[true])
>     LogicalAggregate(group=[{0, 1}])
>       LogicalProject(DEPTNO=[$7], JOB=[$2])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalAggregate(group=[{0, 1}])
>       LogicalProject(DEPTNO=[$7], JOB=[$2])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> ]]>
>     </Resource>
> {code}
> It uses {{!set rules}} command (added in CALCITE-6335) and would require new 
> {{!plan-before}} and {{!plan-after}} commands.
> I suspect that Claude would do a quite good job migrating tests from .java 
> and .xml to .iq. I think it could also group tests by subject area (e.g. join 
> and aggregate) so that .iq files are ~1,000 lines.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to