There's is not a *tool* for conversion per-se, but org.apache.calcite.rel.rel2sql.RelToSqlConverter provides an easy interface which takes a RelNode (Calcite's description of a plan) as input and produces SQL as output.
The schema for the test cases is defined programmatically in the init method of core/src/test/java/org/apache/calcite/test/MockCatalogReader.java. The API should be fairly easy to follow. No other rules involving constraints immediately to mind but perhaps one of the veteran Calcite committers will have some suggestions. -- Michael Mior [email protected] 2017-09-21 3:50 GMT-04:00 Shumo Chu <[email protected]>: > Thanks, Michael! > > We would like to try using Cosette to check the Calcite test cases in your > link. Can you send us a pointer to the tool that converts Calcite query > plans to SQL? Also, where is the schema for the test cases? Our plan is to > call the tool to generate SQL from query plans, and then pass that along > with the original SQL to Cosette to check. What do you think? > > In terms of the Cosette language document, you can find it here > <http://cosette.cs.washington.edu/guide>. There is a section explaining > the > SQL features that we currently support. We will go through the rules in the > test cases and see whether there are other features that we need to > implement in Cosette. > > I have another question for you. We are extending Cosette to check rewrite > rules in presence of constraints, such as keys and foreign keys. We found > one such rule in Calcite's design document for materialized views: > https://github.com/apache/calcite/blob/master/site/_docs/ > materialized_views.md. Would you know of other such cases in Calcite? > > On Tue, Sep 19, 2017 at 10:08 AM, Michael Mior <[email protected]> wrote: > > > Alvin, > > > > Sounds like an interesting proposal. While I'm certainly not as familiar > > with the Calcite code base as Julian, I will chime in a little here. One > > thing that may prove helpful is > > core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml in > the > > Calcite source tree. This contains plans before and after different rules > > have been applied. > > > > Although these are in the form of logical plans and not SQL, we have > > existing machinery to convert plans back to SQL text. It seems like one > > approach would be to mine the existing tests to get a pairs of SQL > strings > > for each existing rule being tested. Not sure this would be the best > > approach long term, but it seems like a good way to get started. I'm > happy > > to try to hack together a script that could give this kind of output. > > > > I suspect part of this would include implementing what Alvin termed as a > > Cosette interpretation of SQL (which Calcite refers to as a "dialect"). > If > > this sounds interesting to others, I'll go ahead and see what I can come > up > > with. > > > > What's the best reference for the specific format of SQL accepted by > > Cosette? > > > > -- > > Michael Mior > > [email protected] > > > > 2017-09-19 11:59 GMT-04:00 Alvin Cheung <[email protected]>: > > > > > Hi Julian et al, > > > > > > Thanks for your interest in Cosette. Your suggestions make a lot of > > sense. > > > We have done some initial work and would like to get your feedback on > how > > > to integrate the two tools together. > > > > > > > One obvious idea is to use Cosette to audit Calcite’s query > > > transformation rules. Each rule is supposed to preserve semantics but > > > (until Cosette) we had to trust the author of the rule. We could > convert > > > the before and after relational expressions to SQL, and then ask > Cosette > > > whether those are equivalent. We could enable this check in Calcite’s > > test > > > suite, during which many thousands of rules are fired. > > > > > > Indeed. We have browsed through the Calcite rules and reformulated a > few > > > of them using our Cosette language: > > > > > > 1. Conjunctive select (https://github.com/apache/cal > > > cite/blob/master/core/src/main/java/org/apache/calcite/rel/ > > > rules/FilterMergeRule.java) --> https://demo.cosette.cs. > washington.edu/ > > > (click conjunctive select from the dropdown menu) > > > > > > 2. Join commute (https://github.com/apache/cal > > > cite/blob/master/core/src/main/java/org/apache/calcite/rel/ > > > rules/JoinCommuteRule.java) --> Join commute from the demo website > above > > > > > > 3. Join/Project transpose (https://github.com/apache/cal > > > cite/blob/master/core/src/main/java/org/apache/calcite/rel/ > > > rules/JoinProjectTransposeRule.java) --> Join Proj. Trans. from the > demo > > > website above > > > > > > As we are not very familiar with the Calcite code base, we have tried > our > > > best to guess the intention of each rule based on the documentation, > > please > > > feel free to point out if we made mistakes. > > > > > > As you can see, the Cosette language is pretty much like standard SQL, > > > except for declarations of schemas and relations. You will also notice > > the > > > "??" in some schema declarations (e.g., in rule 1. above) --- they > stand > > > for "symbolic" attributes that can represent any attribute. In other > > words, > > > if Cosette can prove that a rule with symbolic attributes is true, then > > it > > > will be true regardless of what the symbolic attributes are > instantiated > > > with. Symbolic predicates (e.g., in rule 1.) works similarly, hence > > giving > > > Cosette a mechanism to prove (or disprove) classes of rewrite rules. > See > > > our documentation at http://cosette.cs.washington.edu/guide for > details. > > > > > > I believe the challenge here is how we can "reverse engineer" the > > > intention of each of the existing rules so that they can be expressed > in > > > Cosette. Any suggestions on how to do this? We have a few students > > working > > > on Cosette and can help, but we will probably need help from Calcite to > > > fully understand all of the existing rules. Another possibility is to > > print > > > out the input and output of each rule application during testing, and > > send > > > them to Cosette. If the printout is in a form that resembles SQL we can > > > probably patch it into Cosette. > > > > > > For new rules, can we can ask Calcite authors to express them in > Cosette > > > as well, perhaps as part of the documentation? This way we will only > need > > > to handle the existing rules. > > > > > > > A few rules might use other information besides the input relational > > > expression, such as predicates that are known to hold or column > > > combinations that are known to be unique. But let’s see what happens. > > > > > > This is something that we are actively working on. Can you point us to > > > specific rules with such properties? One possibility is the join > > > commutativity rule noted above. You will notice that we didn't prove > the > > > "general form" of the rule with symbolic attributes, but rather one > with > > > concrete schemas. This is because Cosette currently implements the > > unnamed > > > approach to attribute naming (see Section 3.2 in > > > http://webdam.inria.fr/Alice/pdfs/Chapter-3.pdf), hence the general > form > > > of the rule is only true if we know that the two input schemas have > > > distinct attributes. > > > > > > > This is a very loose integration of Cosette / Calcite, but we can > make > > > closer integrations (e.g. within the same JVM, even at runtime) as we > > > discover synergies. After all, optimization and theorem-proving are > > related > > > endeavors. > > > > > > Agreed. Cosette is implemented using Coq and Racket. We realize that > > those > > > are not the most popular languages for implementing systems :) , so > > Cosette > > > comes with a POST API as well: http://cosette.cs.washington. > > edu/guide#api > > > . It takes in the program text written in Cosette, and returns the > answer > > > (or times out). Does this make it easier to run the tool? We are open > to > > > implementing other bindings as well. > > > > > > > Another area that would be useful would be to devise test data. > > > > > > How about this: Each SQL implementation has its own interpretation of > > SQL, > > > with Cosette being one of them. Let's implement different SQL semantics > > > using Cosette (say, Calcite's and Postgres'). Then, given a query, ask > > > Cosette to find a counterexample (i.e., an input relation) where the > two > > > implementations will return different results when executed on a given > > > query. If such a counterexample exists, then Calcite developers can > > > determine whether this is a "bug" or a "feature". Does this sound > similar > > > to what you have in mind? > > > > > > > There might be applications in materialized views. A query Q can use > a > > > materialized view V if V covers Q. In other words if Q == R(V) where R > is > > > some sequence of relational operators. Given Q and V, Cosette could > > perhaps > > > analyze and either return R (success) or return that V does not cover Q > > > (failure). > > > > > > This resembles the problem of deciding whether a given relation > > (expressed > > > as a query) is contained in another one. It will take some work for > > Cosette > > > to be able to handle this but it definitely sounds interesting. Do you > > have > > > an application in mind? One of them might be to determine whether > > > previously cached results can be used. > > > > > > We definitely see lots of synergies between the two tools. To start > with > > > something easy :) , I propose we first discuss how to use the current > > > Cosette implementation to audit existing Calcite rules, and a way to > > > integrate Cosette into development of future Calcite rules as part of > > code > > > review / regression tests. What do you think? > > > > > > Thanks, > > > Alvin (on behalf of the Cosette team) > > > > > > > > > -- > shumochu.com >
