> - But its "set" field points to a RelSet with "rels" that _don't_ have
> _any_ collation traits.
That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
logically and physically equivalent (same results, same physical
properties), whereas a “set” (RelSet) is a collection of RelNodes that are
logically equivalent. A set can therefore be considered to be a collection
of subsets, each of which contains RelNodes. And it used to be implemented
that way, but in https://issues.apache.org/jira/browse/CALCITE-88 we
introduced collation as a trait, and that made subsets non-disjoint (a
RelNode can be sorted on (x, y), and also on (x), and also on (), and also
on (z)), so we made RelSubset just a view onto a RelSet, filtering the list
of RelNodes according to the ones that have (“subsume”) the desired traits.

Julian

> On Dec 14, 2016, at 10:45 AM, Gian Merlino <[email protected]> wrote:
>
> I spent some more time looking into (3) and found that when I had things
> going through the Planner rather than the JDBC driver, SortRemoveRule was
> removing sorts when it shouldn't have been. This happens even for simple
> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1 DESC".
> Removing SortRemoveRule from the planner fixed the broken tests on my end.
>
> I dug into that a bit and saw that the call to "convert(sort.getInput(),
> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
> funny in the debugger:
>
> - The RelSubset's "traitSet" _does_ have the proper collation trait.
> - But its "set" field points to a RelSet with "rels" that _don't_ have
> _any_ collation traits.
>
> From what I understand that causes Calcite to treat the unsorted and
> sorted rels as equivalent when they in fact aren't. I'm still not sure if
> this is a Calcite bug or user error on my part… I'll keep looking into it
> unless someone has any bright ideas.
>
> fwiw, my Planner construction looks like this:
>
>   final FrameworkConfig frameworkConfig = Frameworks
>       .newConfigBuilder()
>       .parserConfig(
>           SqlParser.configBuilder()
>                    .setCaseSensitive(true)
>                    .setUnquotedCasing(Casing.UNCHANGED)
>                    .build()
>       )
>       .defaultSchema(rootSchema)
>       .traitDefs(ConventionTraitDef.INSTANCE, RelCollationTraitDef.INSTANCE)
>       .programs(Programs.ofRules(myRules))
>       .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>       .context(Contexts.EMPTY_CONTEXT)
>       .build();
>
>   return Frameworks.getPlanner(frameworkConfig);
>
> Gian
>
> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino <[email protected]> wrote:
>
>> Sure, I added those first two to the ticket.
>>
>> I don't think those are happening with (3) but I'll double check next
>> time I take a look at using the Planner.
>>
>> Gian
>>
>> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde <[email protected]> wrote:
>>
>>> Can you please add (1) and (2) to
>>> https://issues.apache.org/jira/browse/CALCITE-1525, which deals with the
>>> whole issue of using “Planner” within the JDBC driver, so we can be
>>> consistent.
>>>
>>> (3) doesn’t look likely to be related. Do your queries have UNION or
>>> other set-ops? Are you sorting on columns that do not appear in the
>>> final result?
>>>
>>> Julian
>>>
>>>
>>>> On Nov 28, 2016, at 10:45 AM, Gian Merlino <[email protected]> wrote:
>>>>
>>>> I traveled a bit down the Frameworks/Planner road and got most of my
>>>> tests passing, but ran into some problems getting them all to work:
>>>>
>>>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>>>> Planner.validate. It looks like CalcitePrepareImpl has some special
>>>> code to handle validation of EXPLAIN, but PlannerImpl doesn't. I'm not
>>>> sure if this is something I should be doing on my end, or if it's a
>>>> bug in PlannerImpl.
>>>> (2) I don't see a way to do ?-style prepared statements with bound
>>>> variables, which _is_ possible with the JDBC driver route.
>>>> (3) Not sure why this is happening, but for some reason ORDER BY /
>>>> LIMIT clauses are getting ignored sometimes, even when they work with
>>>> the JDBC driver route. This may be something messed up with my rules
>>>> though and may not be Calcite's fault.
>>>>
>>>> Julian, do any of these look like bugs that should be raised in jira,
>>>> or are they just stuff I should be dealing with on my side?
>>>>
>>>> Btw, I do like that the Frameworks/Planner route gives me back the
>>>> RelNode itself, since that means I can make the Druid queries directly
>>>> without needing to go through the extra layers of the JDBC driver.
>>>> That part is nice.
>>>>
>>>> Gian
>>>>
>>>> On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde <[email protected]> wrote:
>>>>
>>>>> I don’t know how it’s used outside Calcite. Maybe some others can
>>>>> chime in.
>>>>>
>>>>> Thanks for the PR. I logged
>>>>> https://issues.apache.org/jira/browse/CALCITE-1509 for it, and will
>>>>> commit shortly.
>>>>>
>>>>> Julian
>>>>>
>>>>>> On Nov 23, 2016, at 12:32 PM, Gian Merlino <[email protected]> wrote:
>>>>>>
>>>>>> Do you know examples of projects that use Planner or PlannerImpl
>>>>>> currently (from "outside")? As far as I can tell, within Calcite
>>>>>> itself it's only used in test code. Maybe that'd be a better entry
>>>>>> point.
>>>>>>
>>>>>> In the meantime I raised a PR here for allowing a convertlet table
>>>>>> override in a CalcitePrepareImpl:
>>>>>> https://github.com/apache/calcite/pull/330. That was enough to get
>>>>>> the JDBC driver on my end to behave how I want it to.
>>>>>>
>>>>>> Gian
>>>>>>
>>>>>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde <[email protected]> wrote:
>>>>>>
>>>>>>> I was wrong earlier… FrameworkConfig already has a
>>>>>>> getConvertletTable method. But regarding using FrameworkConfig from
>>>>>>> within the JDBC driver, it’s complicated. FrameworkConfig only works
>>>>>>> if you are “outside” Calcite, whereas CalcitePrepare is for when you
>>>>>>> are customizing from the inside, and sadly CalcitePrepare does not
>>>>>>> use a FrameworkConfig.
>>>>>>>
>>>>>>> Compare and contrast:
>>>>>>> * CalcitePrepareImpl.getSqlToRelConverter [
>>>>>>> https://github.com/apache/calcite/blob/3f92157d5742dd10f3b828d22d7a753e0a2899cc/core/src/main/java/org/apache/calcite/prepare/CalcitePrepareImpl.java#L1114 ]
>>>>>>> * PlannerImpl.rel [
>>>>>>> https://github.com/apache/calcite/blob/105bba1f83cd9631e8e1211d262e4886a4a863b7/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java#L225 ]
>>>>>>>
>>>>>>> The latter uses a convertletTable sourced from a FrameworkConfig.
>>>>>>>
>>>>>>> The ideal thing would be to get CalcitePrepareImpl to use a
>>>>>>> PlannerImpl to do its dirty work. Then “inside” and “outside” would
>>>>>>> work the same. Would definitely appreciate that as a patch.
>>>>>>>
>>>>>>> If you choose to go the JDBC driver route, you could override
>>>>>>> Driver.createPrepareFactory to produce a sub-class of CalcitePrepare
>>>>>>> that works for your environment, one with an explicit
>>>>>>> convertletTable rather than just using the default.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 17, 2016, at 5:01 PM, Gian Merlino <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hey Julian,
>>>>>>>>
>>>>>>>> If the convertlets were customizable with a FrameworkConfig, how
>>>>>>>> would I use that to configure the JDBC driver (given that I'm doing
>>>>>>>> it with the code upthread)? Or would that suggest using a different
>>>>>>>> approach to embedding Calcite?
>>>>>>>>
>>>>>>>> Gian
>>>>>>>>
>>>>>>>> On Thu, Nov 17, 2016 at 4:02 PM, Julian Hyde <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Convertlets have a similar effect to planner rules (albeit they
>>>>>>>>> act on scalar expressions, not relational expressions), so people
>>>>>>>>> should be able to change the set of active convertlets.
>>>>>>>>>
>>>>>>>>> Would you like to propose a change that makes the convertlet table
>>>>>>>>> pluggable? Maybe as part of FrameworkConfig? Regardless, please
>>>>>>>>> log a JIRA to track this.
>>>>>>>>>
>>>>>>>>> And by the way, RexImpTable, which defines how operators are
>>>>>>>>> implemented by generating Java code, should also be pluggable.
>>>>>>>>> It’s been on my mind for a long time to allow the “engine” —
>>>>>>>>> related to the data format, and how code is generated to access
>>>>>>>>> fields and evaluate expressions and operators — to be pluggable.
>>>>>>>>>
>>>>>>>>> Regarding whether the JDBC driver is the right way to embed
>>>>>>>>> Calcite: there’s no easy answer. You might want to embed Calcite
>>>>>>>>> as a library in your own server (as Drill and Hive do). Or you
>>>>>>>>> might want to make yourself just an adapter that runs inside a
>>>>>>>>> Calcite JDBC server (as the CSV adapter does). Or something in the
>>>>>>>>> middle, like what Phoenix does: using Calcite for JDBC, SQL, and
>>>>>>>>> planning, but with your own metadata and runtime engine.
>>>>>>>>>
>>>>>>>>> As long as you build the valuable stuff into planner rules and new
>>>>>>>>> relational operators (if necessary), and use the schema SPI, you
>>>>>>>>> should be able to change packaging in the future.
>>>>>>>>>
>>>>>>>>> Julian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Nov 17, 2016, at 1:59 PM, Gian Merlino <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hey Calcites,
>>>>>>>>>>
>>>>>>>>>> I'm working on embedding Calcite into Druid (http://druid.io/,
>>>>>>>>>> https://github.com/druid-io/druid/pull/3682) and am running into
>>>>>>>>>> a problem that is making me wonder if the approach I'm using
>>>>>>>>>> makes sense.
>>>>>>>>>>
>>>>>>>>>> Consider the expression EXTRACT(YEAR FROM __time). Calcite has a
>>>>>>>>>> standard convertlet rule "convertExtract" that changes this into
>>>>>>>>>> some arithmetic on __time cast to an int type. But Druid has some
>>>>>>>>>> builtin functions to do this, and I'd rather use those than
>>>>>>>>>> arithmetic (for a bunch of reasons). Ideally, in my RelOptRules
>>>>>>>>>> that convert Calcite rels to Druid queries, I'd see the EXTRACT
>>>>>>>>>> as a normal RexCall with the time flag and an expression to apply
>>>>>>>>>> it to. That's a lot easier to translate than the arithmetic
>>>>>>>>>> stuff, which I'd have to pattern match and undo first before
>>>>>>>>>> translating.
>>>>>>>>>>
>>>>>>>>>> So the problem I have is that I want to disable convertExtract,
>>>>>>>>>> but I don't see a way to do that or to swap out the convertlet
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> The code I'm using to set up a connection is:
>>>>>>>>>>
>>>>>>>>>>   public CalciteConnection createCalciteConnection(
>>>>>>>>>>       final DruidSchema druidSchema
>>>>>>>>>>   ) throws SQLException
>>>>>>>>>>   {
>>>>>>>>>>     final Properties props = new Properties();
>>>>>>>>>>     props.setProperty("caseSensitive", "true");
>>>>>>>>>>     props.setProperty("unquotedCasing", "UNCHANGED");
>>>>>>>>>>     final Connection connection =
>>>>>>>>>>         DriverManager.getConnection("jdbc:calcite:", props);
>>>>>>>>>>     final CalciteConnection calciteConnection =
>>>>>>>>>>         connection.unwrap(CalciteConnection.class);
>>>>>>>>>>     calciteConnection.getRootSchema().setCacheEnabled(false);
>>>>>>>>>>     calciteConnection.getRootSchema().add(DRUID_SCHEMA_NAME, druidSchema);
>>>>>>>>>>     return calciteConnection;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> This CalciteConnection is then used by the Druid HTTP server to
>>>>>>>>>> offer a SQL API.
>>>>>>>>>>
>>>>>>>>>> Is there some way to swap out the convertlet table that I'm
>>>>>>>>>> missing?
>>>>>>>>>>
>>>>>>>>>> Also, just in general, am I going about this the right way? Is
>>>>>>>>>> using the JDBC driver the right way to embed Calcite? Or should I
>>>>>>>>>> be calling into it at some lower level?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Gian
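
To make the RelSet/RelSubset point at the top of the thread concrete: a
RelSubset is a filtered view over its set's rels, keeping the ones whose
traits satisfy ("subsume") the subset's requirement. The toy Java sketch
below models that with plain lists. These are not Calcite's real classes,
and the prefix rule for collation satisfaction is a simplification:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the RelSet/RelSubset relationship; NOT Calcite's real classes. */
public class SubsetDemo {

  /** A "rel", represented here only by a name and its collation (sorted column indexes). */
  public static final class Rel {
    public final String name;
    public final List<Integer> collation;

    public Rel(String name, List<Integer> collation) {
      this.name = name;
      this.collation = collation;
    }
  }

  /** Simplified trait check: a collation satisfies any requirement that is a prefix of it. */
  public static boolean satisfies(List<Integer> actual, List<Integer> required) {
    return actual.size() >= required.size()
        && actual.subList(0, required.size()).equals(required);
  }

  /** A "subset" is just a view: the set's rels filtered by the required traits. */
  public static List<String> subset(List<Rel> set, List<Integer> requiredCollation) {
    List<String> names = new ArrayList<>();
    for (Rel rel : set) {
      if (satisfies(rel.collation, requiredCollation)) {
        names.add(rel.name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // One set: all three rels are logically equivalent.
    List<Rel> set = Arrays.asList(
        new Rel("sortedOnXY", Arrays.asList(0, 1)),
        new Rel("sortedOnZ", Arrays.asList(2)),
        new Rel("unsorted", Arrays.<Integer>asList()));

    // A rel sorted on (x, y) is also sorted on (x) and on (), so it shows up
    // in several subsets at once -- the subsets are non-disjoint.
    System.out.println(subset(set, Arrays.asList(0, 1)));      // [sortedOnXY]
    System.out.println(subset(set, Arrays.asList(0)));         // [sortedOnXY]
    System.out.println(subset(set, Arrays.<Integer>asList())); // [sortedOnXY, sortedOnZ, unsorted]
  }
}
```

The bug described upthread corresponds to a subset whose required traits
claim a collation while none of the set's rels would pass this filter.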
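
On the EXTRACT discussion: the kind of date arithmetic a convertlet
expansion produces looks roughly like the sketch below (this is an
illustration using Howard Hinnant's civil-from-days algorithm, not
Calcite's literal convertExtract output). It shows why a rule that must
recognize the expanded tree of divisions and undo it is much harder to
write than one that matches a single EXTRACT call:

```java
import java.time.Instant;
import java.time.ZoneOffset;

public class ExtractYearDemo {

  /**
   * Year from epoch millis via pure integer arithmetic. This is the flavor
   * of what an expansion produces: many nested divisions instead of one
   * EXTRACT call.
   */
  public static int yearFromEpochMillis(long millis) {
    long days = Math.floorDiv(millis, 86_400_000L);
    long z = days + 719_468L;               // shift epoch to 0000-03-01
    long era = Math.floorDiv(z, 146_097L);  // 400-year eras
    long doe = z - era * 146_097L;          // day of era, [0, 146096]
    long yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    long doy = doe - (365 * yoe + yoe / 4 - yoe / 100);  // March-based day of year
    long mp = (5 * doy + 2) / 153;          // March-based month, [0, 11]
    return (int) (yoe + era * 400 + (mp >= 10 ? 1 : 0));
  }

  public static void main(String[] args) {
    // Cross-check the arithmetic against the calendar API.
    for (long millis : new long[] {0L, -1L, 1_480_000_000_000L}) {
      int viaArithmetic = yearFromEpochMillis(millis);
      int viaCalendar = Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).getYear();
      System.out.println(millis + "ms -> " + viaArithmetic
          + " (java.time agrees: " + (viaArithmetic == viaCalendar) + ")");
    }
  }
}
```

A Druid-side rule that instead sees EXTRACT as a plain RexCall can map it
to a builtin time-extraction function in one step, which is the point Gian
makes above.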
