This was totally user error: I was ignoring the "fields" and "collation" of the RelRoot. Passing "collation" to planner.transform stopped it from removing the sort.
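For anyone hitting the same symptom, the fix described above might look roughly like the following fragment. This is a sketch against the Planner API, not Druid's actual code; `planner`, `sql`, and `myConvention` are assumed to come from your own setup:

```java
// Sketch only: assumes a Frameworks-built Planner and a single program at
// index 0. The key point is carrying root.collation into the trait set
// passed to transform(), so SortRemoveRule cannot silently drop the sort.
SqlNode parsed = planner.parse(sql);
SqlNode validated = planner.validate(parsed);
RelRoot root = planner.rel(validated);

RelTraitSet traits = root.rel.getTraitSet()
    .replace(myConvention)    // your target Convention (assumption)
    .replace(root.collation)  // the RelRoot's collation -- the actual fix
    .simplify();

RelNode physical = planner.transform(0, traits, root.rel);
// Similarly, root.fields must be consulted when projecting the final result.
```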
Gian

On Wed, Dec 14, 2016 at 3:21 PM, Gian Merlino <[email protected]> wrote:

> I don't think so, since my test query (SELECT dim1 FROM s.foo GROUP BY
> dim1 ORDER BY dim1 DESC) is sorting on a column that is also projected.
>
> Gian
>
> On Wed, Dec 14, 2016 at 11:58 AM, Julian Hyde <[email protected]> wrote:
>
>> Are you running into some variant of the problems that inspired
>> https://issues.apache.org/jira/browse/CALCITE-819 (at the root of the
>> tree, columns that are not projected are removed, and if the desired sort
>> order involves non-projected columns, the desired sort order is
>> forgotten)?
>>
>>> On Dec 14, 2016, at 11:19 AM, Gian Merlino <[email protected]> wrote:
>>>
>>> Ah, thanks. So if that sort of thing is not a smoking gun, do you have
>>> an idea about where I should look next? If not I'll keep poking around.
>>>
>>> Gian
>>>
>>> On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde <[email protected]> wrote:
>>>
>>>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>>>> _any_ collation traits.
>>>>
>>>> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
>>>> logically and physically equivalent (same results, same physical
>>>> properties), whereas a “set” (RelSet) is a collection of RelNodes that
>>>> are logically equivalent.
>>>>
>>>> A set can therefore be considered to be a collection of subsets, each
>>>> of which contains RelNodes. And it used to be implemented that way, but
>>>> in https://issues.apache.org/jira/browse/CALCITE-88 we introduced
>>>> collation as a trait, and that made subsets non-disjoint (a RelNode can
>>>> be sorted on (x, y), and also on (x), and also on (), and also on (z)),
>>>> so we made RelSubset just a view onto a RelSet, filtering the list of
>>>> RelNodes according to the ones that have (“subsume”) the desired traits.
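Julian's description of a RelSubset as a filtered view over a RelSet can be illustrated with a toy model. These are not Calcite's real classes, just a minimal sketch of the subsumption idea:

```java
import java.util.*;

// Toy model (NOT Calcite's actual classes): a "set" holds all logically
// equivalent nodes; a "subset" is merely a view that keeps the nodes whose
// traits subsume (include) the wanted traits. A node sorted on (x, y) also
// appears in the subsets for (x) and for the empty trait set.
class ToyRelNode {
  final String name;
  final Set<String> traits;  // e.g. collations this node is sorted on

  ToyRelNode(String name, String... traits) {
    this.name = name;
    this.traits = new HashSet<>(Arrays.asList(traits));
  }
}

class ToyRelSet {
  final List<ToyRelNode> rels = new ArrayList<>();

  // The "subset" view: nodes whose traits include all of the wanted ones.
  List<ToyRelNode> subset(Set<String> wanted) {
    List<ToyRelNode> out = new ArrayList<>();
    for (ToyRelNode r : rels) {
      if (r.traits.containsAll(wanted)) {
        out.add(r);
      }
    }
    return out;
  }
}
```

In this model, the bug symptom described upthread corresponds to a subset view whose wanted traits are satisfied by no node in the set, yet being treated as equivalent to the unsorted nodes.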
>>>>
>>>> Julian
>>>>
>>>>> On Dec 14, 2016, at 10:45 AM, Gian Merlino <[email protected]> wrote:
>>>>>
>>>>> I spent some more time looking into (3) and found that when I had
>>>>> things going through the Planner rather than the JDBC driver,
>>>>> SortRemoveRule was removing sorts when it shouldn't have been. This
>>>>> happens even for simple queries like "SELECT dim1 FROM s.foo GROUP BY
>>>>> dim1 ORDER BY dim1 DESC". Removing SortRemoveRule from the planner
>>>>> fixed the broken tests on my end.
>>>>>
>>>>> I dug into that a bit and saw that the call to "convert(sort.getInput(),
>>>>> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
>>>>> funny in the debugger:
>>>>>
>>>>> - The RelSubset's "traitSet" _does_ have the proper collation trait.
>>>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>>>>   _any_ collation traits.
>>>>>
>>>>> From what I understand that causes Calcite to treat the unsorted and
>>>>> sorted rels as equivalent when they in fact aren't. I'm still not sure
>>>>> if this is a Calcite bug or user error on my part… I'll keep looking
>>>>> into it unless someone has any bright ideas.
>>>>>
>>>>> fwiw, my Planner construction looks like this:
>>>>>
>>>>>   final FrameworkConfig frameworkConfig = Frameworks
>>>>>       .newConfigBuilder()
>>>>>       .parserConfig(
>>>>>           SqlParser.configBuilder()
>>>>>                    .setCaseSensitive(true)
>>>>>                    .setUnquotedCasing(Casing.UNCHANGED)
>>>>>                    .build()
>>>>>       )
>>>>>       .defaultSchema(rootSchema)
>>>>>       .traitDefs(ConventionTraitDef.INSTANCE, RelCollationTraitDef.INSTANCE)
>>>>>       .programs(Programs.ofRules(myRules))
>>>>>       .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>>>>>       .context(Contexts.EMPTY_CONTEXT)
>>>>>       .build();
>>>>>
>>>>>   return Frameworks.getPlanner(frameworkConfig);
>>>>>
>>>>> Gian
>>>>>
>>>>> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino <[email protected]> wrote:
>>>>>
>>>>>> Sure, I added those first two to the ticket.
>>>>>>
>>>>>> I don't think those are happening with (3) but I'll double check next
>>>>>> time I take a look at using the Planner.
>>>>>>
>>>>>> Gian
>>>>>>
>>>>>> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde <[email protected]> wrote:
>>>>>>
>>>>>>> Can you please add (1) and (2) to
>>>>>>> https://issues.apache.org/jira/browse/CALCITE-1525, which deals with
>>>>>>> the whole issue of using “Planner” within the JDBC driver, so we can
>>>>>>> be consistent.
>>>>>>>
>>>>>>> (3) doesn’t look likely to be related. Do your queries have UNION or
>>>>>>> other set-ops? Are you sorting on columns that do not appear in the
>>>>>>> final result?
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>> On Nov 28, 2016, at 10:45 AM, Gian Merlino <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I traveled a bit down the Frameworks/Planner road and got most of my
>>>>>>>> tests passing, but ran into some problems getting them all to work:
>>>>>>>>
>>>>>>>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>>>>>>>> Planner.validate.
>>>>>>>> It looks like CalcitePrepareImpl has some special code to handle
>>>>>>>> validation of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this
>>>>>>>> is something I should be doing on my end, or if it's a bug in
>>>>>>>> PlannerImpl.
>>>>>>>> (2) I don't see a way to do ?-style prepared statements with bound
>>>>>>>> variables, which _is_ possible with the JDBC driver route.
>>>>>>>> (3) Not sure why this is happening, but for some reason ORDER BY /
>>>>>>>> LIMIT clauses are getting ignored sometimes, even when they work with
>>>>>>>> the JDBC driver route. This may be something messed up with my rules
>>>>>>>> though and may not be Calcite's fault.
>>>>>>>>
>>>>>>>> Julian, do any of these look like bugs that should be raised in jira,
>>>>>>>> or are they just stuff I should be dealing with on my side?
>>>>>>>>
>>>>>>>> Btw, I do like that the Frameworks/Planner route gives me back the
>>>>>>>> RelNode itself, since that means I can make the Druid queries directly
>>>>>>>> without needing to go through the extra layers of the JDBC driver.
>>>>>>>> That part is nice.
>>>>>>>>
>>>>>>>> Gian
>>>>>>>>
>>>>>>>> On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I don’t know how it’s used outside Calcite. Maybe some others can
>>>>>>>>> chime in.
>>>>>>>>>
>>>>>>>>> Thanks for the PR. I logged
>>>>>>>>> https://issues.apache.org/jira/browse/CALCITE-1509 for it, and will
>>>>>>>>> commit shortly.
>>>>>>>>>
>>>>>>>>> Julian
>>>>>>>>>
>>>>>>>>>> On Nov 23, 2016, at 12:32 PM, Gian Merlino <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Do you know examples of projects that use Planner or PlannerImpl
>>>>>>>>>> currently (from "outside")? As far as I can tell, within Calcite
>>>>>>>>>> itself it's only used in test code.
>>>>>>>>>> Maybe that'd be a better entry point.
>>>>>>>>>>
>>>>>>>>>> In the meantime I raised a PR here for allowing a convertlet table
>>>>>>>>>> override in a CalcitePrepareImpl:
>>>>>>>>>> https://github.com/apache/calcite/pull/330. That was enough to get
>>>>>>>>>> the JDBC driver on my end to behave how I want it to.
>>>>>>>>>>
>>>>>>>>>> Gian
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I was wrong earlier… FrameworkConfig already has a
>>>>>>>>>>> getConvertletTable method. But regarding using FrameworkConfig from
>>>>>>>>>>> within the JDBC driver, it’s complicated. FrameworkConfig only
>>>>>>>>>>> works if you are “outside” Calcite, whereas CalcitePrepare is when
>>>>>>>>>>> you are customizing from the inside, and sadly CalcitePrepare does
>>>>>>>>>>> not use a FrameworkConfig.
>>>>>>>>>>>
>>>>>>>>>>> Compare and contrast:
>>>>>>>>>>> * CalcitePrepareImpl.getSqlToRelConverter [
>>>>>>>>>>>   https://github.com/apache/calcite/blob/3f92157d5742dd10f3b828d22d7a753e0a2899cc/core/src/main/java/org/apache/calcite/prepare/CalcitePrepareImpl.java#L1114 ]
>>>>>>>>>>> * PlannerImpl.rel [
>>>>>>>>>>>   https://github.com/apache/calcite/blob/105bba1f83cd9631e8e1211d262e4886a4a863b7/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java#L225 ]
>>>>>>>>>>>
>>>>>>>>>>> The latter uses a convertletTable sourced from a FrameworkConfig.
>>>>>>>>>>>
>>>>>>>>>>> The ideal thing would be to get CalcitePrepareImpl to use a
>>>>>>>>>>> PlannerImpl to do its dirty work. Then “inside” and “outside” would
>>>>>>>>>>> work the same. Would definitely appreciate that as a patch.
>>>>>>>>>>>
>>>>>>>>>>> If you choose to go the JDBC driver route, you could override
>>>>>>>>>>> Driver.createPrepareFactory to produce a sub-class of
>>>>>>>>>>> CalcitePrepare that works for your environment, one with an
>>>>>>>>>>> explicit convertletTable rather than just using the default.
>>>>>>>>>>>
>>>>>>>>>>> Julian
>>>>>>>>>>>
>>>>>>>>>>>> On Nov 17, 2016, at 5:01 PM, Gian Merlino <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Julian,
>>>>>>>>>>>>
>>>>>>>>>>>> If the convertlets were customizable with a FrameworkConfig, how
>>>>>>>>>>>> would I use that to configure the JDBC driver (given that I'm
>>>>>>>>>>>> doing it with the code upthread)? Or would that suggest using a
>>>>>>>>>>>> different approach to embedding Calcite?
>>>>>>>>>>>>
>>>>>>>>>>>> Gian
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Nov 17, 2016 at 4:02 PM, Julian Hyde <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Convertlets have a similar effect to planner rules (albeit they
>>>>>>>>>>>>> act on scalar expressions, not relational expressions), so people
>>>>>>>>>>>>> should be able to change the set of active convertlets.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would you like to propose a change that makes the convertlet
>>>>>>>>>>>>> table pluggable? Maybe as part of FrameworkConfig? Regardless,
>>>>>>>>>>>>> please log a JIRA to track this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> And by the way, RexImpTable, which defines how operators are
>>>>>>>>>>>>> implemented by generating java code, should also be pluggable.
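Julian's suggestion of overriding Driver.createPrepareFactory might look roughly like the sketch below. The exact override point for the convertlet table is an assumption (it depends on the Calcite version and on the convertlet-table PR mentioned upthread), and `MyConvertletTable` is a hypothetical class:

```java
// Sketch only: a Driver subclass whose CalcitePrepare uses a custom
// convertlet table instead of the standard one. The createConvertletTable
// hook shown here is hypothetical; check your Calcite version for the
// actual extension point.
public class MyDriver extends org.apache.calcite.jdbc.Driver {
  @Override
  protected Function0<CalcitePrepare> createPrepareFactory() {
    return () -> new CalcitePrepareImpl() {
      @Override
      protected SqlRexConvertletTable createConvertletTable() {
        return new MyConvertletTable();  // hypothetical custom table
      }
    };
  }
}
```

Registering `MyDriver` instead of the stock driver would then make every `jdbc:calcite:` connection it produces use the custom table.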
>>>>>>>>>>>>> It’s been on my mind for a long time to allow the “engine” —
>>>>>>>>>>>>> related to the data format, and how code is generated to access
>>>>>>>>>>>>> fields and evaluate expressions and operators — to be pluggable.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding whether the JDBC driver is the right way to embed
>>>>>>>>>>>>> Calcite: there’s no easy answer. You might want to embed Calcite
>>>>>>>>>>>>> as a library in your own server (as Drill and Hive do). Or you
>>>>>>>>>>>>> might want to make yourself just an adapter that runs inside a
>>>>>>>>>>>>> Calcite JDBC server (as the CSV adapter does). Or something in
>>>>>>>>>>>>> the middle, like what Phoenix does: using Calcite for JDBC, SQL,
>>>>>>>>>>>>> planning, but with your own metadata and runtime engine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As long as you build the valuable stuff into planner rules, new
>>>>>>>>>>>>> relational operators (if necessary) and use the schema SPI, you
>>>>>>>>>>>>> should be able to change packaging in the future.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Julian
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Nov 17, 2016, at 1:59 PM, Gian Merlino <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey Calcites,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm working on embedding Calcite into Druid (http://druid.io/,
>>>>>>>>>>>>>> https://github.com/druid-io/druid/pull/3682) and am running into
>>>>>>>>>>>>>> a problem that is making me wonder if the approach I'm using
>>>>>>>>>>>>>> makes sense.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Consider the expression EXTRACT(YEAR FROM __time).
>>>>>>>>>>>>>> Calcite has a standard convertlet rule "convertExtract" that
>>>>>>>>>>>>>> changes this into some arithmetic on __time casted to an int
>>>>>>>>>>>>>> type. But Druid has some builtin functions to do this, and I'd
>>>>>>>>>>>>>> rather use those than arithmetic (for a bunch of reasons).
>>>>>>>>>>>>>> Ideally, in my RelOptRules that convert Calcite rels to Druid
>>>>>>>>>>>>>> queries, I'd see the EXTRACT as a normal RexCall with the time
>>>>>>>>>>>>>> flag and an expression to apply it to. That's a lot easier to
>>>>>>>>>>>>>> translate than the arithmetic stuff, which I'd have to pattern
>>>>>>>>>>>>>> match and undo first before translating.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So the problem I have is that I want to disable convertExtract,
>>>>>>>>>>>>>> but I don't see a way to do that or to swap out the convertlet
>>>>>>>>>>>>>> table.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The code I'm using to set up a connection is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   public CalciteConnection createCalciteConnection(
>>>>>>>>>>>>>>       final DruidSchema druidSchema
>>>>>>>>>>>>>>   ) throws SQLException
>>>>>>>>>>>>>>   {
>>>>>>>>>>>>>>     final Properties props = new Properties();
>>>>>>>>>>>>>>     props.setProperty("caseSensitive", "true");
>>>>>>>>>>>>>>     props.setProperty("unquotedCasing", "UNCHANGED");
>>>>>>>>>>>>>>     final Connection connection =
>>>>>>>>>>>>>>         DriverManager.getConnection("jdbc:calcite:", props);
>>>>>>>>>>>>>>     final CalciteConnection calciteConnection =
>>>>>>>>>>>>>>         connection.unwrap(CalciteConnection.class);
>>>>>>>>>>>>>>     calciteConnection.getRootSchema().setCacheEnabled(false);
>>>>>>>>>>>>>>     calciteConnection.getRootSchema().add(DRUID_SCHEMA_NAME, druidSchema);
>>>>>>>>>>>>>>     return calciteConnection;
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This CalciteConnection is then used by the Druid HTTP server to
>>>>>>>>>>>>>> offer a SQL API.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there some way to swap out the convertlet table that I'm
>>>>>>>>>>>>>> missing?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, just in general, am I going about this the right way? Is
>>>>>>>>>>>>>> using the JDBC driver the right way to embed Calcite? Or should
>>>>>>>>>>>>>> I be calling into it at some lower level?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Gian
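For reference, the kind of convertlet-table override discussed in this thread could be sketched as a delegating table. `DruidConvertletTable` is a hypothetical name, and this assumes the table has been made pluggable (as in the PR mentioned upthread); the handling of EXTRACT's interval-qualifier operand may need more care than shown:

```java
// Sketch only: delegate everything to the standard convertlet table, but
// leave EXTRACT as an ordinary RexCall instead of letting convertExtract
// expand it into arithmetic, so downstream rules can pattern-match it.
public class DruidConvertletTable implements SqlRexConvertletTable {
  @Override
  public SqlRexConvertlet get(SqlCall call) {
    if (call.getOperator() == SqlStdOperatorTable.EXTRACT) {
      return (cx, extractCall) -> {
        final List<RexNode> operands = new ArrayList<>();
        for (SqlNode operand : extractCall.getOperandList()) {
          operands.add(cx.convertExpression(operand));
        }
        // Rebuild the call as-is rather than expanding it.
        return cx.getRexBuilder().makeCall(
            extractCall.getOperator(), operands);
      };
    }
    return StandardConvertletTable.INSTANCE.get(call);
  }
}
```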
