Are you running into some variant of the problems that inspired
https://issues.apache.org/jira/browse/CALCITE-819:
<https://issues.apache.org/jira/browse/CALCITE-819:> at the root of the tree,
columns that are not projected are removed, and if the desired sort order
involves non-projected columns, the desired sort order is forgotten).
> On Dec 14, 2016, at 11:19 AM, Gian Merlino <g...@imply.io> wrote:
>
> Ah, thanks. So if that sort of thing is not a smoking gun, do you have an
> idea about where I should look next? If not I'll keep poking around.
>
> Gian
>
> On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde <jh...@apache.org> wrote:
>
>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>> _any_ collation traits.
>>
>> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
>> logically and physically equivalent (same results, same physical
>> properties) whereas a “set” (RelSet) is a collection of RelNodes that are
>> logically equivalent.
>>
>> A set can therefore be considered to be a collection of subsets, each of
>> which contains RelNodes. And it used to be implemented that way, but in
>> https://issues.apache.org/jira/browse/CALCITE-88 <
>> https://issues.apache.org/jira/browse/CALCITE-88> we introduced collation
>> as a trait, and that made subsets non-disjoint (a RelNode can be sorted on
>> (x, y), and also on (x), and also on (), and also on (z)) so we made
>> RelSubset just a view onto a RelSet, filtering the list of RelNodes
>> according to the ones that have (“subsume”) the desired traits.
>>
>> Julian
>>
>>
>>> On Dec 14, 2016, at 10:45 AM, Gian Merlino <g...@imply.io> wrote:
>>>
>>> I spent some more time looking into (3) and found that when I had things
>>> going through the Planner rather than the JDBC driver, SortRemoveRule was
>>> removing sorts when it shouldn't have been. This happens even for simple
>>> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
>>> DESC". Removing SortRemoveRule from the planner fixed the broken tests on
>>> my end.
>>>
>>> I dug into that a bit and saw that the call to "convert(sort.getInput(),
>>> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
>>> funny in the debugger:
>>>
>>> - The RelSubset's "traitSet" _does_ have the proper collation trait.
>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>> _any_ collation traits.
>>>
>>> From what I understand that causes Calcite to treat the unsorted and
>> sorted
>>> rels as equivalent when they in fact aren't. I'm still not sure if this
>> is
>>> a Calcite bug or user error on my part… I'll keep looking into it unless
>>> someone has any bright ideas.
>>>
>>> fwiw, my Planner construction looks like this:
>>>
>>> final FrameworkConfig frameworkConfig = Frameworks
>>> .newConfigBuilder()
>>> .parserConfig(
>>> SqlParser.configBuilder()
>>> .setCaseSensitive(true)
>>> .setUnquotedCasing(Casing.UNCHANGED)
>>> .build()
>>> )
>>> .defaultSchema(rootSchema)
>>> .traitDefs(ConventionTraitDef.INSTANCE,
>>> RelCollationTraitDef.INSTANCE)
>>> .programs(Programs.ofRules(myRules))
>>> .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>>> .context(Contexts.EMPTY_CONTEXT)
>>> .build();
>>>
>>> return Frameworks.getPlanner(frameworkConfig);
>>>
>>> Gian
>>>
>>> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino <g...@imply.io> wrote:
>>>
>>>> Sure, I added those first two to the ticket.
>>>>
>>>> I don't think those are happening with (3) but I'll double check next
>> time
>>>> I take a look at using the Planner.
>>>>
>>>> Gian
>>>>
>>>> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde <jh...@apache.org> wrote:
>>>>
>>>>> Can you please add (1) and (2) to https://issues.apache.org/jira
>>>>> /browse/CALCITE-1525 <https://issues.apache.org/jir
>> a/browse/CALCITE-1525>,
>>>>> which deals with the whole issue of using “Planner” within the JDBC
>> driver,
>>>>> so we can be consistent.
>>>>>
>>>>> (3) doesn’t look likely to be related. Do your queries have UNION or
>>>>> other set-ops? Are you sorting on columns that do not appear in the
>> final
>>>>> result?
>>>>>
>>>>> Julian
>>>>>
>>>>>
>>>>>> On Nov 28, 2016, at 10:45 AM, Gian Merlino <g...@imply.io> wrote:
>>>>>>
>>>>>> I traveled a bit down the Frameworks/Planner road and got most of my
>>>>> tests
>>>>>> passing, but ran into some problems getting them all to work:
>>>>>>
>>>>>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>>>>> Planner.validate.
>>>>>> It looks like CalcitePrepareImpl has some special code to handle
>>>>> validation
>>>>>> of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is
>> something I
>>>>>> should be doing on my end, or if it's a bug in PlannerImpl.
>>>>>> (2) I don't see a way to do ?-style prepared statements with bound
>>>>>> variables, which _is_ possible with the JDBC driver route.
>>>>>> (3) Not sure why this is happening, but for some reason ORDER BY /
>> LIMIT
>>>>>> clauses are getting ignored sometimes, even when they work with the
>> JDBC
>>>>>> driver route. This may be something messed up with my rules though and
>>>>> may
>>>>>> not be Calcite's fault.
>>>>>>
>>>>>> Julian, do any of these look like bugs that should be raised in jira,
>> or
>>>>>> are they just stuff I should be dealing with on my side?
>>>>>>
>>>>>> Btw, I do like that the Frameworks/Planner route gives me back the
>>>>> RelNode
>>>>>> itself, since that means I can make the Druid queries directly without
>>>>>> needing to go through the extra layers of the JDBC driver. That part
>> is
>>>>>> nice.
>>>>>>
>>>>>> Gian
>>>>>>
>>>>>> On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde <jh...@apache.org>
>> wrote:
>>>>>>
>>>>>>> I don’t know how it’s used outside Calcite. Maybe some others can
>>>>> chime in.
>>>>>>>
>>>>>>> Thanks for the PR. I logged https://issues.apache.org/jira
>>>>>>> /browse/CALCITE-1509 <https://issues.apache.org/jir
>>>>> a/browse/CALCITE-1509>
>>>>>>> for it, and will commit shortly.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>> On Nov 23, 2016, at 12:32 PM, Gian Merlino <g...@imply.io> wrote:
>>>>>>>>
>>>>>>>> Do you know examples of projects that use Planner or PlannerImpl
>>>>>>> currently
>>>>>>>> (from "outside")? As far as I can tell, within Calcite itself it's
>>>>> only
>>>>>>>> used in test code. Maybe that'd be a better entry point.
>>>>>>>>
>>>>>>>> In the meantime I raised a PR here for allowing a convertlet table
>>>>>>> override
>>>>>>>> in a CalcitePrepareImpl: https://github.com/apache/calcite/pull/330
>> .
>>>>>>> That
>>>>>>>> was enough to get the JDBC driver on my end to behave how I want it
>>>>> to.
>>>>>>>>
>>>>>>>> Gian
>>>>>>>>
>>>>>>>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde <jh...@apache.org>
>>>>> wrote:
>>>>>>>>
>>>>>>>>> I was wrong earlier… FrameworkConfig already has a
>> getConvertletTable
>>>>>>>>> method. But regarding using FrameworkConfig from within the JDBC
>>>>> driver,
>>>>>>>>> It’s complicated. FrameworkConfig only works if you are “outside”
>>>>>>> Calcite,
>>>>>>>>> whereas CalcitePrepare is when you are customizing from the inside,
>>>>> and
>>>>>>>>> sadly CalcitePrepare does not use a FrameworkConfig.
>>>>>>>>>
>>>>>>>>> Compare and contrast:
>>>>>>>>> * CalcitePrepareImpl.getSqlToRelConverter [
>>>>> https://github.com/apache/
>>>>>>>>> calcite/blob/3f92157d5742dd10f3b828d22d7a753e0a2899cc/core/
>>>>>>> src/main/java/
>>>>>>>>> org/apache/calcite/prepare/CalcitePrepareImpl.java#L1114 <
>>>>>>>>> https://github.com/apache/calcite/blob/3f92157d5742dd10f3b82
>>>>> 8d22d7a75
>>>>>>>>> 3e0a2899cc/core/src/main/java/org/apache/calcite/prepare/
>>>>>>>>> CalcitePrepareImpl.java#L1114> ]
>>>>>>>>> * PlannerImpl.rel [ https://github.com/apache/calcite/blob/
>>>>>>>>> 105bba1f83cd9631e8e1211d262e4886a4a863b7/core/src/main/java/
>>>>>>>>> org/apache/calcite/prepare/PlannerImpl.java#L225 <
>>>>>>>>> https://github.com/apache/calcite/blob/105bba1f83cd9631e8e12
>>>>> 11d262e48
>>>>>>>>> 86a4a863b7/core/src/main/java/org/apache/calcite/prepare/
>>>>>>>>> PlannerImpl.java#L225> ]
>>>>>>>>>
>>>>>>>>> The latter uses a convertletTable sourced from a FrameworkConfig.
>>>>>>>>>
>>>>>>>>> The ideal thing would be to get CalcitePrepareImpl to use a
>>>>> PlannerImpl
>>>>>>> to
>>>>>>>>> do its dirty work. Then “inside” and “outside” would work the same.
>>>>>>> Would
>>>>>>>>> definitely appreciate that as a patch.
>>>>>>>>>
>>>>>>>>> If you choose to go the JDBC driver route, you could override
>>>>>>>>> Driver.createPrepareFactory to produce a sub-class of
>> CalcitePrepare
>>>>>>> that
>>>>>>>>> works for your environment, one with an explicit convertletTable
>>>>> rather
>>>>>>>>> than just using the default.
>>>>>>>>>
>>>>>>>>> Julian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Nov 17, 2016, at 5:01 PM, Gian Merlino <g...@imply.io> wrote:
>>>>>>>>>>
>>>>>>>>>> Hey Julian,
>>>>>>>>>>
>>>>>>>>>> If the convertlets were customizable with a FrameworkConfig, how
>>>>> would
>>>>>>> I
>>>>>>>>>> use that configure the JDBC driver (given that I'm doing it with
>> the
>>>>>>> code
>>>>>>>>>> upthread)? Or would that suggest using a different approach to
>>>>>>> embedding
>>>>>>>>>> Calcite?
>>>>>>>>>>
>>>>>>>>>> Gian
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 17, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org>
>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Convertlets have a similar effect to planner rules (albeit they
>>>>> act on
>>>>>>>>>>> scalar expressions, not relational expressions) so people should
>> be
>>>>>>>>> able to
>>>>>>>>>>> change the set of active convertlets.
>>>>>>>>>>>
>>>>>>>>>>> Would you like to propose a change that makes the convertlet
>> table
>>>>>>>>>>> pluggable? Maybe as part of FrameworkConfig? Regardless, please
>>>>> log a
>>>>>>>>> JIRA
>>>>>>>>>>> to track this.
>>>>>>>>>>>
>>>>>>>>>>> And by the way, RexImpTable, which defines how operators are
>>>>>>> implemented
>>>>>>>>>>> by generating java code, should also be pluggable. It’s been on
>> my
>>>>>>> mind
>>>>>>>>> for
>>>>>>>>>>> a long time to allow the “engine” — related to the data format,
>> and
>>>>>>> how
>>>>>>>>>>> code is generated to access fields and evaluate expressions and
>>>>>>>>> operators —
>>>>>>>>>>> to be pluggable.
>>>>>>>>>>>
>>>>>>>>>>> Regarding whether the JDBC driver is the right way to embed
>>>>> Calcite.
>>>>>>>>>>> There’s no easy answer. You might want to embed Calcite as a
>>>>> library
>>>>>>> in
>>>>>>>>>>> your own server (as Drill and Hive do). Or you might want to make
>>>>>>>>> yourself
>>>>>>>>>>> just an adapter that runs inside a Calcite JDBC server (as the
>> CSV
>>>>>>>>> adapter
>>>>>>>>>>> does). Or something in the middle, like what Phoenix does: using
>>>>>>> Calcite
>>>>>>>>>>> for JDBC, SQL, planning, but with your own metadata and runtime
>>>>>>> engine.
>>>>>>>>>>>
>>>>>>>>>>> As long as you build the valuable stuff into planner rules, new
>>>>>>>>> relational
>>>>>>>>>>> operators (if necessary) and use the schema SPI, you should be
>>>>> able to
>>>>>>>>>>> change packaging in the future.
>>>>>>>>>>>
>>>>>>>>>>> Julian
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Nov 17, 2016, at 1:59 PM, Gian Merlino <g...@imply.io>
>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Calcites,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm working on embedding Calcite into Druid (http://druid.io/,
>>>>>>>>>>>> https://github.com/druid-io/druid/pull/3682) and am running
>> into
>>>>> a
>>>>>>>>>>> problem
>>>>>>>>>>>> that is making me wonder if the approach I'm using makes sense.
>>>>>>>>>>>>
>>>>>>>>>>>> Consider the expression EXTRACT(YEAR FROM __time). Calcite has a
>>>>>>>>> standard
>>>>>>>>>>>> convertlet rule "convertExtract" that changes this into some
>>>>>>> arithmetic
>>>>>>>>>>> on
>>>>>>>>>>>> __time casted to an int type. But Druid has some builtin
>>>>> functions to
>>>>>>>>> do
>>>>>>>>>>>> this, and I'd rather use those than arithmetic (for a bunch of
>>>>>>>>> reasons).
>>>>>>>>>>>> Ideally, in my RelOptRules that convert Calcite rels to Druid
>>>>>>> queries,
>>>>>>>>>>> I'd
>>>>>>>>>>>> see the EXTRACT as a normal RexCall with the time flag and an
>>>>>>>>> expression
>>>>>>>>>>> to
>>>>>>>>>>>> apply it to. That's a lot easier to translate than the
>> arithmetic
>>>>>>>>> stuff,
>>>>>>>>>>>> which I'd have to pattern match and undo first before
>> translating.
>>>>>>>>>>>>
>>>>>>>>>>>> So the problem I have is that I want to disable convertExtract,
>>>>> but I
>>>>>>>>>>> don't
>>>>>>>>>>>> see a way to do that or to swap out the convertlet table.
>>>>>>>>>>>>
>>>>>>>>>>>> The code I'm using to set up a connection is:
>>>>>>>>>>>>
>>>>>>>>>>>> public CalciteConnection createCalciteConnection(
>>>>>>>>>>>> final DruidSchema druidSchema
>>>>>>>>>>>> ) throws SQLException
>>>>>>>>>>>> {
>>>>>>>>>>>> final Properties props = new Properties();
>>>>>>>>>>>> props.setProperty("caseSensitive", "true");
>>>>>>>>>>>> props.setProperty("unquotedCasing", "UNCHANGED");
>>>>>>>>>>>> final Connection connection =
>>>>>>>>>>>> DriverManager.getConnection("jdbc:calcite:", props);
>>>>>>>>>>>> final CalciteConnection calciteConnection =
>>>>>>>>>>>> connection.unwrap(CalciteConnection.class);
>>>>>>>>>>>> calciteConnection.getRootSchema().setCacheEnabled(false);
>>>>>>>>>>>> calciteConnection.getRootSchema().add(DRUID_SCHEMA_NAME,
>>>>>>>>>>> druidSchema);
>>>>>>>>>>>> return calciteConnection;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> This CalciteConnection is then used by the Druid HTTP server to
>>>>>>> offer a
>>>>>>>>>>> SQL
>>>>>>>>>>>> API.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there some way to swap out the convertlet table that I'm
>>>>> missing?
>>>>>>>>>>>>
>>>>>>>>>>>> Also, just in general, am I going about this the right way? Is
>>>>> using
>>>>>>>>> the
>>>>>>>>>>>> JDBC driver the right way to embed Calcite? Or should I be
>> calling
>>>>>>> into
>>>>>>>>>>> it
>>>>>>>>>>>> at some lower level?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Gian
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>>