Re: Building a Calcite Adapter

Jon Pither Fri, 01 May 2020 09:37:56 -0700

Hi,

Thanks for bearing with.


I'm stuck as to when EnumerableCalc fires vs Project.

We added a Project rule because it's a) desirable so we only pull back
fields we need from the underlying db, and b) we needed it for inner join
joins to be pushed down to our DB.

That is to say that with us implementing a LogicalProject, our query plan
for "SELECT PERSON.NAME, PLANET.NAME AS PLANET FROM PERSON INNER JOIN
PLANET ON PERSON.PLANET = PLANET.NAME" is:

  CruxToEnumerableConverter
    CruxProject(NAME=[$1], PLANET=[$0])
      CruxJoin(condition=[=($2, $0)], joinType=[inner])
        CruxProject(NAME=[$1])
          CruxTableScan(table=[[crux, PLANET]])
        CruxProject(NAME=[$1], PLANET=[$2])
          CruxTableScan(table=[[crux, PERSON]])

If I remove the Project rule:

      EnumerableCalc(expr#0..2=[{inputs}], NAME=[$t0], PLANET=[$t2])
        EnumerableHashJoin(condition=[=($1, $2)], joinType=[inner])
          EnumerableCalc(expr#0..3=[{inputs}], NAME=[$t1], PLANET=[$t2])
            CruxToEnumerableConverter
              CruxTableScan(table=[[crux, PERSON]])
          EnumerableCalc(expr#0..2=[{inputs}], NAME=[$t1])
            CruxToEnumerableConverter
              CruxTableScan(table=[[crux, PLANET]])

So looks we need it for efficient joins - granted I need do to more
learning to understand exactly why the EnumerableCalc couldn't work at the
top level only.

The problem following on from this, is that with us implementing
LogicalProject, EnumerableCalc seems nearly taken out entirely. Therefore
queries such as:

"SELECT NAME, (2 * PERSON.AGE)  FROM PERSON

Have a plan:

  CruxToEnumerableConverter
    CruxProject(NAME=[$1], FOO=[true])
      CruxTableScan(table=[[crux, PERSON]]

I'm not sure how to get EnumerableCalc firing again so that the above query
works.

Regards,

Jon.


On Fri, 1 May 2020 at 11:09, Jon Pither <[email protected]> wrote:

> Thanks. Another Q:
>
> In Crux we store documents, and values within those documents can be
> Clojure.lang.Keywords. I'm still deciding how to handle them with Calcite -
> one option being simply to wash them into Strings via the enumerator, which
> is our first pass.
>
> I did experiment with defining this keyword column type as
> SqlTypeName/OTHER and letting these keywords pass through. This works for
> Calcite (keywords are in the results returned to the client), but not -
> understandably - for Avatica, where I get "Remote driver error:
> RuntimeException: Unhandled type in Frame: class clojure.lang.Keyword".
>
> Is there a way achieve both goals - so that using Calcite in-process, the
> keywords come through the result-set, but then we hook in via Avatica so
> that before results are serisalised on the wire, they get converted to
> Strings?
>
> Related: to allow users to filter using keywords we offer a function,
> i.e.: SELECT NAME FROM PERSON WHERE GENDER = KEYWORD('female'), the
> keywords then get passed down to our underlying query engine.
>
> Jon.
>
> On Thu, 30 Apr 2020 at 18:30, Julian Hyde <[email protected]> wrote:
>
>>
>>
>> > On Apr 30, 2020, at 8:56 AM, Jon Pither <[email protected]> wrote:
>> >
>> > We went down the route of wrapping Calcite with our own JDBC driver that
>> > strips out the `VALIDTIME AS OF (...)` from ``VALIDTIME AS OF (...)
>> SELECT
>> > * FROM FOO`. We do this by overriding CalcitePrepareImpl and adding
>> > internalParameters to the CalciteSignature, that our enumerator then
>> uses
>> > when executing the actual query. Any feedback on this approach is
>> welcome
>> > :-)
>>
>> Overriding the prepare process with, say, a different parser or validator
>> is a bit messy but it sounds like you’re doing the right thing.
>>
>> > Another question. Our underlying DB supports datetime fields, returning
>> > java.util.Dates back from queries for datetime columns. I'm thinking
>> that
>> > we ought be able to map these Dates through to a column we define using
>> > SqlTypeName/TIMESTAMP. To get this to work though, our enumerator has to
>> > convert our dates into millis, for Calcite to then convert them back
>> into
>> > java.util.Dates.. Feel like I missing something obvious to skip this
>> > conversion?
>>
>> You should probably use java.sql.Date, java.sql.Timestamp etc. They are
>> sub-classes of java.util.Date but convey information about the semantics.
>>
>> Our Enumerable convention decided to use Java long values (or
>> java.lang.Long if nullable) for SQL TIMESTAMP values, and int values for
>> DATE and TIME values. These are lighter weight than java.sql.Timestamp, and
>> also make it less likely to inadvertently do computations in the local time
>> zone.
>>
>> If the enumerator is just passing values through, then conversion to and
>> from might seem like significant effort. But sorry, it’s just how it is.
>>
>> Julian
>>
>>

Re: Building a Calcite Adapter

Reply via email to