Re: Building a Calcite Adapter

Jon Pither Fri, 01 May 2020 09:42:21 -0700

That last plan should read:

CruxToEnumerableConverter
  CruxProject(NAME=[$1], EXPR$1=[*(2, $3)])
    CruxTableScan(table=[[crux, PERSON]])


The point being that it would great to get EnumerableCalc going.

I did consult the JDBC adapter and saw it makes use of a JDBCCalc rule -
and I attempted to get a similar rule firing for us - but no joy (I also
couldn't find an example query plan of JdbcCalc in action).

On Fri, 1 May 2020 at 17:39, Jon Pither <[email protected]> wrote:

> Hi,
>
> Thanks for bearing with.
>
> I'm stuck as to when EnumerableCalc fires vs Project.
>
> We added a Project rule because it's a) desirable so we only pull back
> fields we need from the underlying db, and b) we needed it for inner join
> joins to be pushed down to our DB.
>
> That is to say that with us implementing a LogicalProject, our query plan
> for "SELECT PERSON.NAME, PLANET.NAME AS PLANET FROM PERSON INNER JOIN
> PLANET ON PERSON.PLANET = PLANET.NAME" is:
>
>   CruxToEnumerableConverter
>     CruxProject(NAME=[$1], PLANET=[$0])
>       CruxJoin(condition=[=($2, $0)], joinType=[inner])
>         CruxProject(NAME=[$1])
>           CruxTableScan(table=[[crux, PLANET]])
>         CruxProject(NAME=[$1], PLANET=[$2])
>           CruxTableScan(table=[[crux, PERSON]])
>
> If I remove the Project rule:
>
>       EnumerableCalc(expr#0..2=[{inputs}], NAME=[$t0], PLANET=[$t2])
>         EnumerableHashJoin(condition=[=($1, $2)], joinType=[inner])
>           EnumerableCalc(expr#0..3=[{inputs}], NAME=[$t1], PLANET=[$t2])
>             CruxToEnumerableConverter
>               CruxTableScan(table=[[crux, PERSON]])
>           EnumerableCalc(expr#0..2=[{inputs}], NAME=[$t1])
>             CruxToEnumerableConverter
>               CruxTableScan(table=[[crux, PLANET]])
>
> So looks we need it for efficient joins - granted I need do to more
> learning to understand exactly why the EnumerableCalc couldn't work at the
> top level only.
>
> The problem following on from this, is that with us implementing
> LogicalProject, EnumerableCalc seems nearly taken out entirely. Therefore
> queries such as:
>
> "SELECT NAME, (2 * PERSON.AGE)  FROM PERSON
>
> Have a plan:
>
>   CruxToEnumerableConverter
>     CruxProject(NAME=[$1], FOO=[true])
>       CruxTableScan(table=[[crux, PERSON]]
>
> I'm not sure how to get EnumerableCalc firing again so that the above
> query works.
>
> Regards,
>
> Jon.
>
>
> On Fri, 1 May 2020 at 11:09, Jon Pither <[email protected]> wrote:
>
>> Thanks. Another Q:
>>
>> In Crux we store documents, and values within those documents can be
>> Clojure.lang.Keywords. I'm still deciding how to handle them with Calcite -
>> one option being simply to wash them into Strings via the enumerator, which
>> is our first pass.
>>
>> I did experiment with defining this keyword column type as
>> SqlTypeName/OTHER and letting these keywords pass through. This works for
>> Calcite (keywords are in the results returned to the client), but not -
>> understandably - for Avatica, where I get "Remote driver error:
>> RuntimeException: Unhandled type in Frame: class clojure.lang.Keyword".
>>
>> Is there a way achieve both goals - so that using Calcite in-process, the
>> keywords come through the result-set, but then we hook in via Avatica so
>> that before results are serisalised on the wire, they get converted to
>> Strings?
>>
>> Related: to allow users to filter using keywords we offer a function,
>> i.e.: SELECT NAME FROM PERSON WHERE GENDER = KEYWORD('female'), the
>> keywords then get passed down to our underlying query engine.
>>
>> Jon.
>>
>> On Thu, 30 Apr 2020 at 18:30, Julian Hyde <[email protected]> wrote:
>>
>>>
>>>
>>> > On Apr 30, 2020, at 8:56 AM, Jon Pither <[email protected]> wrote:
>>> >
>>> > We went down the route of wrapping Calcite with our own JDBC driver
>>> that
>>> > strips out the `VALIDTIME AS OF (...)` from ``VALIDTIME AS OF (...)
>>> SELECT
>>> > * FROM FOO`. We do this by overriding CalcitePrepareImpl and adding
>>> > internalParameters to the CalciteSignature, that our enumerator then
>>> uses
>>> > when executing the actual query. Any feedback on this approach is
>>> welcome
>>> > :-)
>>>
>>> Overriding the prepare process with, say, a different parser or
>>> validator is a bit messy but it sounds like you’re doing the right thing.
>>>
>>> > Another question. Our underlying DB supports datetime fields, returning
>>> > java.util.Dates back from queries for datetime columns. I'm thinking
>>> that
>>> > we ought be able to map these Dates through to a column we define using
>>> > SqlTypeName/TIMESTAMP. To get this to work though, our enumerator has
>>> to
>>> > convert our dates into millis, for Calcite to then convert them back
>>> into
>>> > java.util.Dates.. Feel like I missing something obvious to skip this
>>> > conversion?
>>>
>>> You should probably use java.sql.Date, java.sql.Timestamp etc. They are
>>> sub-classes of java.util.Date but convey information about the semantics.
>>>
>>> Our Enumerable convention decided to use Java long values (or
>>> java.lang.Long if nullable) for SQL TIMESTAMP values, and int values for
>>> DATE and TIME values. These are lighter weight than java.sql.Timestamp, and
>>> also make it less likely to inadvertently do computations in the local time
>>> zone.
>>>
>>> If the enumerator is just passing values through, then conversion to and
>>> from might seem like significant effort. But sorry, it’s just how it is.
>>>
>>> Julian
>>>
>>>

Re: Building a Calcite Adapter

Reply via email to