I hear some reasonable concerns about locking us into Calcite JDBC. I
agree that there are quite a few unknowns about whether it would work at all.

I didn't think of option three, thanks for the suggestion, Anton! I think
building JDBC from scratch would be a large, tedious project. We should
leverage whatever libraries we can to avoid the protocol compatibility work
on that front. However, I do agree that rewriting the layer between Calcite
Avatica and the Calcite Planner might be the right path forward. I've stuck
to option one for now, knowing that we will revisit options two and three in
a few months once some of our other feature concerns have a clear path
forward.
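
For anyone who has not looked at the unwrap path from option two, here is a
minimal sketch of the standard java.sql.Wrapper pattern it relies on. It is
not code from the pull request: PlanNode, UnwrapSketch, fakeStatement and
extractPlan are hypothetical names, and the statement is a dynamic proxy
standing in for a real driver so the example runs on the JDK alone.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical stand-in for a planner result such as a BeamRelNode. */
final class PlanNode {
  final String sql;

  PlanNode(String sql) {
    this.sql = sql;
  }
}

final class UnwrapSketch {
  /**
   * Builds a fake PreparedStatement (assumption: a real driver would return
   * its own implementation) that only answers isWrapperFor and unwrap.
   */
  static PreparedStatement fakeStatement(String sql) {
    PlanNode plan = new PlanNode(sql);
    return (PreparedStatement) Proxy.newProxyInstance(
        UnwrapSketch.class.getClassLoader(),
        new Class<?>[] {PreparedStatement.class},
        (proxy, method, args) -> {
          switch (method.getName()) {
            case "isWrapperFor":
              // True only when the requested type matches the held plan.
              return ((Class<?>) args[0]).isInstance(plan);
            case "unwrap":
              Class<?> target = (Class<?>) args[0];
              if (target.isInstance(plan)) {
                return plan;
              }
              throw new SQLException("Cannot unwrap to " + target.getName());
            default:
              // Everything else is deliberately left unimplemented here.
              throw new UnsupportedOperationException(method.getName());
          }
        });
  }

  /** Caller side: check first, then unwrap, so failures are explicit. */
  static PlanNode extractPlan(PreparedStatement statement) throws SQLException {
    if (!statement.isWrapperFor(PlanNode.class)) {
      throw new SQLException("Statement does not expose a PlanNode");
    }
    return statement.unwrap(PlanNode.class);
  }
}
```

Calling UnwrapSketch.extractPlan(UnwrapSketch.fakeStatement("SELECT 1"))
returns the PlanNode held by the statement. The point is only that the plan
leaves through the side of the JDBC interface rather than through a result
set, which is the part of option two that feels convoluted.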

Pull request is here: https://github.com/apache/beam/pull/5399

Andrew

On Wed, May 16, 2018 at 10:32 AM Anton Kedin <ke...@google.com> wrote:

> Among these options I would lean towards option 1. We already support a
> lot of infrastructure to call into Calcite for the non-JDBC path, so adding
> some code to generate config does not seem like a big deal, especially
> if it will be a supported way at some point in Calcite.
>
> Pulling the implementation RelNode out of JDBC seems to bring a lot more
> unknowns:
>  - it feels like it goes against the JDBC approach, as we're basically
> going around JDBC result sets;
>  - we will expose 2 ways to extract results, with different schemas,
> types, etc.
>
> I think the third option is to implement the JDBC driver ourselves without
> using Calcite JDBC infrastructure. This way we have a single path into
> Calcite and control everything. I don't know how much effort it would take
> to implement a functional JDBC driver to cover our use cases, but I think
> it's on a similar order of magnitude, as we don't have to implement a lot
> of the API in the beginning, e.g. transactions, cursors, DML.
>
>
> On Wed, May 16, 2018 at 10:15 AM Kenneth Knowles <k...@google.com> wrote:
>
>> IIUC in #2 Beam SQL would live on the other side of a JDBC boundary from
>> any use of it (including the BeamSQL transform). I'm a bit worried we'll
>> have a problem plumbing all the info we need, either now or later,
>> especially if we make funky extensions to support our version of SQL.
>>
>> Kenn
>>
>> On Wed, May 16, 2018 at 10:08 AM Andrew Pilloud <apill...@google.com>
>> wrote:
>>
>>> I'm currently adding JDBC support to Beam SQL! Unfortunately Calcite has
>>> two distinct entry points, one for JDBC and one for everything else (see
>>> CALCITE-1525). Eventually that will change, but I'd like to avoid having
>>> two versions of Beam SQL until Calcite converges on a single path for
>>> parsing SQL. Here are the options I am looking at:
>>>
>>> 1. Make JDBC the source of truth for Calcite config and state. Generate
>>> a FrameworkConfig based on the JDBC connection and continue to use the
>>> non-JDBC interface to Calcite. This option comes with the risk that the two
>>> paths into Calcite will diverge (as there is a bunch of code copied from
>>> Calcite to generate the config), but is the easiest to implement and
>>> understand.
>>>
>>> 2. Make JDBC the only path into Calcite. Use prepareStatement and unwrap
>>> to extract a BeamRelNode out of the JDBC interface. This eliminates a
>>> significant amount of code in Beam, but the unwrap path is a little
>>> convoluted.
>>>
>>> Both options leave the user-facing non-JDBC interface to Beam SQL
>>> unchanged; these changes are internal.
>>>
>>> Andrew
>>>
>>
