Re: Using Calcite with Python

Nicola Vitucci Mon, 31 Jan 2022 15:32:09 -0800

Hi Eugen, Michael, Gavin,

Thank you very much for your input. In response to your suggestions:


- Phoenix client: I saw it but decided not to use it because it does not
seem very active or up to date (its Avatica version is 1.10, while the
latest is 1.20). I may still give it a try though.
- Arrow Flight: I think it could be very useful, especially if, as Michael
mentioned, it were integrated with Avatica as a transport; at the
moment, though, it is not.

I am basically looking for a solution that is (relatively) easy to
implement, easy to keep up to date, and reasonably performant. Although it
incurs some overhead, a solution based on Python + Java seems to me the most
reasonable for the time being. Do you have any other suggestions or
recommendations?
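
For concreteness, here is a minimal sketch of the Python + Java route via
JDBC. It assumes the jaydebeapi package is installed and that the Calcite
JARs are available locally; the helper names, the model path, and the JAR
list are placeholders chosen for illustration and are not taken from the
notebooks.

```python
from typing import Any, List, Sequence, Tuple

# Class name of Calcite's JDBC driver.
CALCITE_DRIVER = "org.apache.calcite.jdbc.Driver"


def calcite_jdbc_url(model_path: str) -> str:
    """Build a Calcite JDBC URL pointing at a model file (hypothetical helper)."""
    return f"jdbc:calcite:model={model_path}"


def query_calcite(
    model_path: str, sql: str, jars: Sequence[str]
) -> Tuple[List[str], List[Any]]:
    """Run one query through the Calcite JDBC driver and return (columns, rows).

    jaydebeapi is imported lazily so that this module can be loaded even
    on a machine without a JVM; the JVM itself is started by jaydebeapi
    (through jpype) on the first connect() call.
    """
    import jaydebeapi  # assumption: installed separately, e.g. via pip

    conn = jaydebeapi.connect(CALCITE_DRIVER, calcite_jdbc_url(model_path), jars=list(jars))
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            # DB-API: the first item of each description entry is the column name.
            columns = [entry[0] for entry in cursor.description]
            return columns, cursor.fetchall()
        finally:
            cursor.close()
    finally:
        conn.close()
```

Every row crosses the JVM/Python boundary as individual Python objects here,
which is where the overhead I mentioned comes from.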

Thanks again,

Nicola



On Mon, 31 Jan 2022 at 17:04, Michael Mior <[email protected]> wrote:

> Flight is definitely another consideration for the future. Personally I
> think it would be most interesting to integrate Flight with Avatica as an
> alternative transport. But it would certainly also be useful to allow the
> Arrow adapter to connect to any Flight endpoint.
>
> --
> Michael Mior
> [email protected]
>
>
> On Mon, 31 Jan 2022 at 10:00, Gavin Ray <[email protected]> wrote:
>
> > This is really interesting stuff you've done in the example notebooks
> >
> > Nicola & Michael, I wonder if you could benefit from the recently-released
> > Arrow Flight SQL?
> >
> > https://www.dremio.com/subsurface/arrow-flight-and-arrow-flight-sql-accelerating-data-movement/
> >
> > I have asked Jacques about this a bit -- it's meant to be a standardization
> > for communicating SQL queries and metadata with Arrow.
> > I'm not intimately familiar with it, but it seems like it could be a good
> > base to build a Calcite backend for Arrow from?
> >
> > They have a pretty thorough Java example in the repository:
> >
> > https://github.com/apache/arrow/blob/968e6ea488c939c0e1f2bfe339a5a9ed1aed603e/java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlExample.java#L169-L180
> >
> > On Mon, Jan 31, 2022 at 8:47 AM Michael Mior <[email protected]> wrote:
> >
> > > You may want to keep an eye on CALCITE-2040 (
> > > https://issues.apache.org/jira/browse/CALCITE-2040). I have a student who
> > > is working on a Calcite adapter for Apache Arrow. We're basically hung up
> > > waiting on the Arrow team to release a compatible JAR. This still won't
> > > fully solve your problem though as the first version of the adapter is only
> > > capable of reading from Arrow files. However, the goal is eventually to
> > > allow passing a memory reference into the adapter so that it would be
> > > possible to make use of Arrow data which is constructed in-memory
> > > elsewhere.
> > > --
> > > Michael Mior
> > > [email protected]
> > >
> > >
> > > On Sun, 30 Jan 2022 at 17:36, Nicola Vitucci <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > What would be the best way to use Calcite with Python? I've come up with
> > > > two potential solutions:
> > > >
> > > > - using the jaydebeapi package, to connect via the JDBC driver directly
> > > > from a JVM created via jpype;
> > > > - using Apache Arrow via the pyarrow package, to connect in basically the
> > > > same way but creating Arrow objects with JdbcToArrowUtils (and optionally
> > > > converting them to Pandas).
> > > >
> > > > Although the former is more straightforward, the latter achieves
> > > > better performance (see [1] for instance), since that is exactly what
> > > > Arrow is for. I've created two Jupyter notebooks [2], one for each
> > > > solution. What would you recommend? Is there an even better approach?
> > > >
> > > > Thanks,
> > > >
> > > > Nicola
> > > >
> > > > [1] https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html
> > > > [2] https://github.com/nvitucci/calcite-sparql/tree/v0.0.2/examples/python
> > > >
> > >
> >
>
