Re: Requesting Information Regarding Data Federation

Stamatis Zampetakis Thu, 21 Jul 2022 02:26:27 -0700

Hi Pranav,

A very simple example of using Calcite for data integration can be
found here [1], along with links to presentations and other relevant material.


Apart from Apache Drill, Apache Hive is using Calcite for executing
federated queries. The main entry point is CalcitePlannerAction#apply [2]
where most of the Calcite configuration is done.

Best,
Stamatis

[1] https://github.com/zabetak/cy-calcite-tutorial
[2]
https://github.com/apache/hive/blob/834308091624c1a69cba7a8b97919ed1ff0fc616/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L1646

On Thu, Jul 21, 2022 at 2:01 AM Charles Givre <[email protected]> wrote:

> Hi Pranav,
> You might want to take a look at Apache Drill, as it uses Calcite as a
> query planner and can execute federated queries against a pretty wide
> array of data sets.
> Best,
> -- C
>
> > On Jul 20, 2022, at 6:56 PM, Pranav Deshpande <[email protected]> wrote:
> >
> > Dear Apache Calcite Team,
> > I am trying to learn Calcite and wish to build a poc for data federation.
> >
> > In the video here, https://www.youtube.com/watch?v=4JAOkLKrcYE, somehow the
> > presenter and his team managed to squash parts of the Relational Nodes into
> > "Spark Tables" and then Spark handled the execution of those.
> >
> > How exactly do I go about doing this?
> >
> > As per this discussion I understand that one has to create a RelOptRule to
> > do the same.
> >
> > Also, one has to somehow define the cost (I don't know how to do this).
> >
> > Is there a simple tutorial which demonstrates the basics of this? Like some
> > kind of simple implementation with ListTable etc.
> >
> > Thanks & Regards,
> > Pranav
>
>
