Hello Everyone,

   We are working on revamping the query engine for one of our internal
database & we are planning to use Calcite.
I am new to Calcite & its concepts. Calcite is pretty cool & fits our needs
so well.

This is a partitioned database & we need a distributed query execution
capability. I went through some pointers here[1].
I am still not sure on how to proceed with this.

Scenario:
1. I would like to have a table say FOO_TABLE. This will be top level table.
2. From our Metadatastore, We know that the partition for FOO_TABLE are
    FOO_PART1, FOO_PART2 & FOO_PART3.
3. SQL_QUERY: select C1, SUM(C2), AVG(C3) from FOO_TABLE GROUP BY C1.

Questions:
1. How do I convert SQL_QUERY to the individual sub queries ?
2. Sub Queries for each partition gets executed on different nodes(we have
partition-> node affinity information)
2. How do I combine them back ?
3. Would you have any pointers on how to introduce the parallelism needed ?

Appreciate your help in this regard. Thanks!

~Ravi

[1]
https://mail-archives.apache.org/mod_mbox/calcite-dev/201603.mbox/%[email protected]%3E

Reply via email to