Hello,

My name is George and I am an undergraduate computer science student. I am
doing some research for my diploma thesis about query optimization on
distributed systems. After reading some basics about the Calcite project, I
thought I could use it as an SQL optimizer on top of Spark.
I have a Hadoop cluster running on multiple machines, and I run SQL queries
with Spark SQL on data saved in a data warehouse (Hive). My goal is to
optimize certain queries by pushing rules and functions down to the nodes
with a framework like Calcite. However, I haven't found any related
documentation, and I am not sure if it is even possible to access the
Hive metadata through Calcite and run the optimized queries on Spark. Can
you help me?

Thank you in advance.
