calcite sql applied on spark dataframes

Ben Teeuwen Tue, 29 Jan 2019 08:56:38 -0800

Hi all,

I'm interested in trying out Calcite, with the goal of applying the same
SQL statement in two different setups. The first is an offline batch
setting on a Spark dataframe, for example basic additions or subtractions
across multiple columns, or datediff operations on two timestamp columns.
Since this is Spark, the SQL statement could be applied to millions of
rows in parallel. The second is an online setting, with Java services
processing individual requests. Spark has its own powerful SQL engine, but
the goal here would be to use Calcite instead and rule out
incompatibilities between the Spark SQL engine and the Calcite engine used
in Java land.
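
To illustrate the kind of incompatibility I have in mind, here is a minimal
sketch in plain Java. It calls neither Spark nor Calcite, and the class and
method names are hypothetical. It only shows that a "day difference" between
two timestamps has at least two plausible meanings, calendar days or
completed 24-hour periods, which disagree around midnight. Which of these
each engine actually implements is an assumption that would need checking.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, for illustration only: two ways to define
// "difference in days" between two timestamps.
final class DateDiffParity {

    // Calendar-day semantics: drop the time of day, then count date steps.
    static long calendarDayDiff(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.DAYS.between(start.toLocalDate(), end.toLocalDate());
    }

    // Elapsed-time semantics: count only completed 24-hour periods.
    static long elapsedDayDiff(LocalDateTime start, LocalDateTime end) {
        return Duration.between(start, end).toDays();
    }

    public static void main(String[] args) {
        // Two hours apart, but on either side of midnight.
        LocalDateTime start = LocalDateTime.of(2019, 1, 28, 23, 0);
        LocalDateTime end = LocalDateTime.of(2019, 1, 29, 1, 0);

        System.out.println(calendarDayDiff(start, end)); // prints 1
        System.out.println(elapsedDayDiff(start, end));  // prints 0
    }
}
```

If the batch side and the online side picked different definitions, the same
SQL text would silently give different answers for rows like this one, which
is what a single shared engine would avoid.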


Does anyone have experience with such an approach? I searched the mailing
list archive for messages about Spark but didn't find any on this topic.

Ben
