Interesting question. Someone told me that Spark, which started around 2012, wasn't designed with SQL query support in mind; that was only introduced around 2014. It probably ran only Python-based jobs at first, so Catalyst was enough then. That makes sense to me, but I can't confirm it.

On Mon, Jan 13, 2020 at 4:30 PM Michael Mior <[email protected]> wrote:
> This discussion on the Spark mailing list may be interesting to follow :)
>
> --
> Michael Mior
> [email protected]
>
>
> ---------- Forwarded message ---------
> From: newroyker <[email protected]>
> Date: Mon, Jan 13, 2020 at 09:25
> Subject: Why Apache Spark doesn't use Calcite?
> To: <[email protected]>
>
>
> Was there a qualitative or quantitative benchmark done before a design
> decision was made not to use Calcite?
>
> Are there limitations (for heuristic based, cost based, * aware optimizer)
> in Calcite, and frameworks built on top of Calcite? In the context of big
> data / TPC-H benchmarks.
>
> I was unable to dig up anything concrete from user group / Jira. Appreciate
> if any Catalyst veteran here can give me pointers. Trying to defend
> Spark/Catalyst.
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
