If your SQL is simple, you can refer to Kylin[1].

I like RelNode to Spark SQL because I had trouble with "RelNode => Spark
Logical Plan":
    1. Coupling to Spark versions makes upgrades difficult.
        I had to spend a lot of time upgrading from Spark 2.3 to 3.0, and
again from Spark 3.0 to 3.2.
    2. Changes in metadata can cause problems.
        After converting to RelNode, users may change metadata (a table or
column). If you then use the old RelNode to generate a Spark Logical
Plan, the plan may fail because Spark reads the new metadata.
    3. SQL functions require special handling.
        When you write SQL, you can use any function, because the query is
just a string. But when you build a Spark Logical Plan, you must use
the correct Spark expression.
    More issues...
Coral[2] is a project that converts RelNode to SQL.
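For reference, the "RelNode => SQL string => Spark SQL" route can be
sketched with Calcite's RelToSqlConverter and its Spark dialect. This is
only a minimal sketch, not production code; the class and method names
here (RelNodeToSparkSql, toSparkSql) are hypothetical, and it assumes you
already have a RelNode built elsewhere:

```java
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rel2sql.RelToSqlConverter;
import org.apache.calcite.sql.SqlDialect;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.dialect.SparkSqlDialect;

public final class RelNodeToSparkSql {
  private RelNodeToSparkSql() {}

  /**
   * Renders a RelNode as a SQL string in Spark's dialect. The result can
   * then be handed to spark.sql(...), so Spark re-parses and re-resolves
   * the statement against its current metadata instead of reusing a
   * possibly stale logical plan.
   */
  public static String toSparkSql(RelNode relNode) {
    SqlDialect dialect = SparkSqlDialect.DEFAULT;
    RelToSqlConverter converter = new RelToSqlConverter(dialect);
    // visitRoot walks the RelNode tree into a SqlNode AST;
    // asStatement() wraps it as a complete statement.
    SqlNode sqlNode = converter.visitRoot(relNode).asStatement();
    return sqlNode.toSqlString(dialect).getSql();
  }
}
```

Because the output is a plain SQL string, the code only depends on
Calcite, not on Spark's internal plan classes, which is what avoids the
version-coupling problem in point 1 above.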

[1]
https://github.com/apache/kylin/tree/0e0d8a125ea81f735e66e38f32ef722622d77dc0/kylin-spark-project/kylin-spark-query/src/main/scala/org/apache/kylin/query/runtime/plans
[2] https://github.com/linkedin/coral

On Fri, 13 Oct 2023 at 10:27, LakeShen <[email protected]> wrote:

> Maybe you could look at Hive. As far as I know, Hive uses Calcite, but
> eventually the plan is converted to Hive's own execution plan, and Spark
> and Hive are also quite compatible with each other.
> I haven't found any project that converts a Calcite Logical Plan to a
> Spark Logical Plan directly, but you could do this yourself. However,
> there are many differences between Calcite and Spark that need to be
> considered, such as differences in functions, operators of relational
> algebra trees, differences in data types, etc.
> I think Calcite => Substrait => Spark is a better way to do this; you
> could leverage the power of the Calcite community and the Substrait
> community.
>
> Best,
> LakeShen
>
>
> Julian Hyde <[email protected]> 于2023年10月13日周五 08:20写道:
>
> > There aren’t any plans, but I like the idea of transforming between the
> > intermediate languages (Calcite algebra, Spark algebra, SQL of any
> > dialect, Substrait).
> >
> > In particular I was thinking of translating Calcite algebra ->
> > Substrait, and Substrait -> any supported SQL dialect. The former would
> > be done in Calcite; the latter could be done in a sister project of
> > Substrait.
> >
> > Julian
> >
> >
> > > On Oct 12, 2023, at 9:07 AM, Guillaume Masse <
> > [email protected]> wrote:
> > >
> > > Hi All,
> > >
> > > We use Apache Calcite to transform SQL, then we want to run the
> > > logical plan on our Spark cluster. Currently we use RelToSqlConverter
> > > and execute the result with Spark SQL. I was wondering if we could
> > > just execute from a Spark logical plan (
> > > https://github.com/apache/spark/blob/b0576fff9b72880cd81a9d22c044dec329bc67d0/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L211-L213
> > > ).
> > >
> > > Are there any projects that go from a Calcite Logical Plan to a Spark
> > > Logical Plan directly?
> > >
> > > I know there is one possibility: Calcite => Substrait => Spark
> > > Substrait => Calcite via substrait-java
> > > <https://github.com/substrait-io/substrait-java>
> > > Substrait => Spark via gluten
> > > <https://github.com/oap-project/gluten/tree/main/substrait/substrait-spark>
> > >
> > >
> > > --
> > > Guillaume Massé
> > > [Gee-OHM]
> > > (马赛卫)
> >
> >
>