[ 
https://issues.apache.org/jira/browse/SPARK-47773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated SPARK-47773:
---------------------------
    Description: This 
[SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
 outlines the integration of Gluten's physical plan conversion, validation, and 
fallback framework into Apache Spark. The goal is to enhance Spark's 
flexibility and robustness in executing physical plans and to leverage Gluten's 
performance optimizations. Currently, Spark lacks official cross-platform 
execution support for physical plans. Gluten's mechanism, which employs the 
Substrait standard, can convert and optimize Spark's physical plans, thus 
improving portability, interoperability, and execution efficiency. The design 
proposal advocates for the incorporation of the TransformSupport interface and 
its specialized variants—LeafTransformSupport, UnaryTransformSupport, and 
BinaryTransformSupport. These are instrumental in streamlining the conversion 
of different operator types into a Substrait-based common format. The 
validation phase entails a thorough assessment of the Substrait plan against 
native backends to ensure compatibility. In instances where validation does not 
succeed, Spark's native operators will be deployed, with requisite 
transformations to adapt data formats accordingly. The proposal treats the 
plan transformation phase as the foundational step. The subsequent validation 
and fallback procedures will be considered once that initial phase is 
established. The 
integration of Gluten into Spark has already shown significant performance 
improvements with ClickHouse and Velox backends and has been successfully 
deployed in production by several customers.

> Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on 
> Various Native Engines
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47773
>                 URL: https://issues.apache.org/jira/browse/SPARK-47773
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Ke Jia
>            Priority: Major
>
> This 
> [SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
>  outlines the integration of Gluten's physical plan conversion, validation, 
> and fallback framework into Apache Spark. The goal is to enhance Spark's 
> flexibility and robustness in executing physical plans and to leverage 
> Gluten's performance optimizations. Currently, Spark lacks official 
> cross-platform execution support for physical plans. Gluten's mechanism, 
> which employs the Substrait standard, can convert and optimize Spark's 
> physical plans, thus improving portability, interoperability, and execution 
> efficiency. The design proposal advocates for the incorporation of the 
> TransformSupport interface and its specialized variants—LeafTransformSupport, 
> UnaryTransformSupport, and BinaryTransformSupport. These are instrumental in 
> streamlining the conversion of different operator types into a 
> Substrait-based common format. The validation phase entails a thorough 
> assessment of the Substrait plan against native backends to ensure 
> compatibility. In instances where validation does not succeed, Spark's native 
> operators will be deployed, with requisite transformations to adapt data 
> formats accordingly. The proposal treats the plan transformation phase as 
> the foundational step. The subsequent validation and fallback procedures 
> will be considered once that initial phase is established. The integration 
> of Gluten 
> into Spark has already shown significant performance improvements with 
> ClickHouse and Velox backends and has been successfully deployed in 
> production by several customers. 
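
For readers who want a concrete picture of the conversion, validation, and
fallback flow described above, here is a minimal sketch in Python. It is not
Gluten's or Spark's actual API. Only the names TransformSupport,
LeafTransformSupport, UnaryTransformSupport, and BinaryTransformSupport come
from the description. Everything else (Rel, Scan, Filter, Join, validate,
choose_engine, and the "supported" set) is hypothetical and exists only for
illustration. Rel stands in for a Substrait relation, and fallback is decided
for the whole plan at once, which is simpler than the per-operator fallback
with data format transformations that the SPIP describes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Rel:
    """Hypothetical stand-in for a Substrait relation node."""
    name: str
    inputs: List["Rel"] = field(default_factory=list)


class TransformSupport(ABC):
    """An operator that can describe itself in the common plan format."""

    @abstractmethod
    def children(self) -> List["TransformSupport"]:
        ...

    @abstractmethod
    def node_name(self) -> str:
        ...

    def transform(self) -> Rel:
        # Convert this operator and, recursively, all of its children.
        return Rel(self.node_name(), [c.transform() for c in self.children()])


class LeafTransformSupport(TransformSupport):
    def children(self) -> List[TransformSupport]:
        return []


class UnaryTransformSupport(TransformSupport):
    def __init__(self, child: TransformSupport):
        self.child = child

    def children(self) -> List[TransformSupport]:
        return [self.child]


class BinaryTransformSupport(TransformSupport):
    def __init__(self, left: TransformSupport, right: TransformSupport):
        self.left, self.right = left, right

    def children(self) -> List[TransformSupport]:
        return [self.left, self.right]


# Hypothetical concrete operators, one per variant.
class Scan(LeafTransformSupport):
    def node_name(self) -> str:
        return "read"


class Filter(UnaryTransformSupport):
    def node_name(self) -> str:
        return "filter"


class Join(BinaryTransformSupport):
    def node_name(self) -> str:
        return "join"


def validate(rel: Rel, supported: Set[str]) -> bool:
    # A plan is valid only if the backend supports every node in it.
    return rel.name in supported and all(validate(i, supported) for i in rel.inputs)


def choose_engine(op: TransformSupport, supported: Set[str]) -> str:
    # Whole-plan fallback: run natively if validation passes, else stay on Spark.
    return "native" if validate(op.transform(), supported) else "spark"
```

In this sketch, a plan built as Join(Filter(Scan()), Scan()) converts to a
"join" node with "filter" and "read" inputs. choose_engine returns "native"
when the hypothetical backend's supported set contains all three node names
and "spark" when any of them is missing.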



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
