[ https://issues.apache.org/jira/browse/SPARK-47773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Jia updated SPARK-47773:
---------------------------
    Description: 
SPIP doc: 
https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing

This 
[SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
 outlines the integration of Gluten's physical plan conversion, validation, and 
fallback framework into Apache Spark. The goal is to enhance Spark's 
flexibility and robustness in executing physical plans and to leverage Gluten's 
performance optimizations. Currently, Spark lacks official cross-platform 
execution support for physical plans. Gluten's mechanism, which employs the 
Substrait standard, can convert and optimize Spark's physical plans, thus 
improving portability, interoperability, and execution efficiency.

The design proposal introduces the TransformSupport interface and its 
specialized variants: LeafTransformSupport, UnaryTransformSupport, and 
BinaryTransformSupport. These streamline the conversion of different operator 
types into a common Substrait-based format. The validation phase checks the 
Substrait plan against the native backends to ensure compatibility. Where 
validation fails, execution falls back to Spark's own operators, with 
transformations inserted to adapt data formats as required. The proposal treats 
plan transformation as the foundational step; validation and fallback are to be 
considered once that first phase is in place.
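
The convert, validate, and fall back flow described above can be sketched 
roughly as follows. This is only an illustration in Python, not the API 
proposed in the SPIP: the Rel class, the transform, validate, and choose_engine 
functions, the Scan, Filter, and Join operators, and the set of supported 
operator names are all assumptions made for this example. It also falls back 
for the whole plan at once, which is a simplification.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Rel:
    """Hypothetical stand-in for a Substrait relation (real plans are protobuf messages)."""
    op: str
    inputs: List["Rel"] = field(default_factory=list)


class TransformSupport(ABC):
    """An operator that can describe itself in the common format (assumed shape)."""

    op_name: str = "unknown"

    @abstractmethod
    def children(self) -> List["TransformSupport"]:
        ...

    def transform(self) -> Rel:
        # Convert the children first, then wrap them in this operator's relation.
        return Rel(self.op_name, [c.transform() for c in self.children()])


class LeafTransformSupport(TransformSupport):
    def children(self) -> List[TransformSupport]:
        return []


class UnaryTransformSupport(TransformSupport):
    def __init__(self, child: TransformSupport):
        self.child = child

    def children(self) -> List[TransformSupport]:
        return [self.child]


class BinaryTransformSupport(TransformSupport):
    def __init__(self, left: TransformSupport, right: TransformSupport):
        self.left, self.right = left, right

    def children(self) -> List[TransformSupport]:
        return [self.left, self.right]


# Illustrative operators; these names are made up for the sketch.
class Scan(LeafTransformSupport):
    op_name = "read"


class Filter(UnaryTransformSupport):
    op_name = "filter"


class Join(BinaryTransformSupport):
    op_name = "join"


def validate(rel: Rel, supported: Set[str]) -> bool:
    # A plan passes only if the backend supports every relation in the tree.
    return rel.op in supported and all(validate(i, supported) for i in rel.inputs)


def choose_engine(plan: TransformSupport, supported: Set[str]) -> str:
    # Simplification: fall back to Spark for the whole plan if validation fails.
    return "native" if validate(plan.transform(), supported) else "spark"


plan = Join(Filter(Scan()), Scan())
print(choose_engine(plan, {"read", "filter", "join"}))  # native
print(choose_engine(plan, {"read", "filter"}))          # spark
```

In this sketch the three variants differ only in how many children they 
expose, so the conversion logic lives once in the base class.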

The integration of Gluten into Spark has already shown significant performance 
improvements with ClickHouse and Velox backends and has been successfully 
deployed in production by several customers. 

  was:
This 
[SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
 outlines the integration of Gluten's physical plan conversion, validation, and 
fallback framework into Apache Spark. The goal is to enhance Spark's 
flexibility and robustness in executing physical plans and to leverage Gluten's 
performance optimizations. Currently, Spark lacks an official cross-platform 
execution support for physical plans. Gluten's mechanism, which employs the 
Substrait standard, can convert and optimize Spark's physical plans, thus 
improving portability, interoperability, and execution efficiency.

The design proposal advocates for the incorporation of the TransformSupport 
interface and its specialized variants—LeafTransformSupport, 
UnaryTransformSupport, and BinaryTransformSupport. These are instrumental in 
streamlining the conversion of different operator types into a Substrait-based 
common format. The validation phase entails a thorough assessment of the 
Substrait plan against native backends to ensure compatibility. In instances 
where validation does not succeed, Spark's native operators will be deployed, 
with requisite transformations to adapt data formats accordingly. The proposal 
emphasizes the centrality of the plan transformation phase, positing it as the 
foundational step. The subsequent validation and fallback procedures are slated 
for consideration upon the successful establishment of the initial phase.

The integration of Gluten into Spark has already shown significant performance 
improvements with ClickHouse and Velox backends and has been successfully 
deployed in production by several customers. 


> Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on 
> Various Native Engines
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47773
>                 URL: https://issues.apache.org/jira/browse/SPARK-47773
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Ke Jia
>            Priority: Major
>
> SPIP doc: 
> https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing
>
> This 
> [SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
>  outlines the integration of Gluten's physical plan conversion, validation, 
> and fallback framework into Apache Spark. The goal is to enhance Spark's 
> flexibility and robustness in executing physical plans and to leverage 
> Gluten's performance optimizations. Currently, Spark lacks official 
> cross-platform execution support for physical plans. Gluten's mechanism, 
> which employs the Substrait standard, can convert and optimize Spark's 
> physical plans, thus improving portability, interoperability, and execution 
> efficiency.
>
> The design proposal introduces the TransformSupport interface and its 
> specialized variants: LeafTransformSupport, UnaryTransformSupport, and 
> BinaryTransformSupport. These streamline the conversion of different 
> operator types into a common Substrait-based format. The validation phase 
> checks the Substrait plan against the native backends to ensure 
> compatibility. Where validation fails, execution falls back to Spark's own 
> operators, with transformations inserted to adapt data formats as required. 
> The proposal treats plan transformation as the foundational step; 
> validation and fallback are to be considered once that first phase is in 
> place.
>
> The integration of Gluten into Spark has already shown significant 
> performance improvements with ClickHouse and Velox backends and has been 
> successfully deployed in production by several customers. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
