[jira] [Comment Edited] (SPARK-47773) Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on Various Native Engines
[ https://issues.apache.org/jira/browse/SPARK-47773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839497#comment-17839497 ]

Ke Jia edited comment on SPARK-47773 at 4/22/24 6:22 AM:
---------------------------------------------------------

We have refined the above SPIP in accordance with the specifications from the Spark community. The latest version of the SPIP is now available [here|https://docs.google.com/document/d/1oY26KtqXoJJNHbAhtmVgaXSVt6NlO6t1iWYEuGvCc1s/edit?usp=sharing]. We welcome and value your suggestions and comments.

was (Author: jk_self):
We have refined the above [SPIP |[https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]]in accordance with the specifications from the Spark community. The latest version of the SPIP is now available [here|https://docs.google.com/document/d/1oY26KtqXoJJNHbAhtmVgaXSVt6NlO6t1iWYEuGvCc1s/edit?usp=sharing]. We welcome and value your suggestions and comments.

> Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on Various Native Engines
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47773
>                 URL: https://issues.apache.org/jira/browse/SPARK-47773
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Ke Jia
>            Priority: Major
>
> SPIP doc:
> https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing
>
> This
> [SPIP|https://docs.google.com/document/d/1v7sndtIHIBdzc4YvLPI8InXxhI7SnnAQ5HvmM2DGjVE/edit?usp=sharing]
> outlines the integration of Gluten's physical plan conversion, validation,
> and fallback framework into Apache Spark. The goal is to make Spark more
> flexible and robust in executing physical plans and to leverage Gluten's
> performance optimizations. Currently, Spark lacks official cross-platform
> execution support for physical plans. Gluten's mechanism, which employs the
> Substrait standard, can convert and optimize Spark's physical plans, thus
> improving portability, interoperability, and execution efficiency.
>
> The design proposal advocates incorporating the TransformSupport interface
> and its specialized variants: LeafTransformSupport, UnaryTransformSupport,
> and BinaryTransformSupport. These streamline the conversion of different
> operator types into a common Substrait-based format. The validation phase
> checks the Substrait plan against the native backends to ensure
> compatibility. When validation fails, Spark's own operators are used
> instead, with the transformations required to adapt data formats. The
> proposal treats plan transformation as the foundational step; the
> validation and fallback procedures will be considered once that initial
> phase has been established.
>
> The integration of Gluten into Spark has already shown significant
> performance improvements with the ClickHouse and Velox backends and has been
> successfully deployed in production by several customers.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
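
The transform, validate, and fall-back flow described in the quoted issue can be illustrated with a small sketch. It is written in Python only for brevity and is not the SPIP's or Gluten's actual API: the four class names come from the issue text, while `do_transform`, `rel_name`, `validate`, `choose_engine`, the example operators, and the plain-dict plan format are all hypothetical stand-ins for a real Substrait plan. It also falls back for the whole plan at once, which is a simplification.

```python
from abc import ABC, abstractmethod


class TransformSupport(ABC):
    """An operator that can emit itself in a common, Substrait-like form."""

    rel_name = "unknown"  # hypothetical relation label, set by subclasses

    @abstractmethod
    def children(self):
        """Return the child operators, in order."""

    def do_transform(self):
        # Recursively convert this operator and its inputs to a plain dict
        # (a stand-in for a real Substrait plan).
        return {
            "rel": self.rel_name,
            "inputs": [child.do_transform() for child in self.children()],
        }


class LeafTransformSupport(TransformSupport):
    def children(self):
        return []


class UnaryTransformSupport(TransformSupport):
    def __init__(self, child):
        self.child = child

    def children(self):
        return [self.child]


class BinaryTransformSupport(TransformSupport):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def children(self):
        return [self.left, self.right]


# Hypothetical example operators, one per arity.
class Scan(LeafTransformSupport):
    rel_name = "read"


class Filter(UnaryTransformSupport):
    rel_name = "filter"


class Join(BinaryTransformSupport):
    rel_name = "join"


def collect_rels(plan):
    """List the relation labels of a converted plan in pre-order."""
    rels = [plan["rel"]]
    for child in plan["inputs"]:
        rels.extend(collect_rels(child))
    return rels


def validate(plan, supported_rels):
    """Return the labels the backend cannot run; an empty list means valid."""
    return [rel for rel in collect_rels(plan) if rel not in supported_rels]


def choose_engine(op, supported_rels):
    """Pick 'native' when validation passes, otherwise fall back to 'spark'."""
    return "native" if not validate(op.do_transform(), supported_rels) else "spark"


plan = Join(Filter(Scan()), Scan())
print(choose_engine(plan, {"read", "filter", "join"}))
print(choose_engine(plan, {"read", "filter"}))
```

Splitting the interface by arity keeps each operator's conversion code limited to its own relation, while the base class handles the recursion over inputs. A backend that lacks the join relation fails validation here, so the sketch selects Spark's own execution path for that plan.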