[GitHub] [beam] damccorm opened a new issue, #19989: Optimize translation when Schema information is available in Spark Structured Streaming runner

GitBox Sat, 04 Jun 2022 07:41:44 -0700


damccorm opened a new issue, #19989:
URL: https://github.com/apache/beam/issues/19989


   The Spark Structured Streaming runner supports Datasets that already carry Schema 
information. Spark uses this information to optimize jobs (via Catalyst). This issue 
is to implement optimized translations of the transforms for the runner so that we 
can benefit from the performance improvements Spark makes internally.
   
   Note that we may also need to map Beam's core internal representations, 
such as WindowedValue, so that intermediate optimizations are possible.
   
   Imported from Jira 
[BEAM-9451](https://issues.apache.org/jira/browse/BEAM-9451). Original Jira may 
contain additional context.
   Reported by: iemejia.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
