[
https://issues.apache.org/jira/browse/BEAM-12135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ismaël Mejía updated BEAM-12135:
--------------------------------
Description: The Spark Runner, and possibly every other runner that handles
batch-only data, might benefit from a batch-optimized translation in which the
details of the full Beam model matter less: everything is in the Global window,
no pane info is needed, and all records use the same (min) timestamp. On this
premise, records can be encoded as 'value only' WindowedValues, and transforms
like GroupByKey may ignore windowing (GABW) to improve performance. (was: Spark
Runner and maybe all other runners that deal with batch only data might benefit
of a batch optimized translation where details about the full Beam model matter
less because we are in Global window, no panes info is needed and the we are in
the min timestamp. With this premise the records can be encoded as 'value only'
WindowValues and transforms like GroupByKey may ignore windowing (GABW) to
improve performance.)
> Batch optimized translation for Spark Runner
> --------------------------------------------
>
> Key: BEAM-12135
> URL: https://issues.apache.org/jira/browse/BEAM-12135
> Project: Beam
> Issue Type: Improvement
> Components: runner-spark
> Reporter: Ismaël Mejía
> Priority: P2
>
> The Spark Runner, and possibly every other runner that handles batch-only
> data, might benefit from a batch-optimized translation in which the details
> of the full Beam model matter less: everything is in the Global window, no
> pane info is needed, and all records use the same (min) timestamp. On this
> premise, records can be encoded as 'value only' WindowedValues, and
> transforms like GroupByKey may ignore windowing (GABW) to improve performance.
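
As a rough illustration of the idea (not the Spark Runner's actual code), the
sketch below compares a full encoding of a windowed record with a 'value only'
encoding, and groups by key without looking at windows. Everything in it is
hypothetical: Record, encode_full, encode_value_only, decode_value_only,
group_by_key_batch, and the placeholder constants are made up for the example
and are not Beam APIs or Beam's real constant values.

```python
import struct
from typing import NamedTuple

# Placeholder defaults (assumptions for this sketch, not Beam's real values).
MIN_TIMESTAMP = -(2 ** 63)
GLOBAL_WINDOW = b"global"
NO_PANE = 0


class Record(NamedTuple):
    """A value plus the windowing metadata that is constant in batch mode."""
    value: bytes
    timestamp: int = MIN_TIMESTAMP
    window: bytes = GLOBAL_WINDOW
    pane: int = NO_PANE


def encode_full(r: Record) -> bytes:
    # Layout: timestamp (8 bytes), window length (1 byte), window bytes,
    # pane (1 byte), then the value itself.
    header = struct.pack(">qB", r.timestamp, len(r.window))
    return header + r.window + struct.pack(">B", r.pane) + r.value


def encode_value_only(r: Record) -> bytes:
    # In batch mode the metadata is identical for every record, so only the
    # value needs to be written.
    return r.value


def decode_value_only(data: bytes) -> Record:
    # The constant metadata is restored from the defaults on decode.
    return Record(value=data)


def group_by_key_batch(pairs):
    # With a single Global window, grouping by key alone is sufficient;
    # no per-window regrouping is performed.
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups
```

In this toy layout the per-record saving is the fixed metadata header; the real
gain in a runner would depend on its coders and shuffle implementation.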