[
https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470067#comment-16470067
]
Li Yuanjian commented on SPARK-24036:
-------------------------------------
I agree with the division of tasks into kinds; that part is quite clear. But
maybe all of this could be made as transparent as possible to the scheduler by
reusing the ResultTask and ShuffleMapTask design. Could the DAGScheduler use
ContinuousShuffleMapTask in place of the original ShuffleMapTask?
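To make the question concrete, here is a minimal toy sketch of the idea. It is
written in Python and is not Spark code: the classes are simplified stand-ins
named after the tasks mentioned above, and create_tasks and its parameters are
hypothetical. It only illustrates how a continuous variant could be swapped in
at task-creation time while the rest of the scheduler sees an ordinary
ShuffleMapTask.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Toy stand-in for a scheduler task; not Spark's actual Task class."""
    stage_id: int
    partition: int


class ShuffleMapTask(Task):
    """Batch-style map task: runs once and writes its shuffle output."""

    def run(self) -> str:
        return f"stage {self.stage_id}/{self.partition}: write shuffle output once"


class ContinuousShuffleMapTask(ShuffleMapTask):
    """Hypothetical long-running variant that keeps streaming records."""

    def run(self) -> str:
        return f"stage {self.stage_id}/{self.partition}: stream records continuously"


class ResultTask(Task):
    """Final-stage task that produces the job's result."""

    def run(self) -> str:
        return f"stage {self.stage_id}/{self.partition}: compute result"


def create_tasks(stage_id, num_partitions, is_shuffle_stage, continuous):
    """Toy version of the task-creation step (assumed, not Spark's logic).

    The only continuous-specific decision is which class to instantiate;
    callers can keep treating the tasks as ShuffleMapTask or ResultTask.
    """
    if not is_shuffle_stage:
        task_cls = ResultTask
    elif continuous:
        task_cls = ContinuousShuffleMapTask
    else:
        task_cls = ShuffleMapTask
    return [task_cls(stage_id, p) for p in range(num_partitions)]
```

Because ContinuousShuffleMapTask subclasses ShuffleMapTask in this sketch, any
code that checks for a ShuffleMapTask would accept it unchanged, which is the
kind of transparency being asked about. Whether the real DAGScheduler can be
kept this simple is exactly the open question.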
{quote}Changing DAGScheduler to accommodate continuous processing would create
significant additional complexity I don't think we can really justify.
{quote}
So, in my opinion, this may not be as complex as we think. If I'm wrong,
please let me know. :)
{quote}Whether we need to write an explicit shuffle RDD class or not would I
think come down to an implementation detail of SPARK-24236. It depends on
what's the cleanest way to unfold the SparkPlan tree.
{quote}
Yep, I couldn't agree more. I'll clean up this part of our internal code and
submit a preview PR. We would very much appreciate any feedback!
> Stateful operators in continuous processing
> -------------------------------------------
>
> Key: SPARK-24036
> URL: https://issues.apache.org/jira/browse/SPARK-24036
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 2.4.0
> Reporter: Jose Torres
> Priority: Major
>
> The first iteration of continuous processing in Spark 2.3 does not work with
> stateful operators.