[ 
https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470067#comment-16470067
 ] 

Li Yuanjian commented on SPARK-24036:
-------------------------------------

I agree with the division of tasks into these kinds; that's quite clear. But 
maybe all of this could be made as transparent as possible to the scheduler by 
reusing the ResultTask and ShuffleMapTask design. Could the DAGScheduler use a 
ContinuousShuffleMapTask in place of the original ShuffleMapTask?
{quote}Changing DAGScheduler to accommodate continuous processing would create 
significant additional complexity I don't think we can really justify.
{quote}
So, in my opinion, this may not be as complex as we think. If I'm wrong, please 
let me know. :)
{quote}Whether we need to write an explicit shuffle RDD class or not would I 
think come down to an implementation detail of SPARK-24236. It depends on 
what's the cleanest way to unfold the SparkPlan tree.
{quote}
Yep, I couldn't agree more. I'll tidy up this part of our internal code and 
post a preview PR. We would very much appreciate any feedback!

> Stateful operators in continuous processing
> -------------------------------------------
>
>                 Key: SPARK-24036
>                 URL: https://issues.apache.org/jira/browse/SPARK-24036
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Jose Torres
>            Priority: Major
>
> The first iteration of continuous processing in Spark 2.3 does not work with 
> stateful operators.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
