GitHub user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20675
> it just means that for very long-running streams task restarts will
> eventually run out.
Ah, I see what you mean. Yeah, if we support task-level retry, we should also make the task retry count unlimited.
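(A minimal sketch of one way to do that with the existing `spark.task.maxFailures` config, raised to an effectively unlimited value; the app name is just a placeholder:)

```scala
import org.apache.spark.sql.SparkSession

// `spark.task.maxFailures` caps how many times a single task may fail
// before the whole job is aborted. A very large value approximates
// "unlimited" retries for a long-running continuous query.
val spark = SparkSession.builder()
  .appName("continuous-retry-sketch") // placeholder name
  .config("spark.task.maxFailures", Int.MaxValue.toString)
  .getOrCreate()
```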
> But if you're worried that the current implementation of task restart
> will become incorrect as more complex scenarios are supported, I'd definitely
> lean towards deferring it until continuous processing is more feature-complete.
Yep, the "complex scenarios" I mentioned mainly cover shuffle and aggregation, as discussed in the comments at
https://issues.apache.org/jira/browse/SPARK-20928?focusedCommentId=16245556&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16245556.
In those scenarios task-level retry may need to take epoch alignment into account, but I think the current implementation of task restart is complete for map-only continuous processing. (A toy sketch of the alignment concern follows below.)
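(To make that concern concrete, here's a toy Scala sketch, not actual Spark code, of a downstream aggregating operator that would double-count records replayed by a restarted map task unless it checks epochs:)

```scala
// Toy model: records are tagged with the epoch they belong to.
case class Record(epoch: Long, value: Int)

class ToyAggregator {
  private var currentEpoch = 0L
  private var sum = 0

  def receive(r: Record): Unit = {
    if (r.epoch < currentEpoch) {
      // A restarted upstream task may replay records from an epoch this
      // operator has already committed; without an epoch check they would
      // be double-counted. Here we simply drop them.
      println(s"dropping replayed record from epoch ${r.epoch}")
    } else {
      sum += r.value
    }
  }

  def commitEpoch(): Unit = {
    println(s"epoch $currentEpoch committed with sum $sum")
    currentEpoch += 1
    sum = 0
  }
}
```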
I agree with you about deferring it, so I'll just leave a comment on SPARK-23033 and close this PR, unless you think it should be reviewed by others first?
> Do you want to spin that off into a separate PR? (I can handle it
> otherwise.)
Of course; #20689 adds the new `ContinuousDataReaderFactory` interface, as we discussed in the comments.
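(For anyone following along, a rough paraphrase in Scala of the shape of that interface; the stand-in types below are only there to keep the sketch self-contained, so see #20689 for the real definitions:)

```scala
// Toy stand-ins for the DataSourceV2 reader types, just to keep this sketch
// self-contained; the real ones live under org.apache.spark.sql.sources.v2.
trait DataReader[T] { def next(): Boolean; def get(): T; def close(): Unit }
trait DataReaderFactory[T] { def createDataReader(): DataReader[T] }
trait PartitionOffset

// Rough paraphrase of the new interface: a factory that can recreate a
// reader positioned at a given offset, so a restarted task resumes from
// where the failed attempt left off rather than from the start of the epoch.
trait ContinuousDataReaderFactory[T] extends DataReaderFactory[T] {
  def createDataReaderWithOffset(offset: PartitionOffset): DataReader[T]
}
```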