Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20675
Many thanks for your detailed reply!
> The semantics aren't quite right. Task-level retry can happen a fixed
number of times for the lifetime of the task, which is the lifetime of the
query - even if it runs for days after, the attempt number will never be reset.
- I think the attempt number never being reset is not a problem, as long as
the task restarts with the right epoch and offset. Maybe I don't understand
what you mean by the semantics; could you please explain in more detail?
- As far as I'm concerned, when the degree of parallelism is large, a whole-stage
restart is too heavy an operation and will cause data churn.
- Also, a further thought: after CP supports shuffle and more complex
scenarios, task-level retry will need more work to ensure data correctness.
But it may still be a useful feature? I just want to leave this
patch here and start a discussion about it :)
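To make the first point concrete, here is a minimal sketch of the idea that a retried task can resume safely if it restarts from the last committed epoch and offset. All names here (`EpochStore`, `run_task`) are hypothetical and do not come from Spark's actual continuous-processing API; this is just the invariant I have in mind, not the implementation:

```python
# Hypothetical sketch: a retried task resumes from the last committed
# (epoch, offset) instead of restarting the whole stage.
# These names are NOT Spark APIs; they only illustrate the invariant.

class EpochStore:
    """Tracks the last committed (epoch, offset) per partition."""
    def __init__(self):
        self.committed = {}  # partition -> (epoch, offset)

    def commit(self, partition, epoch, offset):
        self.committed[partition] = (epoch, offset)

    def last(self, partition):
        # A fresh task starts at epoch 0, offset 0.
        return self.committed.get(partition, (0, 0))


def run_task(store, partition, data, fail_at=None):
    """Process records from the committed offset, committing one epoch
    per record; may fail mid-stream to simulate a task failure."""
    epoch, offset = store.last(partition)
    processed = []
    for i, record in enumerate(data[offset:], start=offset):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated task failure")
        processed.append(record)
        epoch += 1
        store.commit(partition, epoch, i + 1)
    return processed


store = EpochStore()
data = list(range(5))
try:
    run_task(store, partition=0, data=data, fail_at=3)
except RuntimeError:
    pass
# Task-level retry: only the failed task re-runs, and because it resumes
# from the committed offset, no record is reprocessed or skipped.
retried = run_task(store, partition=0, data=data)
```

In this toy model the retried task processes only the uncommitted tail of the partition, which is why I think a never-resetting attempt number is harmless by itself: correctness hinges on the resume point, not the attempt count.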