Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2524#discussion_r20911158
  
    --- Diff: docs/programming-guide.md ---
    @@ -1228,6 +1228,11 @@ interface to accumulate data where the resulting type is not the same as the ele
     a list by collecting together elements), and the `SparkContext.accumulableCollection` method for accumulating common Scala collection types.
     
    +<b>Only when the accumulator operation is executed within an 
    +action</b>, Spark guarantees that the operation will only be applied when the task is successfully finished for 
    +the first time, i.e. the restarted task will not update the value. In transformations, users should be aware of that 
    +the accumulator value would be updated as long as the task is executed.
    --- End diff ---
    
    Thanks for adding this, but it's better to tweak the wording a bit, like this:
    ```
    For accumulator updates performed inside <b>actions only</b>, Spark guarantees that each task's update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware that each task's update may be applied more than once if tasks or job stages are re-executed.
    ```
    
    Also, move this paragraph below the language-specific `<div>`s in the text. Right now it's only going to show up in the Scala version of the docs. Note that there are `<div>`s below this for Java and Python.
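    For anyone following this thread, here is a small illustrative sketch of the difference (assuming a `SparkContext` named `sc`, as elsewhere in the guide; the variable names are just placeholders, and this snippet is not part of the PR):
    ```scala
    // Accumulator updated inside an action (foreach): Spark guarantees each
    // task's update is applied only once, even if the task is restarted.
    val sumAcc = sc.accumulator(0)
    sc.parallelize(1 to 100).foreach(x => sumAcc += x)
    println(sumAcc.value)  // 5050, regardless of task retries

    // Accumulator updated inside a transformation (map): the update may be
    // applied more than once if tasks or stages are re-executed. Here the
    // uncached RDD is simply recomputed by the second action.
    val countAcc = sc.accumulator(0)
    val mapped = sc.parallelize(1 to 100).map { x => countAcc += 1; x }
    mapped.count()
    mapped.count()
    println(countAcc.value)  // typically 200, not 100: the map ran twice
    ```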

