GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21825#discussion_r206546611
  
    --- Diff: docs/configuration.md ---
    @@ -1215,6 +1215,14 @@ Apart from these, the following properties are also available, and may be useful
         if it is too small, <code>BlockManager</code> might take a performance hit.
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.broadcast.checksum</code></td>
    +  <td>true</td>
    +  <td>
    +    Whether to enable checksum for broadcast.If it is enabled (default), the broadcast will be more reliable.
    --- End diff --
    
    Nits: there should be a space after the period, and the default is already 
documented in the column above, so "(default)" is redundant. I think this could 
still be more useful. What about: "If enabled, broadcasts will include a 
checksum, which can help detect corrupted blocks, at the cost of computing and 
sending a little more data. It's possible to disable it if the network has 
other mechanisms to guarantee data won't be corrupted during broadcast."
    
    CC @davies. Even I'm not sure when I would disable this. What would a 
network have to guarantee to rule out whatever corruption is possible here? It 
isn't clear yet when disabling it is safe, that is, when it won't lead to 
correctness issues.
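
To make the trade-off in the suggested wording concrete, here is a minimal sketch of how a per-block checksum can catch corruption. It is not Spark's implementation: the function names are hypothetical, and Adler-32 is an assumption chosen only because it is in the Python standard library; the diff does not say which checksum the broadcast code uses.

```python
import zlib


def block_checksum(block: bytes) -> int:
    """Compute a checksum for one block (Adler-32 is an illustrative choice)."""
    return zlib.adler32(block)


def verify(block: bytes, expected: int) -> bool:
    """Return True if the received block still matches the sender's checksum."""
    return block_checksum(block) == expected


# Sender side: the checksum is the "little more data" sent with each block.
payload = b"broadcast block payload"
checksum = block_checksum(payload)

# Simulate a single bit flipped somewhere between sender and receiver.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]

intact_ok = verify(payload, checksum)       # unmodified block passes
corrupted_ok = verify(corrupted, checksum)  # flipped bit is detected
```

The cost is one pass over each block on both sides plus a few extra bytes per block. The open question above is what a network would have to guarantee before skipping that check is safe.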

