Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/6397#issuecomment-106606801
  
    In case reviewers missed this in the updated PR description, here's a 
better summary of the new changes:
    
    This patch also makes several improvements to shuffle-related tests and 
adds more defensive checks to certain shuffle classes:
    
    - DiskBlockObjectWriter now throws an exception if `fileSegment()` is 
called before `commitAndClose()` has been called.
    - DiskBlockObjectWriter's close methods are now idempotent, so calling any 
of the close methods twice in a row no longer corrupts the shuffle write 
metrics.  Calling `revertPartialWritesAndClose()` on a closed 
DiskBlockObjectWriter now has no effect (previously, it could skew the metrics).
    - The end-to-end shuffle record count metrics tests have been moved from 
InputOutputMetricsSuite to ShuffleSuite.  This means that these tests will now 
be run against all shuffle implementations rather than just the default shuffle 
configuration.
    - The end-to-end metrics tests now include a test of a job which performs 
aggregation in the shuffle.
    - Our tests now check that `shuffleBytesWritten == totalShuffleBytesRead`.
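
To illustrate the first two points, here is a minimal sketch of the defensive checks, assuming a heavily simplified writer. `SketchWriter`, its fields, and `fileSegmentLength()` are hypothetical names invented for this example; this is not Spark's actual DiskBlockObjectWriter code, only an outline of the behavior described above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for a shuffle block writer.
// It only tracks byte counts; no real file I/O is performed.
final class SketchWriter {
    private final AtomicLong bytesWrittenMetric; // shared shuffle write metric
    private long pendingBytes = 0;               // bytes written by this writer so far
    private long committedBytes = 0;             // bytes made durable by commitAndClose()
    private boolean closed = false;
    private boolean committed = false;

    SketchWriter(AtomicLong bytesWrittenMetric) {
        this.bytesWrittenMetric = bytesWrittenMetric;
    }

    void write(byte[] data) {
        if (closed) {
            throw new IllegalStateException("writer is already closed");
        }
        pendingBytes += data.length;
        bytesWrittenMetric.addAndGet(data.length);
    }

    // Idempotent: a second call (or a call after a revert) is a no-op.
    void commitAndClose() {
        if (closed) {
            return;
        }
        committedBytes = pendingBytes;
        committed = true;
        closed = true;
    }

    // Idempotent: on an already-closed writer this has no effect, so the
    // metric is never decremented twice or after a successful commit.
    void revertPartialWritesAndClose() {
        if (closed) {
            return;
        }
        bytesWrittenMetric.addAndGet(-pendingBytes);
        pendingBytes = 0;
        closed = true;
    }

    // Analogue of fileSegment(): only valid after commitAndClose().
    long fileSegmentLength() {
        if (!committed) {
            throw new IllegalStateException(
                "fileSegment requested before commitAndClose()");
        }
        return committedBytes;
    }
}
```

With a guard like this, a revert that follows a commit leaves the metric untouched, which is the kind of invariant a `shuffleBytesWritten == totalShuffleBytesRead` check relies on.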

