[GitHub] spark pull request: [SPARK-7855] [WIP] Move bypassMergeSort-handli...

JoshRosen Mon, 25 May 2015 15:42:07 -0700

Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/6397#issuecomment-105329288
  
    Alright, I think that this is ready for review so I'm going to remove the 
`WIP` tag.  I'd like feedback on testing: I think that the tests here, combined 
with the implicit testing when this path is used during other shuffle write 
paths, should be sufficient to convince yourself that I haven't broken anything 
seriously when porting the bypass code into its own file.  It's worth calling 
out what I _don't_ have good unit tests for, though: we don't have any tests 
which explicitly try to read per-partition output from the combined output 
file.  AFAIK the `ExternalSorterSuite` didn't explicitly test this, either.  I 
might end up adding a separate test utility for emulating how 
IndexShuffleBlockResolver will read chunks of the output file and index file in 
order to serve shuffles.  This is part of an effort to avoid writing new 
end-to-end tests for testing basic shuffle functionality, since those tend to 
be slow and aren't great for testing things like metrics.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] spark pull request: [SPARK-7855] [WIP] Move bypassMergeSort-handli...

Reply via email to