GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/18600

    [SPARK-17701][SQL] Refactor RowDataSourceScanExec so its sameResult call does not compare strings

    ## What changes were proposed in this pull request?
    
    Currently, `RowDataSourceScanExec` and `FileSourceScanExec` rely on a 
"metadata" string map to implement equality comparison, since the RDDs they 
depend on cannot be directly compared. This has resulted in a number of 
correctness bugs around exchange reuse, e.g. SPARK-17673 and SPARK-16818.
    
    To make these comparisons less brittle, we should refactor these classes to 
compare constructor parameters directly instead of relying on the metadata map.
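
A minimal sketch of the difference between the two comparison styles, under loud assumptions: `Filter`, `MetadataScan`, and `ParamScan` are hypothetical stand-ins invented for illustration, not Spark's actual classes, and the lossy string rendering is only one assumed way a metadata map can hide a difference. It is not a reproduction of the linked bugs.

```scala
object SameResultSketch {
  // Hypothetical stand-in for a pushed-down filter; not a Spark class.
  final case class Filter(column: String, op: String, value: Int)

  // Style 1: equality goes through a rendered string map.
  final class MetadataScan(val table: String, val filters: Seq[Filter]) {
    // Assumed lossy rendering: only the filtered column names survive.
    val metadata: Map[String, String] = Map(
      "Table" -> table,
      "PushedFilters" -> filters.map(_.column).mkString("[", ", ", "]"))

    def sameResult(other: MetadataScan): Boolean = metadata == other.metadata
  }

  // Style 2: equality goes through the constructor parameters themselves,
  // here via the structural equality that a case class provides.
  final case class ParamScan(table: String, filters: Seq[Filter]) {
    def sameResult(other: ParamScan): Boolean = this == other
  }

  def main(args: Array[String]): Unit = {
    val gt1 = Seq(Filter("a", ">", 1))
    val gt2 = Seq(Filter("a", ">", 2))

    // The strings match although the filters differ, so this prints true.
    println(new MetadataScan("t", gt1).sameResult(new MetadataScan("t", gt2)))
    // Comparing parameters sees the different filter value, so this prints false.
    println(ParamScan("t", gt1).sameResult(ParamScan("t", gt2)))
    // Equal parameters still compare as the same result, so this prints true.
    println(ParamScan("t", gt1).sameResult(ParamScan("t", Seq(Filter("a", ">", 1)))))
  }
}
```

In this sketch the string-based check can only be as precise as whatever the rendering keeps, whereas the parameter-based check needs no separate rendering to stay in sync with the class's fields.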
    
    This PR refactors `RowDataSourceScanExec`; `FileSourceScanExec` will be 
fixed in a follow-up PR.
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18600.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18600
    
----
commit 5008eb65fb89741196c30da26e189fc046ea0af1
Author: Wenchen Fan <[email protected]>
Date:   2017-07-11T13:47:56Z

    Refactor DataSourceScanExec so its sameResult call does not compare strings

----


