[GitHub] spark pull request: [SPARK-8014] [SQL] Avoid premature metadata di...

liancheng Tue, 02 Jun 2015 05:42:00 -0700

GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/6583


    [SPARK-8014] [SQL] Avoid premature metadata discovery when writing a
    HadoopFsRelation with a save mode other than Append

    The current code references the schema of the DataFrame to be written
    before checking the save mode, which triggers expensive metadata
    discovery prematurely. For save modes other than `Append`, this
    discovery is useless, since we later either ignore the result (for
    `Ignore` and `ErrorIfExists`) or delete the existing files (for
    `Overwrite`).
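The ordering described above can be illustrated with a small toy model. This is only a sketch and not the code from the pull request: `FakeRelation`, `discover_schema`, `write`, and the Python `SaveMode` enum are hypothetical names invented for illustration, and the schema comparison in the `Append` branch is an assumption about why the metadata would be needed there.

```python
from enum import Enum


class SaveMode(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "error"
    IGNORE = "ignore"


class FakeRelation:
    """Hypothetical stand-in for a file-based relation with costly metadata discovery."""

    def __init__(self, path_exists):
        self.path_exists = path_exists
        self.discovery_calls = 0  # counts how often the expensive step runs
        self.deleted = False

    def discover_schema(self):
        # Stands in for listing files and reading their metadata.
        self.discovery_calls += 1
        return ("id", "name")


def write(relation, df_schema, mode):
    """Check the save mode first; touch existing metadata only for APPEND."""
    if not relation.path_exists:
        return "written"
    if mode is SaveMode.ERROR_IF_EXISTS:
        raise FileExistsError("path already exists")
    if mode is SaveMode.IGNORE:
        return "skipped"
    if mode is SaveMode.OVERWRITE:
        # Existing files are removed, so their schema is irrelevant.
        relation.deleted = True
        return "written"
    # APPEND (assumption): compare the incoming schema with the existing one.
    if relation.discover_schema() != tuple(df_schema):
        raise ValueError("schema mismatch")
    return "appended"
```

In this model, calling `write` with `SaveMode.IGNORE` or `SaveMode.OVERWRITE` on an existing path leaves `discovery_calls` at 0, and only `SaveMode.APPEND` pays for discovery.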

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-8014

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6583.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6583
    
----
commit 8fbd93fe485d9d212944f340933235226e67ee82
Author: Cheng Lian <[email protected]>
Date:   2015-06-02T12:35:26Z

    Fixes SPARK-8014

----



