GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/15996

    [SPARK-18567][SQL][WIP] Simplify CreateDataSourceTableAsSelectCommand

    ## What changes were proposed in this pull request?
    
    The `CreateDataSourceTableAsSelectCommand` is quite complex now, as it has 
a lot of work to do if the table already exists:
    
    1. throw an exception if we don't want to ignore it.
    2. do some checks and adjust the schema if we want to append data.
    3. drop the table and create it again if we want to overwrite.
    
    Items 2 and 3 are required only by `DataFrameWriter`, so I think it's more 
reasonable to put them in `DataFrameWriter` and simplify 
`CreateDataSourceTableAsSelectCommand`. `saveAsTable` can then work with Hive 
tables in append mode.
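
The split described above can be sketched roughly as follows. This is a plain-Python model, not Spark code: `SaveMode` here is a stand-in enum, and `plan_save_as_table`, `ctas_command_steps`, and the step names are hypothetical labels chosen for illustration. It assumes the writer handles append and overwrite for an existing table, while the command only creates the table or applies the ignore/error rule.

```python
from enum import Enum


class SaveMode(Enum):
    # Stand-in for the four save modes; not Spark's own class.
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "error"
    IGNORE = "ignore"


def ctas_command_steps(table_exists, mode):
    """Simplified command: create the table, or ignore/fail if it exists."""
    if not table_exists:
        return ["create_table_as_select"]
    if mode is SaveMode.IGNORE:
        return []  # leave the existing table untouched
    raise ValueError("table already exists")


def plan_save_as_table(table_exists, mode):
    """Writer-side dispatch: append and overwrite are handled before the command."""
    if table_exists and mode is SaveMode.APPEND:
        # Item 2 in the list above: validate, adjust the schema, then insert.
        return ["check_and_adjust_schema", "insert_into_existing_table"]
    if table_exists and mode is SaveMode.OVERWRITE:
        # Item 3 in the list above: drop, then recreate through the command.
        return ["drop_table"] + ctas_command_steps(False, mode)
    return ctas_command_steps(table_exists, mode)
```

With this shape the command no longer needs to know about append or overwrite; it only sees "table absent" or "table present with ignore/error".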
    
    Behaviour changes:
    
    1. Previously we threw an exception if the provider given by 
`DataFrameWriter` didn't match the provider of the existing table. This is 
annoying because `DataFrameWriter` uses the parquet provider by default, so users 
have to specify the provider of the table they want to append data to. After 
this PR, we simply ignore the provider when appending data to existing 
tables. (We can go back to the previous behaviour if you think it makes sense.)
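
To make the behaviour change concrete, here is a small plain-Python sketch of the two policies. It is only an illustration under stated assumptions: `resolve_provider` and its `strict` flag are hypothetical names, not Spark APIs, and the parquet default is taken from the description above.

```python
DEFAULT_PROVIDER = "parquet"  # default provider, per the PR description


def resolve_provider(requested, existing, strict):
    """Pick the provider used when saving to a table.

    requested: provider set on the writer, or None if the user set nothing
    existing:  provider of the existing table, or None if there is no table
    strict:    True models the old behaviour, False the proposed one
    """
    effective = requested or DEFAULT_PROVIDER
    if existing is None:
        return effective  # new table: the writer's provider applies
    if strict and effective.lower() != existing.lower():
        # Old behaviour: a mismatch is an error, even if it only comes
        # from the writer's default.
        raise ValueError(
            f"provider {effective!r} does not match existing table provider {existing!r}"
        )
    # Proposed behaviour: the existing table's provider wins on append.
    return existing
```

Under the old policy, appending to an ORC table without naming a provider fails because the default is parquet; under the proposed policy the same call resolves to the table's own provider.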
    
    
    
    ## How was this patch tested?
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark append

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15996
    
----
commit f52b364d448951cd73ddc8274957181166ddcfe9
Author: Wenchen Fan <[email protected]>
Date:   2016-11-23T16:37:04Z

    remove OverwriteOptions

commit 7f90a100d8122531a3f668a0bf442883f92f98e0
Author: Wenchen Fan <[email protected]>
Date:   2016-11-23T17:41:35Z

    tmp

----


